Documentation

Getting Started

Proteus provides image deduplication and provenance verification through perceptual hashing. Get started in minutes with the API or the npm package.

Quick Start

npm install @proteus-labs/dinohash

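import { hashImage } from '@proteus-labs/dinohash';

// imageBlob is a Blob, File, or ArrayBuffer holding the image bytes
const hash = await hashImage(imageBlob);
console.log(hash); // the perceptual hash, returned as a string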

How It Works

Proteus uses DinoHash, a perceptual hashing algorithm that generates robust fingerprints for images. Unlike a cryptographic hash, which changes completely when a single byte changes, these hashes stay largely the same when an image is transformed (for example cropped, filtered, or compressed), which makes them well suited to deduplication.

Key Feature: DinoHash achieves 12% higher bit accuracy than state-of-the-art methods and is robust to common image transformations.
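
Bit accuracy can be read as the fraction of hash bits that stay the same after an image is transformed. The sketch below illustrates that idea, assuming hashes are equal-length hexadecimal strings. The encoding is an assumption made here, and bitAccuracy is a hypothetical helper, not part of the package.

```typescript
// Hypothetical helper, not part of @proteus-labs/dinohash.
// Assumption: both hashes are hex strings of the same length.
function bitAccuracy(hexA: string, hexB: string): number {
  if (hexA.length === 0 || hexA.length !== hexB.length) {
    throw new Error('hashes must be non-empty and of equal length');
  }
  let differingBits = 0;
  for (let i = 0; i < hexA.length; i++) {
    const a = parseInt(hexA.charAt(i), 16);
    const b = parseInt(hexB.charAt(i), 16);
    if (Number.isNaN(a) || Number.isNaN(b)) {
      throw new Error('invalid hex digit');
    }
    // XOR leaves a 1 wherever the two 4-bit digits disagree; count those bits.
    let diff = a ^ b;
    while (diff !== 0) {
      differingBits += diff & 1;
      diff >>= 1;
    }
  }
  const totalBits = hexA.length * 4;
  return 1 - differingBits / totalBits;
}
```

Under these assumptions, two hashes that differ in one bit out of eight, such as 'f0' and 'f1', give a bit accuracy of 0.875.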

API Reference

hashImage(image: Blob | File | ArrayBuffer)

Generate a perceptual hash for an image.

hashImage(imageBlob) → Promise<string>

compareHashes(hash1: string, hash2: string)

Compare two hashes and return a similarity score between 0 and 1.

compareHashes(hash1, hash2) → number
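
One common way to use a similarity score is to look up the closest known hash and accept it only above a cutoff. A minimal sketch follows. findBestMatch, the Compare type, and the 0.9 threshold are illustrative assumptions, not part of the package. It also assumes that a higher score means more similar.

```typescript
// Signature matching compareHashes(hash1, hash2) → number as documented above.
type Compare = (hash1: string, hash2: string) => number;

interface Match {
  index: number; // position of the matching hash in the known list
  score: number; // similarity returned by the compare function
}

// Hypothetical helper: returns the best-scoring known hash at or above
// the threshold, or null when nothing is similar enough.
// Assumption: higher scores mean more similar; 0.9 is an arbitrary default.
function findBestMatch(
  query: string,
  known: string[],
  compare: Compare,
  threshold: number = 0.9
): Match | null {
  let best: Match | null = null;
  for (let i = 0; i < known.length; i++) {
    const candidate = known[i];
    if (candidate === undefined) continue;
    const score = compare(query, candidate);
    if (score >= threshold && (best === null || score > best.score)) {
      best = { index: i, score };
    }
  }
  return best;
}
```

In practice, compareHashes would be passed as the compare argument, and the threshold would be tuned on your own data.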

Use Cases

  • Image Deduplication: Remove near-duplicate images from datasets
  • Content Moderation: Detect previously flagged content
  • Provenance Tracking: Verify image origin and transformations
  • Copyright Protection: Identify unauthorized copies
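
As an illustration of the deduplication use case, the sketch below keeps an image only if it is not too similar to one already kept. deduplicate, SimilarityFn, and the 0.9 threshold are hypothetical names and values chosen for this example. The similarity function is passed in, so compareHashes could fill that role.

```typescript
type SimilarityFn = (hash1: string, hash2: string) => number;

// Hypothetical helper: greedy near-duplicate removal.
// Returns the indices of the hashes to keep, in their original order.
// Assumption: higher similarity means closer; 0.9 is an arbitrary default.
function deduplicate(
  hashes: string[],
  similarity: SimilarityFn,
  threshold: number = 0.9
): number[] {
  const kept: number[] = [];
  for (let i = 0; i < hashes.length; i++) {
    const current = hashes[i];
    if (current === undefined) continue;
    // Treat the image as a duplicate if it matches any hash already kept.
    const isDuplicate = kept.some((k) => {
      const keptHash = hashes[k];
      return keptHash !== undefined && similarity(current, keptHash) >= threshold;
    });
    if (!isDuplicate) kept.push(i);
  }
  return kept;
}
```

This greedy pass compares each hash against every kept hash, so its cost grows quadratically in the worst case. Large datasets would typically need an index or bucketing step in front of it.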

Performance

Proteus offers multiple model sizes optimized for different use cases:

  • Fast: ResNet-based models for real-time processing
  • Balanced: Medium-sized models for batch processing
  • Robust: DinoV2-based models for maximum accuracy

Performance Trade-offs

We retrained the perceptual hashing head on multiple backbones, varying both the model and the hash size, to trace the Pareto frontier between robustness and efficiency. The results report evaluation time on 12 CPUs, with different student-teacher distillations as backbones. Smaller backbones (e.g., ResNets) are faster and lighter, which suits real-time use, while larger backbones (e.g., DinoV2) are more robust to edits at a higher compute cost. The plot below summarizes the trade-off. Contact us if you want access to other ProteusHash models on the Pareto frontier.

Figure: Pareto frontier across backbones and hash sizes
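
For readers unfamiliar with the term, a model is on the Pareto frontier when no other model is both faster and more robust. The sketch below shows one way to compute such a frontier from measured points. ModelPoint, its field names, and paretoFrontier are illustrative assumptions and do not reflect how the evaluation above was actually produced.

```typescript
// Hypothetical record for one evaluated model configuration.
interface ModelPoint {
  name: string;
  latencyMs: number;  // evaluation time, lower is better
  robustness: number; // robustness score, higher is better
}

// Returns the points that no other point beats on both latency and robustness.
function paretoFrontier(points: ModelPoint[]): ModelPoint[] {
  // Sort fastest first; on equal latency, put the more robust point first.
  const sorted = points.slice().sort(
    (a, b) => a.latencyMs - b.latencyMs || b.robustness - a.robustness
  );
  const frontier: ModelPoint[] = [];
  let bestRobustness = -Infinity;
  for (const point of sorted) {
    // A slower point earns its place only by being strictly more robust
    // than every faster point seen so far.
    if (point.robustness > bestRobustness) {
      frontier.push(point);
      bestRobustness = point.robustness;
    }
  }
  return frontier;
}
```

Sorting by latency first means a single pass is enough: each candidate only needs to be compared against the best robustness seen so far among faster models.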

Privacy & Security

Proteus can be deployed in multiple ways to protect your data:

  • API: Cloud-hosted service with encrypted requests
  • On-Prem: Self-hosted deployment for maximum privacy
  • MPC/FHE: Privacy-preserving queries using multi-party computation (MPC) or fully homomorphic encryption (FHE)

Need Help?

Check out our API documentation or contact us for enterprise support and custom deployments.