Proteus

Next generation content provenance

Proteus is an open-source platform for AI content provenance. It combines perceptual hashing, digital signatures, and secure multi-party computation (MPC) with fully homomorphic encryption (FHE) to create incorruptible, private, and robust watermarks. The Proteus paper was presented at the CODEML Workshop at ICML 2025. The DinoHash perceptual hashing algorithm can be used independently of the Proteus system.

Key Innovations

  • DinoHash: A perceptual hashing algorithm that is robust to common image transformations such as filters, compression, and crops. It achieves 12% higher bit accuracy than state-of-the-art methods.
  • Provenance Verification: Perceptual hash values are signed by the content generator, establishing provenance.
  • Privacy-Preserving Queries: Provenance lookups run under multi-party fully homomorphic encryption, which keeps both user queries and registry data private, with a fallback to MPC if the database is too large.
  • Failsafe Detection: A backup classifier identifies synthetic images that are not found in the registry with state-of-the-art accuracy, showing 25% better classification accuracy on real-world AI generators.
  • Adversarial Defense: DinoHash is adversarially trained against both hash collision and hash aversion attacks. This limits the attack surface: an attacker cannot modify the provenance without visibly changing the image.
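To make the registry idea concrete, here is a minimal plaintext sketch of a provenance lookup. It assumes that a query hash matches a registered hash when their Hamming distance falls under a threshold; that matching rule, the function names, the 16-bit hash length, and the `generator-A`/`generator-B` labels are all hypothetical and chosen for illustration. The sketch deliberately omits the signatures and the MPC/FHE layer described above, so it shows only the shape of the lookup, not how Proteus performs it.

```python
from typing import Dict, Optional


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two hashes stored as integers."""
    return bin(a ^ b).count("1")


def lookup(registry: Dict[int, str], query: int, max_distance: int) -> Optional[str]:
    """Return the label of the closest registered hash within max_distance, else None."""
    best_label: Optional[str] = None
    best_dist = max_distance + 1
    for registered_hash, label in registry.items():
        dist = hamming_distance(registered_hash, query)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical 16-bit hashes; real perceptual hashes would be longer.
registry = {
    0b1011001110001111: "generator-A",
    0b0100110001110000: "generator-B",
}

# One bit flipped relative to generator-A's entry, as a light edit might cause.
match = lookup(registry, 0b1011001110001011, max_distance=2)

# Far from every registered hash, so no provenance is reported.
miss = lookup(registry, 0b1111111100000000, max_distance=2)

print(match, miss)
```

A tolerance of a few bits is what lets a lightly edited image still resolve to its registered source, while the threshold keeps unrelated images from matching.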

What are perceptual hashes?

The interactive demo on this page lets you upload an original image and its edited version and compare their perceptual hashes, computed with our DinoHash algorithm and with previous algorithms. The more closely the two hashes match, the more similar the two images are.
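For readers without the demo at hand, the idea can be illustrated with a classic baseline. The sketch below uses average hash, a simple and much older technique than DinoHash, on tiny hand-written grayscale grids; the pixel values and function names are made up for illustration. It shows the property that matters: a brightness edit leaves the hash unchanged, while a different image produces a very different hash.

```python
from typing import List


def average_hash(pixels: List[List[int]]) -> List[int]:
    """Average hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def bit_accuracy(h1: List[int], h2: List[int]) -> float:
    """Fraction of bit positions on which two equal-length hashes agree."""
    if len(h1) != len(h2):
        raise ValueError("hashes must have the same length")
    return sum(a == b for a, b in zip(h1, h2)) / len(h1)


# Hypothetical 4x4 grayscale image: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]

# Edited version: every pixel brightened by 30 (no value clips at 255 here).
brightened = [[min(255, p + 30) for p in row] for row in original]

# Unrelated image: bright top half, dark bottom half.
different = [[200] * 4, [210] * 4, [10] * 4, [20] * 4]

same_score = bit_accuracy(average_hash(original), average_hash(brightened))
diff_score = bit_accuracy(average_hash(original), average_hash(different))
print(same_score, diff_score)  # 1.0 0.5
```

A bit accuracy near 1.0 indicates the same underlying image, while a value near 0.5 is what two unrelated hashes would give by chance.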

Performance trade-offs

We retrained the perceptual hashing head on multiple backbones and hash sizes to trace the Pareto frontier between robustness and efficiency. This table shows evaluation time on 12 CPUs, with different student-teacher distillations as backbones. Smaller backbones (e.g., ResNets) are faster and lighter for real-time use, while larger backbones (e.g., DINOv2) deliver higher robustness to edits at a higher compute cost. The plot below summarizes the trade-off. Feel free to contact us if you want access to other ProteusHash models on the Pareto frontier.

Pareto frontier across backbones and hash sizes
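As a rough illustration of what tracing a Pareto frontier means here, the sketch below filters a list of candidate models down to those that no other candidate beats on both latency and robustness. The model names, latencies, and robustness scores are invented placeholders, not measurements from the Proteus evaluation.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    latency_ms: float  # evaluation time per image; lower is better
    robustness: float  # e.g. bit accuracy under edits; higher is better


def pareto_frontier(models: List[Model]) -> List[Model]:
    """Keep only models that no other model beats on both latency and robustness."""
    frontier: List[Model] = []
    best_robustness = float("-inf")
    # Walk from fastest to slowest; a slower model earns its place
    # only if it is more robust than everything faster than it.
    for m in sorted(models, key=lambda m: (m.latency_ms, -m.robustness)):
        if m.robustness > best_robustness:
            frontier.append(m)
            best_robustness = m.robustness
    return frontier


# Placeholder numbers for illustration only.
candidates = [
    Model("small-backbone", 5.0, 0.80),
    Model("medium-backbone-a", 12.0, 0.78),  # slower and less robust than small
    Model("medium-backbone-b", 15.0, 0.88),
    Model("large-backbone-a", 40.0, 0.95),
    Model("large-backbone-b", 45.0, 0.93),  # slower and less robust than large-a
]

frontier_names = [m.name for m in pareto_frontier(candidates)]
print(frontier_names)
# ['small-backbone', 'medium-backbone-b', 'large-backbone-a']
```

Every model on the frontier is a defensible choice: which one to pick depends on how much latency a deployment can tolerate in exchange for robustness.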

How does the full Proteus system work?