The Technical Architecture Behind PickItBox's DINOv2 Biometric Engine
A deep dive into how we use Hugging Face, DINOv2, and Supabase pgvector to execute sub-second similarity searches across millions of pets.
Processing Images at Scale
Biometric pet identification is a fundamentally different problem from standard image classification. We aren't asking "is this a dog?"; we need to answer "is this *specifically* Max the Golden Retriever?"
The DINOv2 Pipeline
We use an adaptation of Meta's DINOv2, a self-supervised vision transformer. When an image arrives at our endpoint via the Hugging Face Inference API, the model extracts a dense 768-dimensional vector embedding that encodes the distinctive features of the animal's face or nose while remaining largely invariant to background clutter.
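As a rough sketch of the extraction step (the endpoint URL and response shape shown in the comments are illustrative assumptions, not our actual deployment), and assuming embeddings are L2-normalized before storage so that cosine similarity reduces to a dot product:

```python
import math

def l2_normalize(embedding):
    """Scale a vector to unit length. With unit-length vectors,
    cosine similarity is simply a dot product."""
    norm = math.sqrt(sum(x * x for x in embedding))
    return [x / norm for x in embedding]

# Hypothetical call shape against a Hugging Face Inference Endpoint
# (URL, headers, and response format are assumptions for illustration):
#
#   import requests
#   resp = requests.post(ENDPOINT_URL,
#                        headers={"Authorization": f"Bearer {HF_TOKEN}"},
#                        data=image_bytes)
#   embedding = l2_normalize(resp.json()["embedding"])  # 768 floats
```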
Vector Search with pgvector
Storing massive arrays of floating-point numbers is useless without rapid retrieval. PickItBox stores embeddings in PostgreSQL extended with pgvector, using an HNSW (Hierarchical Navigable Small World) index to perform approximate nearest-neighbor searches.
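Building the HNSW index looks like the following (the `m` and `ef_construction` values shown are pgvector's defaults, not necessarily our tuned settings; `vector_cosine_ops` matches the cosine-distance operator used in the function below):

```sql
-- HNSW index for approximate nearest-neighbor search over embeddings
CREATE INDEX ON biometrics
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
```

Higher `m` and `ef_construction` values improve recall at the cost of build time and memory.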
-- Our core matching RPC function.
-- Note: pgvector's <=> operator returns cosine *distance*,
-- so similarity = 1 - distance.
CREATE OR REPLACE FUNCTION match_biometrics(
  query_embedding vector(768),
  match_threshold float,
  match_count int
)
RETURNS TABLE (
  pet_id uuid,
  similarity float
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    biometrics.pet_id,
    1 - (biometrics.embedding <=> query_embedding) AS similarity
  FROM biometrics
  WHERE 1 - (biometrics.embedding <=> query_embedding) > match_threshold
  ORDER BY biometrics.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;
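To make the ranking concrete, here is a plain-Python sketch of what the RPC computes: score every row by cosine similarity, keep rows above the threshold, and return the top matches (ordering ascending by cosine distance, as the SQL does, is the same as ordering descending by similarity). The in-memory `rows` list stands in for the `biometrics` table; in a real client you would invoke the function through Supabase's RPC interface instead.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def match_biometrics(rows, query_embedding, match_threshold, match_count):
    """Mirror of the SQL function: rows is a list of (pet_id, embedding)."""
    scored = [(pet_id, cosine_similarity(emb, query_embedding))
              for pet_id, emb in rows]
    # WHERE similarity > match_threshold
    matches = [(pid, sim) for pid, sim in scored if sim > match_threshold]
    # ORDER BY distance ascending == similarity descending
    matches.sort(key=lambda m: m[1], reverse=True)
    # LIMIT match_count
    return matches[:match_count]
```

From application code the real function would be called through the Supabase client, e.g. `supabase.rpc("match_biometrics", {...})`, with the parameter values serialized into the payload.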