Clips-and-Connectors v1

Industrial fastening and assembly solutions. Highly technical products with precise specifications.

Tags: Industrial, Inter-Retrieval, Closed-Set Gallery, Private
Query Images: 3,624
Gallery Images: 200,496
Products (Query): 453
Products (Gallery): 12,531
Source: Industry Partner

About This Dataset

Overview

The Fasteners Dataset is one of the most challenging industrial benchmarks in our evaluation. It is designed to test instance-level retrieval under extreme intra-class similarity and significant synthetic-to-real domain shift. It contains 12,531 unique fasteners, including screws, bolts, nuts, washers, pins, and similar components, organized into broad categories by type, mechanical function, and material.

Because many fasteners differ only in minute geometric or dimensional variations, this dataset represents a realistic and highly demanding use case for maintenance workflows.

Dataset Composition

The dataset is composed of two complementary subsets that form the reference and query splits used in our study.

Reference Split

The reference split consists of 200,496 synthetic CAD-rendered images, generated from the 3D models of all 12,531 fasteners. For each fastener, 16 viewpoints were rendered using a controlled multi-camera setup (12,531 × 16 = 200,496).

These synthetic renders provide clean, canonical, noise-free representations of the parts, consistent across products and independent of background clutter or lighting variation.

Query Split

The query split comprises 3,624 real-world images captured for a curated subset of 453 fasteners. The number of query images per fastener varies, reflecting natural availability and usage frequency.

These images were taken under uncontrolled conditions, differing in:
- Lighting
- Perspective
- Background
- Object placement

This introduces substantial domain shift relative to the synthetic reference renders. Real-world capture conditions often include reflections, shadows, partial occlusions, or industrial workbench environments, all of which increase retrieval difficulty.
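The source does not describe the retrieval protocol itself. As a minimal sketch, assuming each image is mapped to an embedding vector and each real query is ranked against all synthetic reference views by cosine similarity, matching could look like the following. The function names, product IDs, and three-dimensional vectors are hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_gallery(query_vec, gallery):
    """Rank gallery views by descending similarity to the query.

    gallery is a list of (product_id, embedding) pairs, one per rendered view.
    Returns a list of (product_id, score) pairs.
    """
    scored = [(pid, cosine(query_vec, emb)) for pid, emb in gallery]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy gallery (hypothetical): two views of one product, one view each of two others.
gallery = [
    ("bolt_m6", [1.0, 0.0, 0.1]),
    ("bolt_m6", [0.9, 0.1, 0.0]),
    ("washer_8", [0.0, 1.0, 0.0]),
    ("pin_3", [0.1, 0.1, 1.0]),
]
query = [0.95, 0.05, 0.05]  # hypothetical embedding of a real-world photo

ranking = rank_gallery(query, gallery)
top_product = ranking[0][0]
print(top_product)
```

In this toy case both views of the same product rank ahead of the other products. On the real data, domain shift between photos and renders is what makes such a ranking hard to get right.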

Dataset Statistics

No. of               Reference    Query
Images               200,496      3,624
Products             12,531       453
Views per Product    16           -
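The figures in this table can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above. The two derived quantities (mean query images per queried product and the share of gallery products that have queries) are computed here and are not reported in the source.

```python
# Figures copied from the statistics table.
reference_images = 200_496
reference_products = 12_531
views_per_product = 16
query_images = 3_624
query_products = 453

# Every reference product contributes the same number of rendered views.
assert reference_products * views_per_product == reference_images

# Derived quantities (computed here, not stated in the source).
mean_queries_per_product = query_images / query_products
query_coverage = query_products / reference_products
distractor_products = reference_products - query_products

print(f"mean query images per queried product: {mean_queries_per_product:.1f}")
print(f"share of gallery products with queries: {query_coverage:.1%}")
print(f"gallery products without any query: {distractor_products}")
```

Under these numbers, most gallery products never appear as a query target, so they act purely as distractors during retrieval.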

Model Performance on Clips-and-Connectors v1

Rank  Model                     Provider  Embedding Size  Input Size  R@1     R@5     mAP@20
1     GEM v5.1 (ours)           nyris     768             336         63.36%  80.16%  38.31%
2     DINOv3 ViT-L/16           Meta      1024            224         26.38%  45.03%  5.94%
3     DINOv2 Large              Meta      1024            224         13.66%  26.43%  2.71%
4     PE-Core L/14              Meta      1024            336         12.14%  26.85%  2.37%
5     SigLIP2 SO400M            Google    1152            384         10.57%  23.12%  2.16%
6     Gemini Embedding 2        Google    3072            N/A         10.29%  23.12%  1.85%
7     Vertex AI Multi-Modal     Google    1408            N/A         8.77%   21.80%  1.83%
8     Nomic Embed MM 3B         Nomic AI  2048            Dynamic     2.46%   6.18%   0.43%
9     Cohere Embed v4           Cohere    1536            N/A         2.40%   6.68%   0.40%
10    Jina Embeddings v4        Jina AI   2048            Dynamic     1.71%   4.69%   0.30%
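The source does not define how R@1, R@5, and mAP@20 are computed. The sketch below shows one common convention, stated as an assumption: R@k counts a query as a hit if any of the top-k retrieved views belongs to the correct product, and AP@k averages precision at each correct hit within the top k, normalized by min(k, number of relevant views). Other normalizations exist, so these functions may not reproduce the reported numbers exactly. All names and the toy rankings are hypothetical.

```python
def recall_at_k(ranked_ids, true_id, k):
    """1.0 if the correct product appears among the top-k results, else 0.0."""
    return 1.0 if true_id in ranked_ids[:k] else 0.0


def average_precision_at_k(ranked_ids, true_id, k, num_relevant):
    """AP@k, normalized by min(k, num_relevant) (an assumed convention)."""
    hits = 0
    precision_sum = 0.0
    for rank, pid in enumerate(ranked_ids[:k], start=1):
        if pid == true_id:
            hits += 1
            precision_sum += hits / rank  # precision at this hit
    denom = min(k, num_relevant)
    return precision_sum / denom if denom else 0.0


# Toy results: (ranked product IDs, correct product ID, relevant views in gallery).
results = [
    (["a", "b", "a", "c"], "a", 2),
    (["c", "b", "b", "a"], "b", 2),
]

r_at_1 = sum(recall_at_k(r, t, 1) for r, t, _ in results) / len(results)
r_at_5 = sum(recall_at_k(r, t, 5) for r, t, _ in results) / len(results)
map_at_20 = sum(average_precision_at_k(r, t, 20, n) for r, t, n in results) / len(results)

print(f"R@1={r_at_1:.2f} R@5={r_at_5:.2f} mAP@20={map_at_20:.4f}")
```

With 16 reference views per product, an AP-style metric rewards retrieving many views of the correct product near the top, which is a stricter requirement than placing a single correct view first.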

Sample Images

Curated query-reference pairs from this dataset. Each row shows a query image and its matching reference images.

[Image grid not reproduced here. Columns: Query Image, Reference Image.]