GEM v5.1 (ours)
About This Model
Overview
Nyris General Embedding Model (GEM) v5.1 is a generic, vision-only embedding model designed for broad instance-level image retrieval across industrial domains, including manufacturing, DIY, and retail. Unlike models tailored to a single vertical, GEM v5.1 is trained to handle a wide variety of product categories while retaining the ability to distinguish between visually similar instances.
Architecture
GEM v5.1 is based on the CLIP architecture and is evaluated as a vision-only image encoder that produces 768-dimensional global image embeddings. The model is engineered for efficient nearest-neighbor retrieval and is used in a pure image-to-image instance retrieval setting without any task-specific adaptation.
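As an illustration of the retrieval setting described above, the following is a minimal sketch of image-to-image nearest-neighbor search over 768-dimensional embeddings. It assumes cosine similarity and brute-force search, and it uses random vectors as stand-ins for real model outputs; the function names `l2_normalize` and `retrieve_top_k` are hypothetical and not part of any GEM API.

```python
import numpy as np

EMBEDDING_DIM = 768  # embedding size stated for GEM v5.1


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def retrieve_top_k(queries: np.ndarray, gallery: np.ndarray, k: int = 5) -> np.ndarray:
    """Return the indices of the k most similar gallery rows for each query row."""
    sims = l2_normalize(queries) @ l2_normalize(gallery).T
    # Sort each row by descending similarity and keep the first k columns.
    return np.argsort(-sims, axis=1)[:, :k]


# Stand-in data (assumption): random vectors instead of real image embeddings.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, EMBEDDING_DIM))
# Queries are slightly perturbed copies of gallery items 3 and 42.
queries = gallery[[3, 42]] + 0.01 * rng.normal(size=(2, EMBEDDING_DIM))

top_k = retrieve_top_k(queries, gallery, k=5)
print(top_k[:, 0])  # best match per query
```

Brute-force search is shown only for clarity; a large catalog would typically use an approximate nearest-neighbor index instead.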
Capabilities
GEM v5.1 is intended for visual search scenarios with heterogeneous capture conditions, including variation in lighting, background clutter, viewpoint, and differences between catalog imagery and user-captured photos. In this benchmark, it serves as a general-purpose vision baseline for image-to-image product retrieval across multiple domains.
Performance Across Datasets
| Dataset | Category | R@1 | R@5 | mAP |
|---|---|---|---|---|
| Stanford Online Products | E-commerce | 86.87% | 94.17% | 72.45% |
| Products-10K | E-commerce | 77.08% | 91.06% | 66.27% |
| Clips-and-Connectors v1 | Industrial | 63.36% | 80.16% | 38.31% |
| DIY v1 | Hardware/DIY | 28.69% | 50.06% | 36.39% |
| Automotive v1 | Automotive | 28.18% | 46.31% | 31.00% |
| Average | | 56.84% | 72.35% | 48.88% |
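The Average row is consistent with an unweighted mean over the five datasets. The short sketch below reproduces it from the per-dataset values in the table; the helper name `macro_average` is hypothetical, and the unweighted-mean assumption is inferred from the numbers rather than stated by the benchmark.

```python
# Per-dataset (R@1, R@5, mAP) values copied from the table, in percent.
results = {
    "Stanford Online Products": (86.87, 94.17, 72.45),
    "Products-10K": (77.08, 91.06, 66.27),
    "Clips-and-Connectors v1": (63.36, 80.16, 38.31),
    "DIY v1": (28.69, 50.06, 36.39),
    "Automotive v1": (28.18, 46.31, 31.00),
}


def macro_average(rows):
    """Unweighted mean of each metric column, rounded to two decimals."""
    rows = list(rows)
    return tuple(round(sum(column) / len(rows), 2) for column in zip(*rows))


average = macro_average(results.values())
print(average)
```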
ILIAS
Results for ILIAS are reported separately and are excluded from the averages above, because not all models are evaluated on ILIAS.
| Dataset | Category | R@1 | R@5 | mAP@20 | mAP@1000 |
|---|---|---|---|---|---|
| ILIAS | Instance Retrieval | 69.89% | 77.03% | 36.15% | 36.71% |
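For readers unfamiliar with the metrics in these tables, the sketch below shows one common way to compute R@k and mAP@k from ranked retrieval results. The exact evaluation protocol used for these benchmarks is not stated here, so the conventions are assumptions: R@k counts a query as a hit if any relevant item appears in the top k, and AP@k is normalised by min(k, number of relevant items). The function names are hypothetical.

```python
from typing import Sequence


def recall_at_k(ranked_relevance: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of queries with at least one relevant item in the top k."""
    hits = [any(relevance[:k]) for relevance in ranked_relevance]
    return sum(hits) / len(hits)


def average_precision_at_k(relevance: Sequence[bool], num_relevant: int, k: int) -> float:
    """AP over the top k results, normalised by min(k, num_relevant)."""
    if num_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / min(k, num_relevant)


# Toy example: relevance flags of the ranked results for two queries.
ranked = [
    [True, False, True],   # query A, 2 relevant items in the gallery
    [False, True, False],  # query B, 1 relevant item in the gallery
]
num_relevant = [2, 1]

r_at_1 = recall_at_k(ranked, k=1)
r_at_3 = recall_at_k(ranked, k=3)
map_at_3 = sum(
    average_precision_at_k(rel, n, k=3) for rel, n in zip(ranked, num_relevant)
) / len(ranked)
print(r_at_1, r_at_3, map_at_3)
```

Under these conventions, mAP@20 and mAP@1000 differ only in how deep into the ranked list relevant items are still counted.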