mxbai-embed-large-v1 is an English text embedding model from Mixedbread AI, suited to Retrieval-Augmented Generation (RAG) and other NLP applications.
mxbai-embed-large-v1 is a state-of-the-art English language embedding model developed by Mixedbread AI. It converts text into dense vector representations that capture the meaning of the input. The model was trained on more than 700 million pairs using contrastive training and then fine-tuned on more than 30 million high-quality triplets with the AnglE loss function. This training lets it adapt to a wide range of topics and domains, making it suitable for real-world applications and Retrieval-Augmented Generation (RAG) use cases.
mxbai-embed-large-v1 is designed for generating sentence embeddings suitable for various NLP applications.
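
Sentence embeddings are typically compared with cosine similarity: texts with similar meaning should map to vectors that point in similar directions. The following is a minimal sketch of that comparison, not part of the model itself. The 3-dimensional vectors are hypothetical stand-ins for real embeddings, which are much longer, and the helper names `cosine_similarity` and `rank_by_similarity` are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for real embeddings (hypothetical values).
query = [1.0, 0.0, 1.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query, different length
    [1.0, 1.0, 0.0],  # partially aligned with the query
]
ranking = rank_by_similarity(query, docs)
```

In a RAG pipeline the same ranking step would run over the embeddings of stored document chunks, with the top-ranked chunks passed to the generator as context.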
| Attribute | Details |
|---|---|
| Provider | Mixedbread AI |
| Architecture | BERT |
| Cutoff Date | September 2023 |
| Languages | English |
| Tool Calling | ❌ |
| Input Modalities | Text |
| Output Modalities | Text embeddings |
| License | Apache 2.0 |
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---|---|---|---|---|---|
| `ai/mxbai-embed-large:latest` (alias of `ai/mxbai-embed-large:335M-F16`) | 334.09 M | F16 | 512 tokens | 0.63 GiB | 638.85 MB |
| `ai/mxbai-embed-large:335M-F16` | 334.09 M | F16 | 512 tokens | 0.63 GiB | 638.85 MB |
¹: VRAM estimated based on model characteristics.
The `latest` tag points to `335M-F16`.
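
The VRAM figure in the table can be sanity-checked with simple arithmetic. This is a rough sketch that assumes the estimate is dominated by the weights alone, at 2 bytes per parameter for F16. How the published estimate was actually derived is not stated, and a real deployment also needs memory for activations and runtime buffers.

```python
PARAMETERS = 334.09e6     # parameter count from the table above
BYTES_PER_PARAM_F16 = 2   # F16 stores each weight in 16 bits


def weight_bytes(parameters, bytes_per_param):
    """Memory needed for the weights alone, in bytes."""
    return parameters * bytes_per_param


total = weight_bytes(PARAMETERS, BYTES_PER_PARAM_F16)
weights_gib = total / 2**30  # gibibytes
weights_mib = total / 2**20  # mebibytes

print(f"weights only: {weights_gib:.2f} GiB ({weights_mib:.1f} MiB)")
```

The weights-only figure comes out at roughly 0.62 GiB, slightly below the 0.63 GiB listed, which is consistent with the table value including a small amount of overhead.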
First, pull the model:
docker model pull ai/mxbai-embed-large
Then run the model:
docker model run ai/mxbai-embed-large
For more information on Docker Model Runner, explore the documentation.
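
Once the model is available locally, embeddings can be requested over HTTP. The sketch below assumes that Docker Model Runner exposes an OpenAI-compatible embeddings endpoint and that host-side TCP access is enabled. The base URL `http://localhost:12434/engines/v1` is an assumption that may differ in your setup, and the helper names `build_request`, `parse_embeddings`, and `embed` are illustrative.

```python
import json
import urllib.request

# Assumption: OpenAI-compatible API reachable on the host at this address.
BASE_URL = "http://localhost:12434/engines/v1"
MODEL = "ai/mxbai-embed-large"


def build_request(texts, base_url=BASE_URL, model=MODEL):
    """Build a POST request in the OpenAI embeddings request format."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(response_body):
    """Extract vectors from an OpenAI-style response, in input order."""
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(texts, base_url=BASE_URL, model=MODEL):
    """Send the texts to the endpoint and return one vector per text."""
    request = build_request(texts, base_url, model)
    with urllib.request.urlopen(request) as response:
        return parse_embeddings(json.load(response))
```

With the model runner listening at the assumed address, `embed(["What is RAG?", "Bread is baked in an oven."])` would return a list with one embedding vector per input text. Inputs longer than the 512-token context window need to be split or truncated first.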
| Task Category | mxbai-embed-large-v1 |
|---|---|
| Avg (56 datasets) | 64.68 |
| Classification | 75.64 |
| Clustering | 46.71 |
| Pair Classification | 87.20 |
| Reranking | 60.11 |
| Retrieval | 54.39 |
| STS | 85.00 |
| Summarization | 32.71 |
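
The overall average is not the plain mean of the seven category scores; it appears to weight each category by its number of datasets. The sketch below checks this under an assumption: the per-category dataset counts are taken from the MTEB English benchmark's task breakdown, which the table does not name, so treat them as hypothetical.

```python
# category -> (score from the table above, ASSUMED number of datasets)
categories = {
    "Classification": (75.64, 12),
    "Clustering": (46.71, 11),
    "Pair Classification": (87.2, 3),
    "Reranking": (60.11, 4),
    "Retrieval": (54.39, 15),
    "STS": (85.00, 10),
    "Summarization": (32.71, 1),
}

total_datasets = sum(count for _, count in categories.values())

# Average over datasets: each category counts once per dataset it contains.
weighted_avg = (
    sum(score * count for score, count in categories.values()) / total_datasets
)

# Plain mean over the seven categories, for comparison.
simple_mean = sum(score for score, _ in categories.values()) / len(categories)

print(f"datasets: {total_datasets}")
print(f"dataset-weighted average: {weighted_avg:.2f}")
print(f"simple category mean: {simple_mean:.2f}")
```

Under these assumed counts the datasets sum to 56 and the weighted average reproduces the reported 64.68, while the unweighted category mean is lower at about 63.11. Retrieval and classification therefore influence the headline number far more than summarization does.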
| Attribute | Details |
|---|---|
| Content type | Model |
| Digest | sha256:e5e025b14… |
| Size | 639.5 MB |
| Last updated | about 1 year ago |