mxbai-embed-large-v1 by Mixedbread — 1024D

State-of-the-art embedding model in the BERT-large size class. Mixedbread reports that it outperforms OpenAI text-embedding-3-large on MTEB and matches models 20x its size. Supports Matryoshka (truncatable) embeddings and binary quantization.
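Matryoshka truncation and binary quantization can both be applied to an embedding after it has been computed. The following is a minimal sketch using only the Python standard library, not Mixedbread's implementation. The function names `truncate_embedding`, `binarize`, and `hamming_distance` are hypothetical, the random vector stands in for a real model output, and the zero threshold used for binarization is an assumption (it is the common convention, but check the provider's documentation).

```python
import math
import random

# Matryoshka dimensions listed for this model on this page.
MATRYOSHKA_DIMS = (64, 128, 256, 512, 1024)


def truncate_embedding(vec, dim):
    """Keep the first `dim` components, then L2-normalize the result."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}")
    if len(vec) < dim:
        raise ValueError("vector is shorter than the requested dimension")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else list(head)


def binarize(vec):
    """Map each component to one bit (assumed threshold: 0), 8 bits per byte."""
    bits = [1 if x > 0 else 0 for x in vec]
    packed = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        packed.append(byte)
    return bytes(packed)


def hamming_distance(a, b):
    """Count differing bits between two packed binary embeddings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Stand-in for a real 1024-dimensional embedding (random values, not model output).
random.seed(0)
full = [random.gauss(0.0, 1.0) for _ in range(1024)]

small = truncate_embedding(full, 256)
code = binarize(full)
print(len(small), len(code))  # 256 128
```

Truncation trades retrieval quality for a smaller index, and binary codes are compared with Hamming distance instead of cosine similarity. How much quality is lost at each setting depends on the data and should be measured rather than assumed.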

At a glance

Modalities

Dimensions

1024

Max tokens

512

Parameters

335M

Price / 1M tokens

Type

Dense

Matryoshka dimensions

64, 128, 256, 512, 1024

Output types

Single Vector, Multi Vector

Language support

en

Benchmarks

MTEB AVG 64.68
MTEB RETRIEVAL 54.39
MTEB STS 85.00

Details

Release date 2024-03-01
License Apache 2.0
Model ID mxbai-embed-large-v1
Provider Mixedbread

Tags

text-embedding, matryoshka, binary-quantization, open-source

What You Need to Know About mxbai-embed-large-v1

Complete Specifications for mxbai-embed-large-v1 by Mixedbread

mxbai-embed-large-v1 produces 1024-dimensional dense embeddings, accepts up to 512 input tokens, and has 335M parameters. It is designed for semantic search, text classification, clustering, and retrieval-augmented generation, where capturing the relationship between texts is essential.
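For semantic search, the usual pattern is to embed the query and the documents and rank the documents by cosine similarity. The sketch below shows only the ranking step and assumes the embeddings already exist. The names `cosine_similarity` and `rank_documents` are hypothetical, and the short toy vectors stand in for real 1024-dimensional model outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=3):
    """Return (document index, score) pairs, best match first."""
    scored = [(cosine_similarity(query_vec, d), i) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(i, score) for score, i in scored[:top_k]]


# Toy 2-dimensional stand-ins for document and query embeddings.
docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.1]

for index, score in rank_documents(query, docs, top_k=2):
    print(index, round(score, 3))
```

A linear scan like this is adequate for small collections. Larger collections typically use an approximate nearest neighbor index, which is outside the scope of this sketch.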

Pricing and Cost Efficiency for mxbai-embed-large-v1

Review the price per 1M tokens for mxbai-embed-large-v1 and compare it against other embedding models from Mixedbread and competitors. Embedding costs add up quickly when scaling vector search and RAG applications to millions of documents, so confirm the current price before budgeting. Because the model is released under Apache 2.0, self-hosting is also an option.
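A rough budget has two parts: the one-time cost of embedding the corpus and the storage needed for the resulting vectors. The sketch below shows the arithmetic only. The price of 0.10 USD per 1M tokens and the corpus size are placeholders, not figures from this page, and the helper names `embedding_cost_usd` and `index_size_bytes` are hypothetical. Storage is for raw vectors only and ignores index overhead.

```python
def embedding_cost_usd(total_tokens, price_per_million_usd):
    """Cost of embedding `total_tokens` tokens at a per-1M-token price."""
    return total_tokens / 1_000_000 * price_per_million_usd


def index_size_bytes(num_vectors, dim=1024, bits_per_component=32):
    """Raw vector storage: float32 uses 32 bits per component, binary uses 1."""
    return num_vectors * dim * bits_per_component // 8


# Placeholder inputs: replace with your own corpus size and the provider's price.
num_docs = 1_000_000
avg_tokens_per_doc = 400
placeholder_price = 0.10  # USD per 1M tokens, NOT the actual price of this model

cost = embedding_cost_usd(num_docs * avg_tokens_per_doc, placeholder_price)
print(f"embedding cost: ${cost:.2f}")

print(index_size_bytes(num_docs, dim=1024, bits_per_component=32))  # 4096000000
print(index_size_bytes(num_docs, dim=256, bits_per_component=32))   # 1024000000
print(index_size_bytes(num_docs, dim=1024, bits_per_component=1))   # 128000000
```

For one million documents, full 1024-dimensional float32 vectors take about 4.1 GB, truncating to 256 dimensions cuts that to about 1.0 GB, and binary quantization of the full vector takes 128 MB, a 32x reduction.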

Use Cases and Applications for mxbai-embed-large-v1

Explore the ideal use cases for mxbai-embed-large-v1. Whether you are building a semantic search engine, recommendation system, document classification pipeline, or retrieval system over English text, understanding this model's capabilities, 512-token input limit, and dimensionality will help you choose the right embedding strategy for your project. The listed language support is English only, so multilingual retrieval is better served by a different model.
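One practical consequence of the 512-token limit is that longer documents have to be split before embedding. The sketch below splits a list of token IDs into overlapping windows. The function name `chunk_tokens` and the overlap of 64 tokens are assumptions for illustration, and the token IDs are expected to come from the model's own tokenizer, which is not shown here. If the tokenizer adds special tokens to each input, the usable budget per chunk is slightly below 512, so lower `max_tokens` accordingly.

```python
def chunk_tokens(token_ids, max_tokens=512, overlap=64):
    """Split token IDs into windows of at most `max_tokens`, sharing `overlap` tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_tokens])
        # Stop once this window already reaches the end of the input.
        if start + max_tokens >= len(token_ids):
            break
    return chunks


# Stand-in for a tokenized document of 1000 tokens.
tokens = list(range(1000))
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])  # [512, 512, 104]
```

Each chunk is embedded separately, and search results are mapped back to the parent document. The overlap reduces the chance that a relevant passage is cut in half at a chunk boundary.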