Jina Embeddings v5 Text Small Text-Matching by Jina AI — 1024D

A 677M-parameter variant of v5-text-small that targets text matching. It is optimized for symmetric pairwise similarity scoring, semantic textual similarity (STS), paraphrase detection, and near-duplicate detection. It produces 1024-dimensional embeddings that support Matryoshka truncation to smaller sizes, and it handles 119+ languages with inputs of up to 32K tokens. The model is available in GGUF, ONNX, and BF16 formats and is compatible with vLLM, TEI, llama.cpp, and sentence-transformers.
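Matryoshka truncation means a shorter embedding can be obtained by keeping only the leading components of the full vector. The following is a minimal NumPy sketch of that idea together with a symmetric cosine score, assuming the common convention of rescaling to unit length after truncation. The random vectors are stand-ins for real model outputs, and `truncate_embedding` and `cosine_similarity` are illustrative helper names, not part of any Jina AI API.

```python
import numpy as np

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    vec = np.asarray(vec, dtype=np.float64)
    if not 0 < dim <= vec.shape[-1]:
        raise ValueError("dim must be between 1 and the embedding size")
    head = vec[..., :dim]
    norm = np.linalg.norm(head, axis=-1, keepdims=True)
    return head / np.clip(norm, 1e-12, None)

def cosine_similarity(a, b):
    """Symmetric score: cosine_similarity(a, b) == cosine_similarity(b, a)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in data (assumption): random 1024-dim vectors instead of model outputs.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=1024)
emb_b = emb_a + 0.1 * rng.normal(size=1024)  # a noisy copy of emb_a

# Truncate both vectors to 256 dimensions, then score the pair.
short_a = truncate_embedding(emb_a, 256)
short_b = truncate_embedding(emb_b, 256)
score = cosine_similarity(short_a, short_b)
```

Both texts in a pair must be truncated to the same size before scoring; how much accuracy is lost at each size depends on the model and the task, so it is worth measuring on your own data.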

At a glance

Modalities	Text
Dimensions	1024
Max tokens	32K
Parameters	677M
Price / 1M tokens	not listed
Type	Dense
Matryoshka dimensions	32, 64, 128, 256, 512, 768, 1024
Output types	Dense, Late Interaction
Language support	Multilingual
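
The choice of Matryoshka dimension directly sets the raw storage needed per vector. The sketch below works through that arithmetic, assuming uncompressed float32 storage (4 bytes per value) and a hypothetical corpus of one million vectors; it ignores any index overhead, and `index_size_bytes` is an illustrative helper name.

```python
# Matryoshka sizes as listed for this model.
MATRYOSHKA_DIMS = [32, 64, 128, 256, 512, 768, 1024]
BYTES_PER_FLOAT32 = 4  # assumption: vectors stored as uncompressed float32

def index_size_bytes(num_vectors, dim, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw vector storage only; excludes index structures and metadata."""
    return num_vectors * dim * bytes_per_value

# Hypothetical corpus size chosen for illustration.
NUM_VECTORS = 1_000_000
sizes = {dim: index_size_bytes(NUM_VECTORS, dim) for dim in MATRYOSHKA_DIMS}

for dim, size in sizes.items():
    print(f"{dim:>5} dims: {size / 1e9:.3f} GB")
```

Under these assumptions, storage scales linearly with the dimension, so halving the dimension halves the raw footprint.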

Details

Release date	2026-02-18
License	CC BY-NC 4.0
Model ID	jina-embeddings-v5-text-small-text-matching
Provider	Jina AI

What You Need to Know About Jina Embeddings v5 Text Small Text-Matching

Complete Specifications for Jina Embeddings v5 Text Small Text-Matching by Jina AI

This page lists the specifications for Jina Embeddings v5 Text Small Text-Matching: an output dimensionality of 1024, a maximum input length of 32K tokens, text input, and the price per million tokens where one is available. This variant targets tasks where the relationship between two texts is what matters, such as semantic textual similarity, paraphrase detection, and near-duplicate detection.

Pricing and Cost Efficiency for Jina Embeddings v5 Text Small Text-Matching

No price per million tokens is currently listed for Jina Embeddings v5 Text Small Text-Matching, and the model is released under the CC BY-NC 4.0 license. When comparing it against other embedding models from Jina AI and competitors, keep in mind that embedding costs grow with corpus size, which matters when scaling vector search and RAG applications to millions of documents.

Use Cases and Applications for Jina Embeddings v5 Text Small Text-Matching

Whether you are building a semantic search engine, a recommendation system, a document classification pipeline, or a multilingual retrieval system, this model's capabilities and dimensionality (1024 dimensions, with Matryoshka truncation to smaller sizes) should inform your choice of embedding strategy.
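
Near-duplicate detection, one of the tasks named in the model description, can be sketched as thresholding pairwise cosine similarity. The example below is a minimal illustration under stated assumptions: the vectors are random stand-ins for real embeddings, the 0.9 threshold is arbitrary and would need tuning on real data, and `near_duplicate_pairs` is a hypothetical helper name. It compares every pair, which is only practical for small collections.

```python
import numpy as np

def near_duplicate_pairs(embeddings, threshold=0.9):
    """Return index pairs (i, j), i < j, whose cosine similarity >= threshold."""
    x = np.asarray(embeddings, dtype=np.float64)
    # Normalize rows so the dot product equals cosine similarity.
    x = x / np.clip(np.linalg.norm(x, axis=1, keepdims=True), 1e-12, None)
    sims = x @ x.T
    # Look only at the upper triangle to skip self-pairs and mirrored pairs.
    i_idx, j_idx = np.triu_indices(len(x), k=1)
    mask = sims[i_idx, j_idx] >= threshold
    return [(int(i), int(j)) for i, j in zip(i_idx[mask], j_idx[mask])]

# Stand-in data (assumption): three unrelated vectors plus a noisy copy of row 0.
rng = np.random.default_rng(1)
base = rng.normal(size=(3, 64))
vectors = np.vstack([base, base[0] + 0.01 * rng.normal(size=64)])

pairs = near_duplicate_pairs(vectors, threshold=0.9)
```

For larger collections, an approximate nearest-neighbor index would typically replace the full similarity matrix.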