BGE-M3

Property	Value
Author	BAAI
License	MIT
Paper	View Paper
Dimension	1024
Max Sequence Length	8192 tokens

What is BGE-M3?

BGE-M3 is a text embedding model from BAAI, named for three properties: Multi-Functionality, Multi-Linguality, and Multi-Granularity. It handles text in more than 100 languages, accepts inputs ranging from short sentences to documents of up to 8192 tokens, and supports several retrieval methods from a single model.

Implementation Details

The model supports three retrieval modes: dense retrieval, which uses a single embedding vector per text; sparse retrieval, which uses per-token weights for lexical matching; and multi-vector retrieval, which follows the ColBERT late-interaction approach. It is built on the XLM-RoBERTa architecture, with the context length extended to 8192 tokens.

Dense retrieval generates a single embedding vector per text for efficient similarity search
Sparse retrieval produces token-level weights for lexical matching, playing a role similar to BM25
Multi-vector retrieval keeps one vector per token, enabling fine-grained matching between query and document
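
The three modes differ mainly in how a query is scored against a document. The following is a minimal sketch of each scoring rule on toy data, not the model's own code: the function names and the example vectors and weights are made up for illustration, and a real system would take these values from the model's outputs.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def dense_score(q_vec, d_vec):
    """Dense: one vector per text, compared by cosine similarity."""
    return dot(normalize(q_vec), normalize(d_vec))

def sparse_score(q_weights, d_weights):
    """Sparse: sum of weight products over tokens that occur in both texts."""
    return sum(w * d_weights[tok] for tok, w in q_weights.items() if tok in d_weights)

def multi_vector_score(q_vecs, d_vecs):
    """Multi-vector: each query token takes its best-matching document token;
    the per-token maxima are then averaged (ColBERT-style late interaction)."""
    q_norm = [normalize(v) for v in q_vecs]
    d_norm = [normalize(v) for v in d_vecs]
    return sum(max(dot(q, d) for d in d_norm) for q in q_norm) / len(q_norm)

# Toy inputs (hypothetical values, 2 dimensions instead of 1024).
dense = dense_score([1.0, 0.0], [1.0, 1.0])
sparse = sparse_score({"bge": 0.5, "model": 0.2}, {"bge": 0.4, "embedding": 0.3})
multi = multi_vector_score([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 1.0]])

print(dense, sparse, multi)
```

Only the token "bge" is shared in the sparse example, so only its weights contribute; this is what makes the sparse mode behave like lexical matching.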

Core Capabilities

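Processes inputs from short sentences to long documents (up to 8192 tokens)
Supports 100+ languages, with state-of-the-art results reported on multilingual and cross-lingual retrieval benchmarks
Unified architecture: one model serves dense, sparse, and multi-vector retrieval
Self-knowledge distillation, in which the combined scores of the three retrieval modes act as the teacher signal during training
Efficient batching for long text processing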
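
The page does not say how the batching is implemented. One common way to make long-text batches cheaper, shown here purely as an assumption, is to group inputs of similar length so that short texts are not padded up to the length of a long one. The function names and token counts below are hypothetical.

```python
MAX_TOKENS = 8192  # maximum sequence length from the property table

def clipped(length):
    """Length of an input after truncation to the model's maximum."""
    return min(length, MAX_TOKENS)

def length_bucketed_batches(token_counts, batch_size):
    """Return batches of input indices, sorted so similar lengths share a batch."""
    order = sorted(range(len(token_counts)), key=lambda i: clipped(token_counts[i]))
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

def padded_tokens(token_counts, batches):
    """Total tokens processed when each batch is padded to its longest member."""
    return sum(
        max(clipped(token_counts[i]) for i in batch) * len(batch)
        for batch in batches
    )

# Hypothetical token counts for six inputs, mixing short and long texts.
token_counts = [12, 8000, 15, 7900, 20, 9000]

naive = [[0, 1, 2], [3, 4, 5]]  # batches in arrival order
bucketed = length_bucketed_batches(token_counts, batch_size=3)

naive_cost = padded_tokens(token_counts, naive)
bucketed_cost = padded_tokens(token_counts, bucketed)
print(naive_cost, bucketed_cost)
```

In this toy case the bucketed batches process roughly half as many padded tokens as the arrival-order batches, because the three short inputs no longer share a batch with a long one.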

Frequently Asked Questions

Q: What makes this model unique?

BGE-M3 combines three retrieval methods (dense, sparse, and multi-vector) in a single model while supporting over 100 languages and inputs of up to 8192 tokens. This combination makes it well suited to RAG applications and cross-lingual information retrieval.

Q: What are the recommended use cases?

The model is suited to multilingual search systems, document retrieval applications, and RAG pipelines. It is particularly effective in a hybrid retrieval setup, where scores from more than one retrieval mode are combined and the top candidates are then re-ranked, which makes it a reasonable choice for production information retrieval systems.
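
A hybrid setup can be sketched as a weighted sum of the per-mode scores, followed by a cut to a shortlist that a separate re-ranker would then reorder. The weights, document IDs, and scores below are illustrative assumptions, not recommended settings, and the re-ranker itself is left out.

```python
def hybrid_score(scores, weights):
    """Weighted sum of the per-mode scores for one candidate document."""
    return sum(weights[mode] * scores[mode] for mode in weights)

def shortlist_candidates(candidates, weights, top_k):
    """Rank candidates by hybrid score and keep the top_k for re-ranking."""
    scored = [(doc_id, hybrid_score(s, weights)) for doc_id, s in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical per-mode scores for three candidate documents.
candidates = {
    "doc_a": {"dense": 0.82, "sparse": 0.10, "multi_vector": 0.75},
    "doc_b": {"dense": 0.78, "sparse": 0.35, "multi_vector": 0.80},
    "doc_c": {"dense": 0.40, "sparse": 0.05, "multi_vector": 0.42},
}

# Hypothetical weights; in practice these would be tuned on validation data.
weights = {"dense": 0.4, "sparse": 0.2, "multi_vector": 0.4}

shortlist = shortlist_candidates(candidates, weights, top_k=2)
for doc_id, score in shortlist:
    print(doc_id, round(score, 3))
```

Here doc_b overtakes doc_a despite a lower dense score, because its sparse and multi-vector scores are higher; that is the kind of case where combining modes changes the ranking.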
