bge-base-en-v1.5

Maintained By
BAAI

BGE Base English v1.5

Parameter Count: 109M
Model Type: Feature Extraction / Embedding
License: MIT
Primary Language: English

What is bge-base-en-v1.5?

BGE Base English v1.5 is a powerful embedding model developed by BAAI that achieves impressive performance on the MTEB benchmark. It's designed specifically for text embeddings and semantic search, featuring improvements in similarity distribution and enhanced retrieval capabilities compared to previous versions. With 109M parameters, it offers an excellent balance between model size and performance.

Implementation Details

The model uses a BERT-based architecture optimized for generating text embeddings. It supports multiple deployment options including FlagEmbedding, Sentence-Transformers, Langchain, and Hugging Face Transformers. A key feature is its ability to handle both short queries and long passages effectively, with optional query instructions for enhanced retrieval performance.

  • Normalized embeddings for cosine similarity computation
  • Support for both CPU and GPU inference
  • Maximum sequence length of 512 tokens
  • Optimized version 1.5 with improved similarity distribution

Core Capabilities

  • Strong performance on MTEB benchmark with 63.55 average score
  • Excellent retrieval capabilities (53.25 on retrieval tasks)
  • Robust clustering performance (45.77 score)
  • High accuracy on pair classification tasks (86.55)

Frequently Asked Questions

Q: What makes this model unique?

The model offers a superior balance between size and performance, with Version 1.5 specifically addressing similarity distribution issues and improving retrieval performance without requiring explicit instructions.

Q: What are the recommended use cases?

It's ideal for semantic search, document retrieval, text similarity comparison, and clustering applications. The model performs particularly well in short query to long passage retrieval scenarios.
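For the short-query-to-long-passage scenario, the BGE documentation recommends prefixing the query (but not the passages) with a retrieval instruction. A sketch using plain Hugging Face Transformers with CLS pooling (the instruction string is the one published in the model card; the query and passages are illustrative):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-base-en-v1.5")
model = AutoModel.from_pretrained("BAAI/bge-base-en-v1.5")
model.eval()

# Instruction prefix for short-query-to-long-passage retrieval (queries only).
instruction = "Represent this sentence for searching relevant passages: "
query = instruction + "what is a transformer model?"
passages = [
    "The transformer is a deep learning architecture built on self-attention.",
    "A power transformer transfers electrical energy between circuits.",
]

with torch.no_grad():
    inputs = tokenizer(
        [query] + passages,
        padding=True,
        truncation=True,
        max_length=512,  # model's maximum sequence length
        return_tensors="pt",
    )
    outputs = model(**inputs)
    # BGE uses the [CLS] token embedding, L2-normalized.
    emb = torch.nn.functional.normalize(outputs.last_hidden_state[:, 0], dim=-1)

# Dot products of normalized vectors = cosine similarity scores.
scores = emb[0] @ emb[1:].T
print(scores)
```

The highest-scoring passage is the retrieval result; with v1.5 the instruction prefix is optional and mainly helps when queries are much shorter than the passages.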
