GTE-Large-EN-V1.5
| Property | Value |
|---|---|
| Parameter Count | 434M |
| Model Type | Text Embeddings |
| Context Length | 8192 tokens |
| Embedding Dimension | 1024 |
| License | Apache 2.0 |
| Paper | mGTE Paper |
What is gte-large-en-v1.5?
GTE-Large-EN-V1.5 is a text embedding model developed by Alibaba Group's Institute for Intelligent Computing. It extends the supported context length to 8192 tokens while remaining competitive with similarly sized models on the MTEB benchmark. The model is built on a transformer++ encoder backbone, which augments the BERT architecture with rotary position embeddings (RoPE) and gated linear units (GLU).
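The model is published on Hugging Face as Alibaba-NLP/gte-large-en-v1.5. The sketch below, which assumes that checkpoint and the card's CLS-pooling convention, shows one way to extract embeddings with the transformers library.

```python
# Minimal embedding sketch for gte-large-en-v1.5 (assumes the Hugging Face
# checkpoint "Alibaba-NLP/gte-large-en-v1.5"; trust_remote_code is needed
# because the transformer++ backbone ships as custom modeling code).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "Alibaba-NLP/gte-large-en-v1.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

texts = [
    "what is the capital of France?",
    "Paris is the capital and largest city of France.",
]
batch = tokenizer(texts, max_length=8192, padding=True,
                  truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**batch)

# CLS pooling followed by L2 normalization, as described on the model card.
embeddings = F.normalize(outputs.last_hidden_state[:, 0], p=2, dim=1)
print(embeddings.shape)  # (2, 1024)
```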
Implementation Details
The model underwent a multi-stage training process: masked language modeling (MLM) on c4-en, weakly supervised contrastive pre-training, and supervised contrastive fine-tuning. Training progressed through increasing sequence lengths (512, 2048, and 8192), with learning rates and batch sizes adjusted at each stage; a sketch of the contrastive objective follows the feature list below.
- Achieves 65.39 average score on MTEB benchmark
- Supports context length up to 8192 tokens
- Uses transformer++ architecture with RoPE and GLU
- Implements advanced multi-stage training strategy
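Both contrastive stages train on query–passage pairs with an in-batch objective. The following is a generic InfoNCE-style sketch of such a loss, not the authors' exact recipe; the temperature value and the use of purely in-batch negatives are assumptions for illustration.

```python
# Generic in-batch InfoNCE loss over L2-normalized query/passage embeddings;
# an illustration of contrastive fine-tuning, not the exact mGTE training code.
import torch
import torch.nn.functional as F

def info_nce_loss(q_emb: torch.Tensor, p_emb: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """q_emb, p_emb: (batch, dim); row i of p_emb is the positive for row i of q_emb."""
    q = F.normalize(q_emb, dim=-1)
    p = F.normalize(p_emb, dim=-1)
    logits = q @ p.T / temperature        # (batch, batch) similarity matrix
    targets = torch.arange(q.size(0))     # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Toy check with random embeddings at the model's 1024-dim output size.
loss = info_nce_loss(torch.randn(8, 1024), torch.randn(8, 1024))
print(loss.item())
```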
Core Capabilities
- High-quality text embeddings for similarity tasks
- Strong performance across classification, clustering, and retrieval tasks (see the retrieval sketch after this list)
- Efficient handling of long documents
- State-of-the-art results on long-context retrieval benchmarks
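For retrieval-style workloads, the sentence-transformers wrapper yields the same embeddings with less boilerplate. The snippet below, again assuming the Alibaba-NLP/gte-large-en-v1.5 checkpoint, ranks a few documents against a query by cosine similarity.

```python
# Rank documents against a query by cosine similarity; assumes the
# sentence-transformers package and the Alibaba-NLP/gte-large-en-v1.5 checkpoint.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True)

query = "How do transformers handle long documents?"
docs = [
    "Rotary position embeddings let encoders extrapolate to longer sequences.",
    "The recipe calls for two cups of flour and a pinch of salt.",
    "Sparse attention reduces the quadratic cost of self-attention.",
]

query_emb = model.encode(query, normalize_embeddings=True)
doc_embs = model.encode(docs, normalize_embeddings=True)

scores = util.cos_sim(query_emb, doc_embs)[0]  # one cosine score per document
for doc, score in sorted(zip(docs, scores.tolist()), key=lambda x: -x[1]):
    print(f"{score:.3f}  {doc}")
```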
Frequently Asked Questions
Q: What makes this model unique?
A: The model's ability to handle 8192 tokens while maintaining state-of-the-art performance sets it apart. Its transformer++ backbone and multi-stage training procedure underpin its strong text representations.
Q: What are the recommended use cases?
A: The model excels at text similarity, document retrieval, semantic search, and classification. It is particularly effective for applications that require understanding and comparing long documents.