all-MiniLM-L12-v2

Maintained By
sentence-transformers

Parameter Count: 33.4M
License: Apache 2.0
Embedding Dimension: 384
Training Data: 1B+ sentence pairs

What is all-MiniLM-L12-v2?

all-MiniLM-L12-v2 is a powerful sentence embedding model that converts text into 384-dimensional vector representations. Built on the microsoft/MiniLM-L12-H384-uncased architecture, this model has been fine-tuned on over 1 billion sentence pairs across diverse datasets including Reddit comments, scientific papers, and question-answer pairs.
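As a minimal sketch of what working with these 384-dimensional vectors looks like, the snippet below compares two embeddings with cosine similarity, the standard metric for this model's output. Random vectors stand in for real model outputs here; in practice they would come from encoding sentences with all-MiniLM-L12-v2.

```python
import numpy as np

# Hypothetical stand-ins for model output: in practice these would be
# produced by encoding two sentences with all-MiniLM-L12-v2.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=384)  # a 384-dimensional sentence embedding
emb_b = rng.normal(size=384)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: the usual way to compare sentence embeddings."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

score = cosine_similarity(emb_a, emb_b)  # value in [-1, 1]
```

Scores closer to 1 indicate more semantically similar inputs; unrelated sentences typically score near 0.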

Implementation Details

The model utilizes a contrastive learning approach during fine-tuning, where it learns to identify the true sentence pair among randomly sampled alternatives. Input text longer than 256 word pieces is truncated, and sentence embeddings are generated by mean pooling the token embeddings, weighted by the attention mask.
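The mean-pooling step can be sketched as follows: token embeddings are averaged, but only over positions the attention mask marks as real tokens, so padding never influences the sentence vector. This is an illustrative numpy version, not the library's implementation.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Mean pooling over token embeddings, ignoring padding positions.

    token_embeddings: (seq_len, dim) per-token vectors from the transformer
    attention_mask:   (seq_len,) 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, None].astype(float)          # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)        # sum real tokens only
    counts = max(float(mask.sum()), 1e-9)                 # avoid divide-by-zero
    return summed / counts

# Toy input: 4 token positions, 384 dims, last position is padding.
tokens = np.ones((4, 384))
tokens[3] = 100.0  # padding garbage that must be ignored
mask = np.array([1, 1, 1, 0])
sentence_embedding = mean_pool(tokens, mask)  # averages only the 3 real tokens
```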

  • Built with PyTorch and compatible with ONNX, Rust, and OpenVINO
  • Trained on TPU v3-8 for 100k steps with batch size 1024
  • Uses AdamW optimizer with 2e-5 learning rate
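The in-batch contrastive objective mentioned above can be sketched in numpy: each anchor's true pair is the matching row of the positives matrix, every other row in the batch acts as a negative, and the loss is cross-entropy over the scaled similarity matrix. The scale value and function names here are illustrative, not taken from the training code.

```python
import numpy as np

def in_batch_contrastive_loss(anchors: np.ndarray, positives: np.ndarray,
                              scale: float = 20.0) -> float:
    """Cross-entropy over in-batch similarities: row i's true pair is column i."""
    # L2-normalise so dot products are cosine similarities
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                   # (batch, batch) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(len(a))
    return float(-log_probs[idx, idx].mean())    # correct "label" is the diagonal

rng = np.random.default_rng(1)
anchors = rng.normal(size=(8, 384))
positives = anchors + 0.05 * rng.normal(size=(8, 384))  # near-duplicates as true pairs
loss = in_batch_contrastive_loss(anchors, positives)    # small: pairs are easy to match
```

Minimising this loss pulls true pairs together and pushes the random in-batch alternatives apart, which is what makes large batch sizes (1024 here) valuable: more negatives per step.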

Core Capabilities

  • Semantic search and information retrieval
  • Text clustering and classification
  • Sentence similarity computation
  • Paraphrase mining and duplicate detection
  • Document embedding generation
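The semantic-search capability above boils down to ranking corpus embeddings by cosine similarity to a query embedding. The sketch below uses random vectors as hypothetical document embeddings; in a real system both query and corpus would be encoded by the model.

```python
import numpy as np

def semantic_search(query: np.ndarray, corpus: np.ndarray, top_k: int = 3) -> list:
    """Return indices of the top_k corpus vectors most similar to the query."""
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q                    # cosine similarity to each document
    return np.argsort(-scores)[:top_k].tolist()

rng = np.random.default_rng(2)
corpus = rng.normal(size=(100, 384))              # stand-ins for document embeddings
query = corpus[42] + 0.01 * rng.normal(size=384)  # a query very close to doc 42
hits = semantic_search(query, corpus)             # doc 42 ranks first
```

For large corpora, the same idea is usually served by an approximate nearest-neighbour index rather than a full matrix product.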

Frequently Asked Questions

Q: What makes this model unique?

The model's strength lies in its training across a diverse collection of datasets (21+) and an efficient 12-layer architecture that balances embedding quality against model size: it produces high-quality embeddings with a footprint of only 33.4M parameters.

Q: What are the recommended use cases?

The model excels in tasks requiring semantic understanding of text, including similarity search, clustering, and information retrieval. It's particularly effective for applications needing to compare or match text segments based on meaning rather than exact wording.
