# distilbert-base-nli-stsb-mean-tokens
| Property | Value |
|---|---|
| Parameter Count | 66.4M |
| License | Apache 2.0 |
| Paper | Sentence-BERT (Reimers & Gurevych, 2019) |
| Downloads | 786,549 |
| Architecture | DistilBERT with mean pooling |
## What is distilbert-base-nli-stsb-mean-tokens?
This is a sentence transformer model that maps sentences and paragraphs into a 768-dimensional dense vector space. Built on the DistilBERT architecture, the model is now deprecated because it produces lower-quality sentence embeddings than newer alternatives.
## Implementation Details
The model combines a DistilBERT base encoder with a mean pooling operation: text is passed through the transformer, and the resulting token embeddings are averaged, with the attention mask taken into account so that padding tokens do not skew the average. The model can be used through either the sentence-transformers library or HuggingFace Transformers, as shown in the sketches after the list below.
- Maximum sequence length: 128 tokens
- Output dimension: 768
- Supports multiple frameworks: PyTorch, TensorFlow, ONNX
- Includes attention mask-aware mean pooling
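
The simplest route is the sentence-transformers library. A minimal sketch, assuming the model id published on the Hugging Face Hub under the `sentence-transformers` organization (the example sentences are illustrative):

```python
from sentence_transformers import SentenceTransformer

# Load the deprecated model; the short name resolves to
# sentence-transformers/distilbert-base-nli-stsb-mean-tokens on the Hub.
model = SentenceTransformer("distilbert-base-nli-stsb-mean-tokens")

sentences = ["This is an example sentence.", "Each sentence is converted to a vector."]

# encode() returns a numpy array of shape (num_sentences, 768).
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768)
```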
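
With plain HuggingFace Transformers, the attention-mask-aware mean pooling described above has to be applied manually, since `AutoModel` only returns per-token embeddings. A sketch of that pooling step (the helper name `mean_pooling` and the test sentences are illustrative):

```python
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_ID = "sentence-transformers/distilbert-base-nli-stsb-mean-tokens"

def mean_pooling(model_output, attention_mask):
    # Token embeddings are the first element of the model output.
    token_embeddings = model_output[0]
    # Expand the mask so padding tokens contribute zero to the sum.
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    # Average only over real tokens; clamp avoids division by zero.
    return (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)

sentences = ["This is an example sentence.", "Each sentence is converted to a vector."]
encoded = tokenizer(sentences, padding=True, truncation=True,
                    max_length=128, return_tensors="pt")

with torch.no_grad():
    output = model(**encoded)

embeddings = mean_pooling(output, encoded["attention_mask"])
print(embeddings.shape)  # torch.Size([2, 768])
```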
## Core Capabilities
- Sentence and paragraph embedding generation
- Semantic similarity computation
- Text clustering applications
- Semantic search functionality
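
The similarity and search capabilities reduce to comparing embeddings by cosine similarity. A minimal semantic-search sketch; the corpus and query sentences are illustrative, not from the original model card:

```python
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("distilbert-base-nli-stsb-mean-tokens")

# Illustrative corpus and query.
corpus = [
    "A man is eating food.",
    "A man is eating a piece of bread.",
    "The girl is carrying a baby.",
]
query = "Someone is having a meal."

corpus_emb = model.encode(corpus)   # shape (3, 768)
query_emb = model.encode([query])   # shape (1, 768)

# Rank corpus sentences by cosine similarity to the query.
scores = cosine_similarity(query_emb, corpus_emb)[0]
for score, sent in sorted(zip(scores, corpus), reverse=True):
    print(f"{score:.3f}  {sent}")
```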
## Frequently Asked Questions
**Q: What makes this model unique?**
While this model pioneered sentence embeddings built on the DistilBERT architecture, it is now considered deprecated. Its distinguishing feature was the combination of DistilBERT with mean token pooling, trained on the NLI and STS benchmark (STSb) datasets.
**Q: What are the recommended use cases?**
Because the model is deprecated, newer sentence embedding models from SBERT.net are recommended instead. If it is used, it remains suitable for basic sentence similarity, clustering, and semantic search applications where state-of-the-art quality is not critical.