# distilbert-base-nli-stsb-mean-tokens
| Property | Value |
|---|---|
| Parameter Count | 66.4M |
| License | Apache 2.0 |
| Paper | Sentence-BERT (Reimers & Gurevych, 2019) |
| Downloads | 786,549 |
| Architecture | DistilBERT with mean pooling |
## What is distilbert-base-nli-stsb-mean-tokens?
This is a sentence transformer model that maps sentences and paragraphs into a 768-dimensional dense vector space. Built on the DistilBERT architecture, the model is now deprecated because it produces lower-quality sentence embeddings than newer alternatives.
## Implementation Details
The model combines a DistilBERT base encoder with a mean pooling operation: text is passed through the transformer, and the resulting token embeddings are averaged, with the attention mask taken into account so that padding tokens do not skew the average. The model can be used through either the sentence-transformers library or HuggingFace Transformers, as shown in the sketches after the list below.
- Maximum sequence length: 128 tokens
- Output dimension: 768
- Supports multiple frameworks: PyTorch, TensorFlow, ONNX
- Includes attention mask-aware mean pooling
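
The simplest route is the sentence-transformers library. A minimal sketch, assuming the model id published on the Hugging Face Hub under the `sentence-transformers` organization (the example sentences are illustrative):

```python
from sentence_transformers import SentenceTransformer

# Load the deprecated model; the short name resolves to
# sentence-transformers/distilbert-base-nli-stsb-mean-tokens on the Hub.
model = SentenceTransformer("distilbert-base-nli-stsb-mean-tokens")

sentences = ["This is an example sentence.", "Each sentence is converted to a vector."]

# encode() returns a numpy array of shape (num_sentences, 768).
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768)
```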
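
With plain HuggingFace Transformers, the attention-mask-aware mean pooling described above has to be applied manually, since `AutoModel` only returns per-token embeddings. A sketch of that pooling step (the helper name `mean_pooling` and the test sentences are illustrative):

```python
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_ID = "sentence-transformers/distilbert-base-nli-stsb-mean-tokens"

def mean_pooling(model_output, attention_mask):
    # Token embeddings are the first element of the model output.
    token_embeddings = model_output[0]
    # Expand the mask so padding tokens contribute zero to the sum.
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    # Average only over real tokens; clamp avoids division by zero.
    return (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)

sentences = ["This is an example sentence.", "Each sentence is converted to a vector."]
encoded = tokenizer(sentences, padding=True, truncation=True,
                    max_length=128, return_tensors="pt")

with torch.no_grad():
    output = model(**encoded)

embeddings = mean_pooling(output, encoded["attention_mask"])
print(embeddings.shape)  # torch.Size([2, 768])
```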
## Core Capabilities
- Sentence and paragraph embedding generation
- Semantic similarity computation
- Text clustering applications
- Semantic search functionality
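
The similarity and search capabilities reduce to comparing embeddings by cosine similarity. A minimal semantic-search sketch; the corpus and query sentences are illustrative, not from the original model card:

```python
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("distilbert-base-nli-stsb-mean-tokens")

# Illustrative corpus and query.
corpus = [
    "A man is eating food.",
    "A man is eating a piece of bread.",
    "The girl is carrying a baby.",
]
query = "Someone is having a meal."

corpus_emb = model.encode(corpus)   # shape (3, 768)
query_emb = model.encode([query])   # shape (1, 768)

# Rank corpus sentences by cosine similarity to the query.
scores = cosine_similarity(query_emb, corpus_emb)[0]
for score, sent in sorted(zip(scores, corpus), reverse=True):
    print(f"{score:.3f}  {sent}")
```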
## Frequently Asked Questions
**Q: What makes this model unique?**
While this model pioneered sentence embeddings built on the DistilBERT architecture, it is now considered deprecated. Its distinguishing feature was the combination of DistilBERT with mean token pooling, trained on the NLI and STS benchmark (STSb) datasets.
**Q: What are the recommended use cases?**
Because the model is deprecated, newer sentence embedding models from SBERT.net are recommended instead. If it is used, it remains suitable for basic sentence similarity, clustering, and semantic search applications where state-of-the-art quality is not critical.