Brief-details: GatorTron-base: A 345M-parameter clinical language model trained on 82B+ words of medical data, developed by the University of Florida and NVIDIA for healthcare NLP tasks.
Brief-details: F5-TTS is a cutting-edge text-to-speech model using flow matching, focused on producing fluent and faithful speech; licensed under CC-BY-NC-4.0.
Brief-details: Deprecated sentence embedding model with 66.4M params that maps text to 768-dim vectors. Built on DistilBERT, but no longer recommended because it produces low-quality sentence embeddings.
Brief-details: Unsupervised dense information retrieval model by Facebook using contrastive learning, with 821K+ downloads and strong text embedding capabilities.
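A minimal retrieval sketch for this kind of unsupervised dense retriever, assuming the checkpoint is the publicly released facebook/contriever and following the common mean-pooling recipe over token embeddings:

```python
# Sketch: dense retrieval embeddings via mean pooling.
# The checkpoint id "facebook/contriever" is assumed here.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("facebook/contriever")
model = AutoModel.from_pretrained("facebook/contriever")

def embed(texts):
    # Tokenize a batch and mean-pool token embeddings, ignoring padding.
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        token_embeddings = model(**inputs).last_hidden_state
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    return (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)

query_vec, doc_vec = embed(["where was marie curie born?",
                            "Maria Sklodowska was born in Warsaw."])
print(torch.dot(query_vec, doc_vec))  # dot-product relevance score
```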
Brief-details: A multilingual sentiment analysis model supporting 12 languages, distilled from mDeBERTa-v3; 135M parameters, with 88.29% agreement with its teacher model.
Brief-details: ResNet-18 A1 model with 11.7M parameters, trained on ImageNet-1k using LAMB optimizer and BCE loss. Achieves 71.49% top-1 accuracy.
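A short timm loading sketch for the A1-recipe checkpoint; the tag "resnet18.a1_in1k" is an assumption based on timm's naming convention for the A1 training recipe:

```python
# Sketch: load the ResNet-18 A1 weights and classify an image with timm.
import timm
import torch
from PIL import Image

model = timm.create_model("resnet18.a1_in1k", pretrained=True)  # assumed tag
model.eval()

# Build the preprocessing pipeline that matches the model's training config.
config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**config, is_training=False)

img = Image.new("RGB", (224, 224))  # replace with a real PIL image
with torch.no_grad():
    logits = model(transform(img).unsqueeze(0))
print(logits.softmax(-1).topk(5))  # top-5 ImageNet-1k class probabilities
```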
Brief-details: Cutting-edge diffusion model for surface normal estimation, built on Stable Diffusion. Features LCM for fast processing and zero-shot capabilities.
Brief-details: LayoutLM base model (113M params) for document AI, combining text and layout pre-training. Microsoft-developed, MIT licensed, with 2.4M+ downloads.
Brief-details: A powerful forced alignment model supporting 158 languages, built on the MMS-300M architecture (315M parameters) for precise audio-text synchronization.
Brief-details: Japanese BERT base model trained on Wikipedia, using IPA dictionary-based word tokenization. Features 12 layers, 768-dim hidden states, and a 32k vocab size.
Brief-details: Dutch speech recognition model based on XLSR-53, achieving 15.72% WER on Common Voice. Optimized for 16kHz audio with language model support.
Brief-details: T5-base is a versatile 223M-parameter text-to-text transformer model capable of NLP tasks like translation, summarization, and question answering across multiple languages.
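A brief sketch of T5's text-to-text interface via transformers, using the canonical t5-base checkpoint and a task prefix to select translation:

```python
# Sketch: every T5 task is phrased as text-to-text; the prefix picks the task.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

inputs = tokenizer("translate English to German: The house is wonderful.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```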
Brief-details: Specialized RoBERTa model fine-tuned for sentiment analysis of central bank communications, achieving 88% accuracy in classifying positive/negative sentiment.
Brief-details: State-of-the-art text embedding model with 434M parameters, supporting an 8192-token context length and achieving a 65.39 MTEB score. Built on a transformer++ architecture.
Brief-details: Multilingual sentiment analysis model based on XLM-RoBERTa, achieving 69.3% accuracy across languages for tweet classification.
Brief-details: SDXL 1.0 Base: Advanced text-to-image diffusion model from Stability AI. Features dual text encoders and improved generation quality over SD 1.5/2.1.
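A hedged diffusers sketch, assuming the checkpoint id stabilityai/stable-diffusion-xl-base-1.0 and a CUDA-capable GPU:

```python
# Sketch: text-to-image generation with the SDXL base pipeline in fp16.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # assumed checkpoint id
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("cuda")

# Both text encoders are driven from the same prompt by default.
image = pipe(prompt="a watercolor painting of a lighthouse at dawn").images[0]
image.save("lighthouse.png")
```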
Brief-details: CamemBERT is a powerful French language model based on the RoBERTa architecture with 110M parameters, trained on the OSCAR dataset for masked language modeling tasks.
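A minimal masked-language-modeling sketch, assuming the camembert-base checkpoint and its `<mask>` token:

```python
# Sketch: predict the masked French token with the fill-mask pipeline.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="camembert-base")  # assumed checkpoint id
for pred in fill_mask("Le camembert est un fromage <mask>."):
    print(pred["token_str"], round(pred["score"], 3))
```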
Brief-details: Base-sized English embedding model (109M params) optimized for retrieval and semantic search, achieving strong MTEB benchmark performance across tasks like clustering and reranking.
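A hedged semantic-search sketch with sentence-transformers; the checkpoint id BAAI/bge-base-en-v1.5 is an assumption about which base-sized English embedding model this entry refers to:

```python
# Sketch: rank passages against a query by cosine similarity of embeddings.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-base-en-v1.5")  # assumed checkpoint id

query = "how do transformers handle long documents?"
passages = [
    "Sparse attention reduces the quadratic cost of self-attention.",
    "The Eiffel Tower is located in Paris.",
]
query_emb = model.encode(query, normalize_embeddings=True)
passage_embs = model.encode(passages, normalize_embeddings=True)
print(passage_embs @ query_emb)  # higher score = more relevant
```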
Brief-details: DistilRoBERTa-base is a lightweight, distilled version of RoBERTa with 82.8M parameters, running roughly twice as fast as RoBERTa-base while maintaining strong language understanding capabilities.
Brief-details: DistilGPT2 is a compressed version of GPT-2 with 82M parameters, trained via knowledge distillation for faster, lighter text generation while maintaining strong performance.
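A minimal generation sketch using the distilgpt2 checkpoint via the transformers pipeline API:

```python
# Sketch: sample a short continuation from the distilled GPT-2 model.
from transformers import pipeline, set_seed

generator = pipeline("text-generation", model="distilgpt2")
set_seed(42)  # make the sample reproducible

result = generator("Knowledge distillation works by",
                   max_new_tokens=40, num_return_sequences=1)
print(result[0]["generated_text"])
```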
Brief-details: InfoXLM-Large: Microsoft's cross-lingual language model using an information-theoretic framework. 2.8M+ downloads, popular for multilingual NLP tasks.