Brief-details: AltCLIP is a bilingual CLIP model supporting Chinese and English, trained on the WuDao and LAION datasets. It delivers strong text-image retrieval performance and serves as the text encoder behind the bilingual AltDiffusion (Stable Diffusion) model.
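A minimal bilingual retrieval sketch, assuming the transformers AltCLIP classes and the "BAAI/AltCLIP" hub id; the image URL and captions are illustrative only:

```python
import requests
from PIL import Image
from transformers import AltCLIPModel, AltCLIPProcessor

model = AltCLIPModel.from_pretrained("BAAI/AltCLIP")
processor = AltCLIPProcessor.from_pretrained("BAAI/AltCLIP")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
# Score the same image against one English and one Chinese caption.
inputs = processor(text=["a photo of two cats", "两只猫的照片"],
                   images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=1)
print(probs)  # per-caption match probabilities
```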
Brief-details: An 8B-parameter Llama-3 model optimized with FP8 quantization, recovering 99.28% of the original model's accuracy while halving memory requirements.
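A hypothetical loading sketch with vLLM, which reads FP8 checkpoints natively; the repo id below is a placeholder, not taken from the entry above:

```python
from vllm import LLM, SamplingParams

# Placeholder repo id; substitute the actual FP8 checkpoint.
llm = LLM(model="neuralmagic/Meta-Llama-3-8B-Instruct-FP8")
params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain FP8 quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```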
Brief-details: A powerful 2B-parameter text-to-image model using the Next-DiT architecture with a Gemma-2B text encoder, optimized through supervised fine-tuning for high-quality image generation.
Brief-details: A powerful ControlNet model for SDXL that generates Midjourney-quality images conditioned on edge-detection maps, trained on 10M+ high-quality images.
Brief-details: A specialized ControlNet model for human pose detection and image generation, trained on OpenPose data with improved hand/face detection capabilities.
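A pose-conditioned generation sketch with diffusers; the checkpoint ids and the precomputed pose map are assumptions, not confirmed by the entry above:

```python
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_openpose",  # assumed repo id
    torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")

pose = Image.open("pose_map.png")  # precomputed OpenPose skeleton image
image = pipe("a dancer on stage", image=pose, num_inference_steps=30).images[0]
image.save("dancer.png")
```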
Brief-details: Helsinki-NLP's Spanish-to-Galician translation model, achieving a 67.6 BLEU score. Uses a transformer-align architecture with SentencePiece tokenization.
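Running it through the transformers translation pipeline is a one-liner; "Helsinki-NLP/opus-mt-es-gl" is the assumed hub id for this checkpoint:

```python
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-es-gl")
print(translator("El tiempo es muy bueno hoy.")[0]["translation_text"])
```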
Brief-details: EVA-Qwen2.5-32B is a large language model with 32.8B parameters, offered in multiple GGUF quantized versions that trade off size against quality for efficient deployment.
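A sketch of serving one of the GGUF quants with llama-cpp-python; the file name is a placeholder for whichever quantization level you download:

```python
from llama_cpp import Llama

llm = Llama(model_path="EVA-Qwen2.5-32B-Q4_K_M.gguf",  # placeholder file name
            n_ctx=4096, n_gpu_layers=-1)  # -1 offloads all layers to GPU
out = llm("Summarize the trade-off between Q4 and Q8 quantization.",
          max_tokens=128)
print(out["choices"][0]["text"])
```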
Brief-details: BLIP vision-language model trained on COCO dataset for image-text matching, supporting both understanding and generation tasks with state-of-the-art performance.
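A minimal image-text matching sketch; "Salesforce/blip-itm-base-coco" is the assumed hub id given the COCO training noted above:

```python
import requests
from PIL import Image
from transformers import BlipForImageTextRetrieval, BlipProcessor

repo = "Salesforce/blip-itm-base-coco"  # assumed hub id
processor = BlipProcessor.from_pretrained(repo)
model = BlipForImageTextRetrieval.from_pretrained(repo)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, text="two cats sleeping on a couch",
                   return_tensors="pt")
itm = model(**inputs).itm_score.softmax(dim=1)  # [no-match, match]
print(itm)
```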
Brief-details: Optimized SDXL-based text-to-image model focused on speed, with recommended settings for fast inference (5-7 steps) while preserving high-quality output.
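A hypothetical few-step inference sketch with diffusers, matching the 5-7 step recommendation; the repo id is a placeholder:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "some-org/fast-sdxl",  # placeholder: use the actual checkpoint id
    torch_dtype=torch.float16).to("cuda")

# Few-step checkpoints typically pair low step counts with low guidance.
image = pipe("a lighthouse at dusk, photorealistic",
             num_inference_steps=6, guidance_scale=2.0).images[0]
image.save("lighthouse.png")
```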
Brief-details: A specialized LoRA model for minimalist logo design, built on FLUX.1-dev. Features unique trigger words and dual combination capabilities for creating professional logos.
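A sketch of attaching such a LoRA to FLUX.1-dev in diffusers; the LoRA repo id and the trigger word are placeholders, so check the model card for the real ones:

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to("cuda")
pipe.load_lora_weights("some-org/minimalist-logo-lora")  # placeholder repo id

# The trigger word (placeholder here) activates the LoRA's style.
image = pipe("logomkrdsgn, minimalist coffee shop logo, flat vector",
             num_inference_steps=28, guidance_scale=3.5).images[0]
image.save("logo.png")
```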
Brief-details: SpaceLLaVA-lite is an enhanced spatial reasoning model built on MobileVLM, specialized in understanding object relationships in visual scenes through VQASynth techniques.
Brief-details: ALIGN-base is a dual-encoder vision-language model pairing EfficientNet and BERT encoders, trained on the COYO-700M dataset for zero-shot image classification and multi-modal embeddings.
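A zero-shot classification sketch, assuming the transformers ALIGN classes and the "kakaobrain/align-base" hub id:

```python
import requests
from PIL import Image
from transformers import AlignModel, AlignProcessor

processor = AlignProcessor.from_pretrained("kakaobrain/align-base")
model = AlignModel.from_pretrained("kakaobrain/align-base")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
labels = ["a photo of a cat", "a photo of a dog"]
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=1)
print(dict(zip(labels, probs[0].tolist())))
```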
Brief-details: A powerful 7B-parameter Mistral-based model with enhanced function-calling capabilities, achieving 90% accuracy on function calls and 84% on JSON outputs. Built for instruction following and structured outputs.
Brief-details: Text-to-image model specializing in versatile artistic styles, particularly strong at detailed backgrounds and anime-style artwork; 16K+ downloads.
Brief-details: A Helsinki-NLP translation model for Catalan-to-Italian conversion with a strong BLEU score of 48.6 and a chrF2 score of 0.69, built on the transformer-align architecture.
Brief-details: A versatile text-to-image model combining RadiantVibes and Paramount with the Dreamlike_Diversions LoRA, specializing in photorealistic and fantasy imagery.
Brief-details: Enformer - a Transformer-based model for gene expression prediction from DNA sequences, developed by DeepMind and ported to PyTorch. CC-BY-4.0 licensed.
Brief-details: DeepSeek-V2-Lite-Chat is a 15.7B-parameter MoE model with 2.4B active parameters, featuring Multi-head Latent Attention and efficient inference, deployable on a single 40GB GPU.
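A loading sketch with transformers; the hub id is assumed, and trust_remote_code is needed because the architecture ships custom modeling code:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "deepseek-ai/DeepSeek-V2-Lite-Chat"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, trust_remote_code=True,
    device_map="auto")

messages = [{"role": "user",
             "content": "Explain Multi-head Latent Attention briefly."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```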
Brief-details: LLaVA-v1.6-34b is a large-scale multimodal model with 34.8B parameters, handling image-text tasks and built on the Nous-Hermes-2-Yi-34B base.
Brief-details: A specialized LoRA for FLUX.1-dev that creates realistic Polaroid-style photos, released under a CC BY-NC 4.0 license. Perfect for vintage-inspired imagery.
Brief-details: A sophisticated 70B-parameter LLM optimized for roleplay and storytelling, with strong performance across multiple benchmarks and a 67.11 average score on the Open LLM Leaderboard.