EXAONE-Deep-7.8B-AWQ

Maintained By
LGAI-EXAONE

| Property | Value |
|---|---|
| Parameters | 7.8B (6.98B without embeddings) |
| Context Length | 32,768 tokens |
| Quantization | AWQ 4-bit group-wise (W4A16g128) |
| License | EXAONE AI Model License Agreement 1.1 - NC |
| Author | LG AI Research |

What is EXAONE-Deep-7.8B-AWQ?

EXAONE-Deep-7.8B-AWQ is a reasoning-focused language model developed by LG AI Research, optimized for complex tasks including mathematics and coding. This quantized version preserves most of the full-precision model's performance while cutting memory and compute requirements through 4-bit AWQ quantization. Architecturally, it has 32 layers and uses Grouped Query Attention (GQA) with 32 query heads and 8 key-value heads.
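
For reference, a minimal loading sketch with Hugging Face transformers; it assumes the repository id `LGAI-EXAONE/EXAONE-Deep-7.8B-AWQ`, the `autoawq` package for the 4-bit kernels, and a CUDA GPU:

```python
# Minimal sketch: load the AWQ-quantized checkpoint with transformers.
# Assumes `pip install transformers autoawq`; the repo id below is an
# assumption based on the maintainer name shown above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LGAI-EXAONE/EXAONE-Deep-7.8B-AWQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # AWQ runs 4-bit weights with fp16 activations (W4A16)
    device_map="auto",
    trust_remote_code=True,      # EXAONE models use a custom model class
)
```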

Implementation Details

The model uses a vocabulary of 102,400 tokens and group-wise 4-bit weight quantization (128-element groups, per the W4A16g128 scheme above). It is designed for efficient deployment while retaining performance comparable to larger models; LG AI Research reports it even outperforms OpenAI's o1-mini on reasoning benchmarks. A short config-inspection sketch follows the list below.

  • 32-layer architecture with GQA attention mechanism
  • 4-bit quantization for efficient deployment
  • 32K token context window
  • Optimized for reasoning and mathematical tasks
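
These details can be sanity-checked straight from the model config. A minimal sketch, assuming the standard transformers attribute names (the exact field names on the custom EXAONE config class are an assumption):

```python
# Sketch: read the architecture details off the Hugging Face config.
# Attribute names follow common transformers conventions and may differ
# slightly on the custom EXAONE config class.
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "LGAI-EXAONE/EXAONE-Deep-7.8B-AWQ", trust_remote_code=True
)

layers = getattr(config, "num_layers", None) or config.num_hidden_layers
print(layers)                            # expect 32
print(config.num_attention_heads)        # expect 32 (Q heads)
print(config.num_key_value_heads)        # expect 8 (KV heads, GQA)
print(config.vocab_size)                 # expect 102400
print(config.max_position_embeddings)    # expect 32768
```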

Core Capabilities

  • Advanced mathematical reasoning and problem-solving
  • Code generation and analysis
  • Long-context understanding (32K tokens)
  • Efficient inference with quantized weights
  • Support for multiple deployment frameworks (TensorRT-LLM, vLLM, SGLang)
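
Since vLLM is among the listed frameworks, here is a minimal offline-inference sketch; the repo id and the sampling settings (temperature 0.6, top-p 0.95, as commonly recommended for reasoning models) are assumptions:

```python
# Sketch: run the AWQ checkpoint with vLLM offline inference.
# Assumes `pip install vllm`; quantization="awq" selects vLLM's
# AWQ kernels for the 4-bit weights.
from vllm import LLM, SamplingParams

llm = LLM(
    model="LGAI-EXAONE/EXAONE-Deep-7.8B-AWQ",
    quantization="awq",
    trust_remote_code=True,
    max_model_len=32768,
)

params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=2048)
outputs = llm.generate(["Prove that the sum of two even integers is even."], params)
print(outputs[0].outputs[0].text)
```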

Frequently Asked Questions

Q: What makes this model unique?

The model combines strong reasoning capability with efficient 4-bit quantization, making it particularly effective at mathematical and coding tasks while remaining deployable in resource-constrained environments. Its 32K context window and GQA attention (which shrinks the KV cache relative to full multi-head attention) keep memory overhead manageable during long-context inference.

Q: What are the recommended use cases?

The model excels in scenarios requiring complex reasoning, particularly mathematics and coding. It is tuned for step-by-step problem solving and handles both English and Korean tasks. Best results come from using the chat template, so the model opens its response with its reasoning block, and from giving explicit output-format instructions, such as asking for a final answer in \boxed{} on math problems.
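
As an illustration, here is a generation sketch through the chat template; it reuses the `model` and `tokenizer` objects from the loading sketch above, and assumes the tokenizer ships a chat template that inserts the model's reasoning prefix:

```python
# Sketch: prompt the model through its chat template.
# Reuses the model/tokenizer loaded in the earlier sketch; the
# prompt wording follows a common convention for reasoning models.
messages = [
    {"role": "user",
     "content": ("How many positive integers less than 100 are divisible "
                 "by both 3 and 4? Please reason step by step, and put "
                 "your final answer within \\boxed{}.")},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # appends the assistant/reasoning prefix
    return_tensors="pt",
).to(model.device)

output = model.generate(inputs, max_new_tokens=2048, temperature=0.6,
                        top_p=0.95, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```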
