EXAONE-Deep-2.4B-AWQ

Maintained By
LGAI-EXAONE

EXAONE-Deep-2.4B-AWQ

PropertyValue
Parameter Count2.14B (without embeddings)
Context Length32,768 tokens
LicenseEXAONE AI Model License Agreement 1.1 - NC
QuantizationAWQ 4-bit group-wise weight-only (W4A16g128)
Architecture30 layers, GQA with 32 Q-heads and 8 KV-heads

What is EXAONE-Deep-2.4B-AWQ?

EXAONE-Deep-2.4B-AWQ is a sophisticated language model developed by LG AI Research, specifically optimized for complex reasoning tasks including mathematics and coding. This quantized version maintains high performance while reducing computational requirements through AWQ quantization.

Implementation Details

The model implements several advanced technical features that set it apart from comparable models:

  • Vocabulary size of 102,400 tokens
  • Grouped-Query Attention (GQA) architecture with 32 query heads and 8 KV heads
  • 4-bit quantization for efficient deployment
  • Tied word embeddings for parameter efficiency
  • Extended context length of 32,768 tokens

Core Capabilities

  • Superior performance in mathematical reasoning tasks
  • Advanced coding capabilities
  • Efficient processing of long-context inputs
  • Optimized for deployment across various frameworks (TensorRT-LLM, vLLM, SGLang)
  • Streaming inference support

Frequently Asked Questions

Q: What makes this model unique?

EXAONE-Deep-2.4B-AWQ stands out for its exceptional reasoning capabilities despite its relatively compact size, outperforming other models in its parameter range. The model's integration of GQA attention and extensive context window makes it particularly effective for complex reasoning tasks.

Q: What are the recommended use cases?

The model excels in mathematical problem-solving, coding tasks, and general reasoning applications. It's particularly well-suited for scenarios requiring step-by-step reasoning and long-context understanding. The model performs best when prompts begin with thought processes and includes specific instructions for reasoning steps.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.