gemma-3-27b-it-int4-awq

Maintained By
gaunernst

Gemma 3 27B Instruction-Tuned INT4

Property           Value
Original Model     Google Gemma 3 27B
Quantization       INT4
Context Window     128K tokens
Training Tokens    14 trillion
License            Gemma Terms of Use

What is gemma-3-27b-it-int4-awq?

This model is an INT4-quantized version of Google's Gemma 3 27B instruction-tuned model, converted to HF+AWQ format for easier deployment. Quantization cuts the memory footprint substantially while preserving the capabilities of the original model, and the model still accepts both text and image inputs, making it a versatile option for multimodal applications.
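
As a rough illustration, the checkpoint can be loaded with the Hugging Face transformers multimodal pipeline. This is a minimal sketch, not taken from the model card: the repo id gaunernst/gemma-3-27b-it-int4-awq (inferred from the card title), the example image URL, and the assumption that a recent transformers release with Gemma 3 support plus the autoawq package are installed are all illustrative.

```python
# Minimal loading sketch (repo id, image URL, and installed packages are assumptions,
# not details from the model card; transformers >= 4.50 with Gemma 3 support assumed).
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",                       # Gemma 3 accepts text + image input
    model="gaunernst/gemma-3-27b-it-int4-awq",  # assumed Hub id based on the card title
    device_map="auto",                          # place the INT4 weights on available GPUs
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/invoice.png"},  # placeholder image
            {"type": "text", "text": "Summarize this document in two sentences."},
        ],
    }
]

result = pipe(text=messages, max_new_tokens=256)
print(result[0]["generated_text"][-1]["content"])  # assistant turn appended to the chat
```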

Implementation Details

The INT4 weights come from quantization-aware training (QAT) and were converted from the original Flax checkpoint to the HF+AWQ format. The model retains the architecture's 128K-token context window and supports over 140 languages, enabling efficient deployment in resource-constrained environments while maintaining strong performance across a wide range of tasks.

  • Multimodal capabilities with support for text and image inputs (896x896 resolution)
  • Output generation up to 8192 tokens
  • Optimized for deployment on consumer hardware (a minimal serving sketch follows this list)
  • Converted to HF+AWQ format for broader compatibility
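
Because the weights ship in AWQ format, they can be loaded by inference engines with AWQ kernels such as vLLM. The sketch below is an assumption on my part rather than an instruction from the card; the repo id, the reduced max_model_len (to keep the KV cache within a single consumer GPU), and the sampling settings are illustrative.

```python
# Offline-inference sketch with vLLM (repo id, context cap, and sampling settings
# are assumptions, not values from the model card).
from vllm import LLM, SamplingParams

llm = LLM(
    model="gaunernst/gemma-3-27b-it-int4-awq",  # assumed Hub id
    quantization="awq",                         # use vLLM's AWQ kernels for the INT4 weights
    max_model_len=32768,                        # cap the 128K window to fit the KV cache on one GPU
)

params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.chat(
    [{"role": "user", "content": "Explain what INT4 quantization changes in two sentences."}],
    params,
)
print(outputs[0].outputs[0].text)
```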

Core Capabilities

  • Strong performance in reasoning and factuality tasks (85.6% on HellaSwag)
  • Robust STEM and code generation capabilities (82.6% on GSM8K)
  • Multilingual support with strong performance (74.3% on MGSM)
  • Advanced vision-language understanding (85.6% on DocVQA)

Frequently Asked Questions

Q: What makes this model unique?

The model combines the powerful capabilities of Gemma 3 with efficient INT4 quantization, making it possible to run a state-of-the-art multimodal model on consumer hardware while maintaining high performance across various benchmarks.
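
A quick back-of-the-envelope estimate (not an official figure from the card) shows why quantization matters for consumer hardware: raw weight storage drops roughly fourfold compared with a 16-bit checkpoint.

```python
# Rough weight-memory estimate: 27B parameters at 16-bit vs. 4-bit precision.
# Activations, KV cache, and AWQ scale factors add overhead on top of these numbers.
params = 27e9

bf16_gb = params * 2 / 1024**3    # ~50 GB of raw weights at BF16
int4_gb = params * 0.5 / 1024**3  # ~13 GB of raw weights at INT4

print(f"BF16 weights ~{bf16_gb:.0f} GB, INT4 weights ~{int4_gb:.0f} GB")
```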

Q: What are the recommended use cases?

The model excels at content creation, chatbots, text summarization, image analysis, research applications, and educational tools. It is particularly well suited to deployments where resource efficiency is crucial but output quality cannot be compromised.
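
If the model is served behind an OpenAI-compatible endpoint (for example with vllm serve), a chatbot or summarization client could look like the following sketch; the base URL, port, and model name are assumptions rather than details from the card.

```python
# Hypothetical client-side sketch against an OpenAI-compatible server
# (e.g. started with `vllm serve gaunernst/gemma-3-27b-it-int4-awq`).
# Base URL, API key, and model name are illustrative.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="gaunernst/gemma-3-27b-it-int4-awq",
    messages=[
        {"role": "user", "content": "Summarize the trade-offs of INT4 quantization for a 27B model."},
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```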

🍰 Interested in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.