all-hands_openhands-lm-7b-v0.1-GGUF

Maintained By
bartowski

OpenHands LM 7B GGUF Quantizations

  • Original Model: OpenHands LM 7B v0.1
  • Quantization Author: bartowski
  • Size Range: 2.78GB - 15.24GB
  • Format: GGUF (llama.cpp compatible)

What is all-hands_openhands-lm-7b-v0.1-GGUF?

This is a comprehensive collection of GGUF quantizations of the OpenHands LM 7B model, covering compression levels from the high-precision BF16 format down to the heavily compressed IQ2_M variant. This range lets users trade file size against output quality to fit their hardware constraints.

Implementation Details

The quantizations were created using llama.cpp release b5010 with imatrix optimization. The collection features multiple quantization methods including standard K-quants (Q2-Q8) and newer I-quants (IQ2-IQ4), with special variants utilizing Q8_0 for embedding and output weights to maintain quality in critical model components.
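For readers new to the workflow, imatrix quantization in llama.cpp follows a convert → calibrate → quantize pipeline. A minimal sketch is shown below; the file names and calibration text are placeholders, not the exact inputs bartowski used.

```bash
# Convert the original Hugging Face checkpoint to a full-precision GGUF
# (paths are hypothetical)
python convert_hf_to_gguf.py ./openhands-lm-7b-v0.1 --outfile model-f16.gguf

# Build an importance matrix from a calibration text file
llama-imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

# Quantize with the imatrix; Q4_K_M shown as one example target type
llama-quantize --imatrix imatrix.dat model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```

The imatrix weights quantization error by how much each tensor element matters on the calibration data, which is what preserves quality at the aggressive IQ2/IQ3 levels.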

  • Includes 24 different quantization variants
  • Implements online repacking for ARM and AVX CPU optimization
  • Supports both standard and advanced quantization techniques
  • Uses a ChatML-style prompt format with im_start/im_end tokens (see the template after this list)
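The template below follows the standard im_start/im_end (ChatML) convention used by Qwen-family models; the exact layout is inferred from that convention rather than quoted from the upstream card:

```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```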

Core Capabilities

  • Multiple compression levels suitable for different hardware configurations
  • Optimized performance on both CPU and GPU systems
  • Special quantizations for enhanced ARM/AVX performance
  • Compatible with LM Studio and any llama.cpp-based project (see the download/run sketch after this list)
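As one example of llama.cpp-based usage, the sketch below fetches a single variant and runs it interactively. The repository name is taken from this card's title, and the file name is assumed from bartowski's usual naming scheme:

```bash
# Download only the Q4_K_M file from the quantization repo
huggingface-cli download bartowski/all-hands_openhands-lm-7b-v0.1-GGUF \
  --include "*Q4_K_M*" --local-dir ./

# Start an interactive chat session with llama.cpp's CLI
llama-cli -m ./all-hands_openhands-lm-7b-v0.1-Q4_K_M.gguf -cnv \
  -p "You are a helpful coding assistant."
```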

Frequently Asked Questions

Q: What makes this model unique?

This collection stands out for its comprehensive range of quantization options, from high-quality Q8_0 down to the heavily compressed IQ2_M format, letting users pick the size/quality trade-off that fits their needs. The availability of both K-quants and I-quants, along with variants that keep embedding and output weights at Q8_0, provides solid options across different hardware configurations.

Q: What are the recommended use cases?

For most users, the Q4_K_M variant (4.68GB) is recommended as a balanced option. Users with limited RAM should consider IQ3_XXS (3.11GB) or IQ2_M (2.78GB). For maximum quality, the Q6_K_L variant (6.52GB) is recommended. GPU users should choose a variant with a file size 1-2GB smaller than their available VRAM.
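As a concrete example of that sizing rule, the 4.68GB Q4_K_M file fits fully on a 6GB GPU with room left for context. A minimal llama.cpp invocation might look like the following; the flag values are illustrative:

```bash
# Offload all layers to the GPU; lower -ngl if model + KV cache exceed VRAM
llama-cli -m ./all-hands_openhands-lm-7b-v0.1-Q4_K_M.gguf -ngl 99 -c 4096 -cnv
```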
