OpenAssistant-Llama2-13B-Orca-8K-3319-GGML

Maintained By
TheBloke

OpenAssistant-Llama2-13B-Orca-8K-3319-GGML

PropertyValue
Base ModelLLaMA 2 13B
Context Length8K tokens
LicenseLLaMA 2 Community License
PaperORCA Paper
QuantizationGGML (Multiple variants)

What is OpenAssistant-Llama2-13B-Orca-8K-3319-GGML?

This is a GGML-quantized version of the OpenAssistant LLaMA 2 13B model, fine-tuned on the Orca dataset with enhanced 8K context length support. The model uses RoPE scaling for improved long-context understanding and is optimized for both CPU and GPU inference.

Implementation Details

The model features multiple quantization variants ranging from 2-bit to 8-bit precision, offering different trade-offs between model size (5.74GB - 13.83GB) and performance. It implements the OpenAssistant conversation format and uses special tokens for system, prompter, and assistant roles.

  • Trained on Orca-Chat, RedPajama1T, and FanFics datasets
  • Uses linear scaling of RoPE embeddings for 8K context
  • Supports various quantization methods including q2_K through q8_0
  • Compatible with multiple inference frameworks including text-generation-webui and llama.cpp

Core Capabilities

  • Extended context handling up to 8K tokens
  • Efficient CPU/GPU inference through GGML quantization
  • Instruction following and chat functionality
  • Multiple quantization options for different hardware configurations

Frequently Asked Questions

Q: What makes this model unique?

The model combines the capabilities of LLaMA 2 with extended context length and efficient quantization, making it suitable for deployment on various hardware configurations while maintaining the ability to handle longer conversations.

Q: What are the recommended use cases?

The model is ideal for chatbots, content generation, and applications requiring longer context understanding. Different quantization versions allow deployment on hardware ranging from resource-constrained devices to high-performance systems.

The first platform built for prompt engineering