Pygmalion-2.7B
| Property | Value |
|---|---|
| Base Model | GPT-Neo 2.7B |
| License | CreativeML OpenRAIL-M |
| Training Data Size | 56MB of dialogue data |
| Training Steps | ~5k steps on 4 NVIDIA A40s |
What is Pygmalion-2.7B?
Pygmalion-2.7B is a dialogue model built on EleutherAI's GPT-Neo 2.7B architecture and fine-tuned on a curated selection of dialogue data, enabling natural, context-aware conversations with customizable character personas.
Implementation Details
The model was initialized from the uft-2.7b ConvoGPT model and further fine-tuned on approximately 48.5 million tokens. Training used DeepSpeed across four NVIDIA A40 GPUs, with a mix of real and partially machine-generated conversations to broaden dialogue coverage. Key features include:
- Custom persona framework for character-based interactions
- Specialized input formatting for optimal performance
- DeepSpeed optimization for efficient training
- Comprehensive dialogue history support
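The persona framework and input formatting above can be sketched as a simple prompt builder. This is a minimal illustration, not the model card's official code: the field labels (`[CHARACTER]'s Persona:`, `<START>`, `You:`) follow the prompt format commonly documented for Pygmalion models, and the function name is hypothetical.

```python
def build_prompt(char_name, persona, history, user_message):
    """Assemble a Pygmalion-style prompt: a persona block, the
    dialogue history, then the user's new message, ending with the
    character's name so the model completes the next reply."""
    lines = [f"{char_name}'s Persona: {persona}", "<START>"]
    lines.extend(history)                 # prior turns, e.g. "You: hi"
    lines.append(f"You: {user_message}")
    lines.append(f"{char_name}:")         # model generates from here
    return "\n".join(lines)

prompt = build_prompt(
    "Aria",
    "A cheerful travel guide who loves sharing local history.",
    ["You: Hello!", "Aria: Hi there! Ready to explore?"],
    "What should we see first?",
)
print(prompt)
```

The resulting string would be passed to the model (or entered via a compatible UI) as-is; the trailing `Aria:` cues the model to continue in character.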
Core Capabilities
- Advanced text generation with context awareness
- Character persona maintenance throughout conversations
- Support for complex dialogue structures
- Flexible implementation through both UI and manual formatting
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its ability to maintain consistent character personas while generating contextually appropriate responses, supported by a specialized input format that includes character definitions and dialogue history.
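Because the underlying GPT-Neo architecture has a 2,048-token context window, long conversations must be truncated before the dialogue history is included in the prompt. A minimal sketch of one approach, with an illustrative helper name and a fixed turn cap standing in for real token counting:

```python
def trim_history(history, max_turns=20):
    """Keep only the most recent turns so the assembled prompt stays
    within the model's context window. A real client would count
    tokens with the model's tokenizer; capping the number of turns
    is a simple stand-in for illustration."""
    return history[-max_turns:]

turns = [f"You: message {i}" for i in range(50)]
recent = trim_history(turns, max_turns=20)
print(len(recent))  # 20 most recent turns; older ones are dropped
```

Dropping the oldest turns (rather than the persona block) preserves the character definition, which is what keeps responses consistent across a long session.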
Q: What are the recommended use cases?
The model is designed for conversational applications, character-based interactions, and dialogue generation. Note that it is not suitable for use by minors, as it can generate X-rated content.