Pygmalion-2.7B
| Property | Value |
|---|---|
| Base Model | GPT-Neo 2.7B |
| License | CreativeML OpenRAIL-M |
| Training Data Size | 56MB of dialogue data |
| Training Steps | ~5k steps on 4 NVIDIA A40s |
What is Pygmalion-2.7B?
Pygmalion-2.7B is a dialogue model built on EleutherAI's GPT-Neo 2.7B architecture and fine-tuned on a curated selection of dialogue data, enabling natural, context-aware conversations with customizable character personas.
Implementation Details
The model was initialized from the uft-2.7b ConvoGPT model and further fine-tuned on approximately 48.5 million tokens. Training used DeepSpeed across four NVIDIA A40 GPUs, with a mix of real and partially machine-generated conversations to broaden dialogue coverage. Key features include:
- Custom persona framework for character-based interactions
- Specialized input formatting for optimal performance
- DeepSpeed optimization for efficient training
- Comprehensive dialogue history support
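The persona framework and input formatting above can be sketched as a simple prompt builder. This is a minimal illustration, not the model card's official code: the field labels (`[CHARACTER]'s Persona:`, `<START>`, `You:`) follow the prompt format commonly documented for Pygmalion models, and the function name is hypothetical.

```python
def build_prompt(char_name, persona, history, user_message):
    """Assemble a Pygmalion-style prompt: a persona block, the
    dialogue history, then the user's new message, ending with the
    character's name so the model completes the next reply."""
    lines = [f"{char_name}'s Persona: {persona}", "<START>"]
    lines.extend(history)                 # prior turns, e.g. "You: hi"
    lines.append(f"You: {user_message}")
    lines.append(f"{char_name}:")         # model generates from here
    return "\n".join(lines)

prompt = build_prompt(
    "Aria",
    "A cheerful travel guide who loves sharing local history.",
    ["You: Hello!", "Aria: Hi there! Ready to explore?"],
    "What should we see first?",
)
print(prompt)
```

The resulting string would be passed to the model (or entered via a compatible UI) as-is; the trailing `Aria:` cues the model to continue in character.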
Core Capabilities
- Advanced text generation with context awareness
- Character persona maintenance throughout conversations
- Support for complex dialogue structures
- Flexible implementation through both UI and manual formatting
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its ability to maintain consistent character personas while generating contextually appropriate responses, supported by a specialized input format that includes character definitions and dialogue history.
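Because the underlying GPT-Neo architecture has a 2,048-token context window, long conversations must be truncated before the dialogue history is included in the prompt. A minimal sketch of one approach, with an illustrative helper name and a fixed turn cap standing in for real token counting:

```python
def trim_history(history, max_turns=20):
    """Keep only the most recent turns so the assembled prompt stays
    within the model's context window. A real client would count
    tokens with the model's tokenizer; capping the number of turns
    is a simple stand-in for illustration."""
    return history[-max_turns:]

turns = [f"You: message {i}" for i in range(50)]
recent = trim_history(turns, max_turns=20)
print(len(recent))  # 20 most recent turns; older ones are dropped
```

Dropping the oldest turns (rather than the persona block) preserves the character definition, which is what keeps responses consistent across a long session.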
Q: What are the recommended use cases?
The model is designed for conversational applications, character-based interactions, and dialogue generation. Note that it is not suitable for use by minors, as it can generate X-rated content.