mt5-base-finetuned-fa

Maintained by: ahmeddbahaa

Property            Value
Base Model          google/mt5-base
Task                Text Summarization
Language            Farsi
Framework           PyTorch 1.11.0+cu113
Hugging Face Link   Model Repository

What is mt5-base-finetuned-fa?

mt5-base-finetuned-fa is a text summarization model built on Google's mT5-base architecture and fine-tuned for Farsi. It reports a ROUGE-1 score of 33.7, ROUGE-2 of 21.28, and ROUGE-L of 31.69, along with a BERTScore of 74.52.
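To make the ROUGE-1 figure more concrete, here is a minimal sketch of unigram-overlap F1. It assumes plain whitespace tokenization, which is a simplification: the reported scores were presumably produced by a standard ROUGE implementation with its own tokenization, and they appear to be on a 0 to 100 scale. The function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between two whitespace-tokenized texts."""
    ref_counts = Counter(reference.split())
    cand_counts = Counter(candidate.split())

    # Clipped overlap: each token counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

Multiplying the result by 100 gives a value on the same scale as the scores quoted above, although the numbers will not match an official ROUGE implementation exactly.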

Implementation Details

The model was trained with a learning rate of 0.0005, an effective batch size of 32 (reached through gradient accumulation), and the Adam optimizer with betas=(0.9, 0.999). Training ran for 5 epochs with a linear learning rate schedule and 250 warmup steps.

  • Gradient accumulation over 8 steps
  • Label smoothing factor of 0.1
  • Consistent generation length of 19.0 tokens
  • Validation metrics improved progressively across training epochs
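The hyperparameters above can be sketched in plain Python to show how the effective batch size and the linear schedule with warmup fit together. This is an illustration, not the original training script: the single-device setup, the helper names, and the `total_steps` value used in the usage note are assumptions, since the dataset size is not stated here.

```python
LEARNING_RATE = 5e-4        # from the model card
EFFECTIVE_BATCH_SIZE = 32   # from the model card
GRAD_ACCUM_STEPS = 8        # from the model card
WARMUP_STEPS = 250          # from the model card


def per_step_batch_size(effective: int = EFFECTIVE_BATCH_SIZE,
                        accum_steps: int = GRAD_ACCUM_STEPS,
                        num_devices: int = 1) -> int:
    """Batch size per forward pass; num_devices=1 is an assumption."""
    return effective // (accum_steps * num_devices)


def linear_lr(step: int, total_steps: int,
              base_lr: float = LEARNING_RATE,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under the single-device assumption, `per_step_batch_size()` gives 4 examples per forward pass, with gradients accumulated over 8 passes before each optimizer update. With a hypothetical `total_steps=1000`, `linear_lr(250, 1000)` returns the peak rate of 0.0005 and the rate decays to zero by the final step.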

Core Capabilities

  • Specialized in Farsi text summarization
  • ROUGE-1 of 33.7, ROUGE-2 of 21.28, and ROUGE-L of 31.69
  • Consistent output generation length
  • Built with PyTorch (1.11.0+cu113)

Frequently Asked Questions

Q: What makes this model unique?

This model is fine-tuned specifically for Farsi text summarization. It achieves competitive ROUGE scores while maintaining a consistent generation length, which makes it a reasonable candidate for production summarization workloads.

Q: What are the recommended use cases?

The model is best suited to Farsi text summarization tasks that need consistent, high-quality summaries. Its BERTScore of 74.52 suggests that generated summaries stay semantically close to the reference summaries.
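For readers who want to try the model, the following is a minimal sketch using the Hugging Face transformers library. The repository id is an assumption inferred from the maintainer and model names on this page, and the generation settings (`max_new_tokens`, `num_beams`) and the input truncation length are illustrative defaults rather than values documented for this model. The mT5 tokenizer also requires the sentencepiece package.

```python
# Assumed repository id, inferred from the maintainer and model names on this page.
MODEL_ID = "ahmeddbahaa/mt5-base-finetuned-fa"


def summarize(text: str, max_new_tokens: int = 32, num_beams: int = 4) -> str:
    """Generate a summary for a Farsi text. Downloads the model on first use."""
    # Imported here so the heavy dependency is only needed when summarizing.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long inputs; 512 tokens is an illustrative limit, not a documented one.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` with a Farsi article returns a single summary string. For repeated calls, loading the tokenizer and model once outside the function would avoid reloading them each time.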
