mt5-base-finetuned-fa
| Property | Value |
|---|---|
| Base Model | google/mt5-base |
| Task | Text Summarization |
| Language | Farsi |
| Framework | PyTorch 1.11.0+cu113 |
| Hugging Face Link | Model Repository |
What is mt5-base-finetuned-fa?
mt5-base-finetuned-fa is a specialized text summarization model built on Google's mT5-base architecture and fine-tuned for Farsi. It achieves a ROUGE-1 score of 33.7, ROUGE-2 of 21.28, and ROUGE-L of 31.69, along with an impressive BERTScore of 74.52.
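For orientation, here is a minimal inference sketch using the transformers Auto classes. The repository id is a placeholder (the table above links only to "Model Repository"), and the generation settings are illustrative assumptions rather than the configuration used to produce the reported scores.

```python
# Minimal inference sketch. "username/mt5-base-finetuned-fa" is a placeholder
# repo id -- substitute the actual Hugging Face repository for this model.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "username/mt5-base-finetuned-fa"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = "..."  # Farsi source text to summarize
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)

# max_length=32 is an assumption, chosen to accommodate the reported
# mean generation length of ~19 tokens.
summary_ids = model.generate(**inputs, max_length=32, num_beams=4, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```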
Implementation Details
The model was trained with a learning rate of 0.0005, an effective batch size of 32 (reached through gradient accumulation), and the Adam optimizer with betas=(0.9, 0.999). Training ran for 5 epochs with a linear learning-rate schedule and 250 warmup steps; a reconstruction of this configuration is sketched after the list below.
- Gradient accumulation over 8 steps
- Label smoothing with a factor of 0.1
- Consistent mean generation length of 19.0 tokens
- Steady improvement in validation metrics across training epochs
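These hyperparameters map naturally onto transformers' Seq2SeqTrainingArguments. The sketch below is one plausible reconstruction: only the numeric values come from this card, while the per-device batch size, output directory, and evaluation strategy are assumptions.

```python
# Hedged reconstruction of the reported training configuration.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-base-finetuned-fa",   # assumed output directory
    learning_rate=5e-4,
    per_device_train_batch_size=4,        # assumed: 4 x 8 accumulation steps = effective batch of 32
    gradient_accumulation_steps=8,
    num_train_epochs=5,
    lr_scheduler_type="linear",
    warmup_steps=250,
    label_smoothing_factor=0.1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    evaluation_strategy="epoch",          # assumed evaluation cadence
    predict_with_generate=True,
)
```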
Core Capabilities
- Specialized in Farsi text summarization
- Robust performance on standard ROUGE metrics (see the evaluation sketch after this list)
- Consistent output generation length
- Optimized for production deployment with PyTorch
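As a reference point for the ROUGE figures cited above, the sketch below shows one way to compute them with the Hugging Face evaluate library. The prediction and reference lists are placeholders, and note that the default rouge metric uses English-oriented tokenization, while the tokenization used in the original Farsi evaluation is not specified in this card.

```python
# Hedged ROUGE evaluation sketch; prediction/reference lists are placeholders.
import evaluate

predictions = ["..."]  # summaries generated by the model
references = ["..."]   # gold Farsi summaries

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=predictions, references=references)
# evaluate returns fractions in [0, 1]; multiply by 100 to compare with the
# reported ROUGE-1 33.7 / ROUGE-2 21.28 / ROUGE-L 31.69.
print({k: round(v * 100, 2) for k, v in scores.items()})
```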
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its specialized fine-tuning on Farsi text summarization, achieving competitive ROUGE scores while maintaining a consistent generation length. Its carefully tuned hyperparameters and documented training process make it a reliable choice for production applications.
Q: What are the recommended use cases?
The model is best suited for Farsi text summarization tasks where consistent, high-quality summaries are required. Its strong BERTScore of 74.52 suggests good semantic understanding and generation capabilities.
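The BERTScore figure can likewise be checked with the evaluate library, as sketched below. Passing lang="fa" selects the library's default multilingual backbone, which may differ from the scoring model used in the original evaluation, so treat this as an assumption.

```python
# Hedged BERTScore sketch; requires `pip install evaluate bert_score`.
import evaluate

predictions = ["..."]  # model summaries
references = ["..."]   # gold summaries

bertscore = evaluate.load("bertscore")
results = bertscore.compute(predictions=predictions, references=references, lang="fa")
mean_f1 = sum(results["f1"]) / len(results["f1"])
print(round(mean_f1 * 100, 2))  # comparable to the reported 74.52 (percentage scale)
```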