F5-TTS-Vietnamese-100h

Maintained by hynt

Property        Value
License         CC-BY-NC-SA-4.0
Author          hynt
Training Data   150 hours Vietnamese speech
Base Model      F5-TTS_Base
Model URL       https://huggingface.co/hynt/F5-TTS-Vietnamese-100h

What is F5-TTS-Vietnamese-100h?

F5-TTS-Vietnamese-100h is a text-to-speech model fine-tuned for Vietnamese speech synthesis. Built on the F5-TTS base architecture, it was trained on a 150-hour dataset drawn from the VLSP collections (2021-2023), vietTTS, TeacherDinh-UEH, and curated YouTube content.

Implementation Details

The model was trained on a single RTX 3090 GPU with a batch size of 3200 frames for 390,000 training steps. Before training, the data was preprocessed in three steps: background music was removed with Facebook's demucs model, clips shorter than 1 second or longer than 30 seconds were filtered out, and the transcripts were normalized.
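The length filter and text normalization described above can be sketched as follows. This is a minimal illustration, not the author's pipeline: the `Clip` record, the inclusive 1-30 second bounds, and the normalization rules (Unicode NFC, lowercasing, whitespace collapsing) are all assumptions, since the source does not say how either step was implemented.

```python
import re
import unicodedata
from dataclasses import dataclass

MIN_SECONDS = 1.0   # lower bound stated in the text
MAX_SECONDS = 30.0  # upper bound stated in the text


@dataclass
class Clip:
    """Hypothetical record for one training utterance."""
    path: str
    duration_s: float
    text: str


def normalize_text(text: str) -> str:
    """Assumed normalization: NFC, lowercase, collapsed whitespace."""
    # NFC merges base letters and combining marks into single code points,
    # so visually identical Vietnamese syllables compare equal.
    text = unicodedata.normalize("NFC", text)
    text = text.lower()
    return re.sub(r"\s+", " ", text).strip()


def filter_clips(clips: list[Clip]) -> list[Clip]:
    """Keep clips within the duration bounds that still have text."""
    kept = []
    for clip in clips:
        text = normalize_text(clip.text)
        # Bounds are treated as inclusive here (an assumption).
        if MIN_SECONDS <= clip.duration_s <= MAX_SECONDS and text:
            kept.append(Clip(clip.path, clip.duration_s, text))
    return kept


if __name__ == "__main__":
    sample = [
        Clip("a.wav", 0.4, "Xin chào"),        # too short
        Clip("b.wav", 5.2, "  Xin   chào  "),  # kept, text cleaned up
        Clip("c.wav", 42.0, "Quá dài"),        # too long
    ]
    print([c.path for c in filter_clips(sample)])
```

Durations are passed in rather than read from audio files so the sketch stays dependency-free; in practice they would come from whatever audio library the pipeline already uses.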

  • Comprehensive data cleaning and preprocessing pipeline
  • Advanced audio background removal techniques
  • Optimized for production-quality speech synthesis
  • Institutional access only for research purposes

Core Capabilities

  • High-quality Vietnamese speech synthesis
  • Support for various text inputs with punctuation
  • Adjustable speech speed control
  • Integration with multiple vocoder options

Frequently Asked Questions

Q: What makes this model unique?

The model is tuned specifically for Vietnamese and trained on curated speech from several sources: the VLSP collections, vietTTS, TeacherDinh-UEH, and YouTube. The preprocessing pipeline (background music removal, length filtering, and text normalization) is intended to keep noisy or mismatched samples out of training and so improve output quality.

Q: What are the recommended use cases?

The model is intended for research use in academic or institutional settings, in line with its non-commercial CC-BY-NC-SA-4.0 license. Suitable uses include Vietnamese TTS research, speech synthesis experiments, and academic studies in computational linguistics.
