dolphin-base

Maintained By
DataoceanAI


Property        Value
Model Size      140M parameters
License         Apache 2.0
Author          DataoceanAI
Architecture    Joint CTC-Attention with E-Branchformer
Training Data   210,000+ hours

What is dolphin-base?

Dolphin-base is a multilingual automatic speech recognition (ASR) model developed by DataoceanAI in collaboration with Tsinghua University. It targets Eastern languages, covering 40 languages across East Asia, South Asia, Southeast Asia, and the Middle East, plus 22 Chinese dialects.

Implementation Details

The model uses a joint CTC-Attention architecture with an E-Branchformer encoder and a standard Transformer decoder. It also uses a two-level language token system: a language token and a separate region token (e.g., <zh> for the language, <CN> for the region) together capture linguistic and regional variation.
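As a rough illustration of what "joint CTC-Attention" commonly means in hybrid ASR systems, the sketch below interpolates a CTC log-probability and an attention-decoder log-probability for each candidate transcript and ranks the candidates by the combined score. The function names, the weight `ctc_weight=0.3`, and the toy scores are assumptions made for illustration; the source does not say how Dolphin weights or combines the two branches.

```python
def joint_score(ctc_logp: float, att_logp: float, ctc_weight: float = 0.3) -> float:
    """Interpolate CTC and attention log-probabilities for one hypothesis.

    ctc_weight is a hypothetical value, not a documented Dolphin setting.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must be in [0, 1]")
    return ctc_weight * ctc_logp + (1.0 - ctc_weight) * att_logp


def rescore(hypotheses, ctc_weight: float = 0.3):
    """Return the hypotheses sorted by joint score, best first."""
    return sorted(
        hypotheses,
        key=lambda h: joint_score(h["ctc_logp"], h["att_logp"], ctc_weight),
        reverse=True,
    )


# Toy candidates with made-up log-probabilities from each branch.
toy = [
    {"text": "hypothesis A", "ctc_logp": -4.0, "att_logp": -2.0},
    {"text": "hypothesis B", "ctc_logp": -1.0, "att_logp": -3.0},
]

best = rescore(toy)[0]
print(best["text"])
```

With these toy numbers the attention branch alone would prefer hypothesis A, while the interpolated score prefers hypothesis B, which is the kind of disagreement the joint score is meant to arbitrate.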

  • Base model: 140M parameters, 33.3% average word error rate (WER)
  • Trained on 210,000+ hours of proprietary and open-source data
  • Implements voice activity detection, segmentation, and language identification
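For readers unfamiliar with the WER figure quoted above, here is a minimal sketch of how word error rate is typically computed: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The function name and the whitespace tokenization are assumptions; the source does not describe the evaluation setup, and languages written without spaces are often scored at the character level instead.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("a b c d", "a x c d"))
```

Because insertions count as errors, the ratio can exceed 1.0 when the hypothesis is much longer than the reference.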

Core Capabilities

  • Multilingual ASR across 40 Eastern languages
  • Support for 22 Chinese dialects
  • Voice activity detection (VAD)
  • Audio segmentation
  • Language identification (LID)
  • Regional accent handling through two-level token system
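The two-level token idea can be sketched as a small helper that builds and parses a language-plus-region prefix. Only the `<zh>` and `<CN>` tokens come from the source; the helper names, the assumption that the two tokens sit side by side at the start of a sequence, and the lowercase/uppercase convention inferred from that single example are all hypothetical.

```python
import re

# Assumed layout: "<language><REGION>" followed by the transcript text.
PREFIX_RE = re.compile(r"^<([a-z]+)><([A-Z]+)>")


def make_prefix(language: str, region: str) -> str:
    """Build a two-level prefix such as "<zh><CN>"."""
    return f"<{language.lower()}><{region.upper()}>"


def parse_prefix(text: str):
    """Split a prefixed string into (language, region, remainder)."""
    match = PREFIX_RE.match(text)
    if match is None:
        raise ValueError("text does not start with <language><REGION> tokens")
    return match.group(1), match.group(2), text[match.end():]


prefix = make_prefix("zh", "CN")
print(prefix)
print(parse_prefix(prefix + "example transcript"))
```

Keeping language and region as separate tokens means one language token can be paired with different region tokens, rather than needing a distinct combined token for every language and region pair.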

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its coverage of 40 Eastern languages and 22 Chinese dialects, combined with its two-level language token system. This combination is intended to handle regional variation and accents within Asian languages.

Q: What are the recommended use cases?

The model is well suited to applications that need speech recognition for Eastern languages, particularly in multilingual environments. Examples include voice transcription services, language learning platforms, and applications that need language identification or voice activity detection for Asian languages.
