🎤 ChatterboxTTS - Dhivehi Text-to-Speech with Voice Cloning
Generate natural-sounding Dhivehi speech with voice cloning capabilities.
Apply Dhivehi text normalization before TTS generation
Model: Select the TTS model.
Device: Select the computation device.
Examples
Click any example below to load pre-configured settings:
Preset Configurations
| Text to Convert | Reference Voice Audio (optional - for voice cloning) | Exaggeration | Temperature | CFG Weight | Seed | Device | Enable Text Normalization |
|---|---|---|---|---|---|---|---|
General Use (TTS and Voice Agents):
- The default settings (exaggeration=0.5, cfg=0.5) work well for most prompts.
- If the reference speaker has a fast speaking style, lowering cfg to around 0.3 can improve pacing.
Expressive or Dramatic Speech:
- Try lower cfg values (e.g. ~0.3) and increase exaggeration to around 0.7 or higher.
- Higher exaggeration tends to speed up speech; reducing cfg helps compensate with slower, more deliberate pacing.
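The two sets of tips above can be captured as a small lookup. This is a minimal sketch, not the app's own code: `suggest_settings` and its parameter names are hypothetical, and the numbers are simply the ones quoted in the bullets.

```python
def suggest_settings(expressive: bool = False, fast_speaker: bool = False) -> dict:
    """Return starting values for exaggeration and CFG weight.

    Hypothetical helper: the values mirror the tips above, not the app's code.
    """
    # Defaults that the tips describe as working well for most prompts.
    settings = {"exaggeration": 0.5, "cfg_weight": 0.5}
    if expressive:
        # Expressive or dramatic speech: raise exaggeration to around 0.7.
        settings["exaggeration"] = 0.7
    if expressive or fast_speaker:
        # A lower CFG weight (around 0.3) slows the pacing down.
        settings["cfg_weight"] = 0.3
    return settings


print(suggest_settings(expressive=True))
```

Treat the returned values as starting points and adjust them by ear for a given reference voice.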
Language Transfer Notes:
- Ensure that the reference clip matches the specified language tag. Otherwise, language transfer outputs may inherit the accent of the reference clip's language.
- To mitigate this, set the CFG weight to 0.
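The language-transfer rule above amounts to a single comparison. The sketch below assumes the reference clip's language is known and written as a tag such as `"dv"` or `"en"`; the function name and the `default_cfg` parameter are hypothetical.

```python
def cfg_for_language_transfer(reference_lang: str, target_lang: str,
                              default_cfg: float = 0.5) -> float:
    """Pick a CFG weight based on whether the reference clip matches the target language.

    Hypothetical helper illustrating the tip above.
    """
    same_language = reference_lang.strip().lower() == target_lang.strip().lower()
    # A mismatched reference clip can leak its accent, so drop the CFG weight to 0.
    return default_cfg if same_language else 0.0


print(cfg_for_language_transfer("en", "dv"))
```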
Additional Tips:
- For best voice cloning results, use clear audio with minimal background noise.
- The reference audio should be 3 to 10 seconds long.
- Use the same seed value for reproducible results.
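The length recommendation for reference audio can be checked before uploading a clip. The following is a rough sketch using Python's standard `wave` module, so it only handles uncompressed PCM WAV files; the function names are illustrative and are not part of the app.

```python
import wave


def reference_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_good_reference_length(path: str, min_s: float = 3.0, max_s: float = 10.0) -> bool:
    """Check a clip against the 3 to 10 second recommendation above."""
    return min_s <= reference_duration_seconds(path) <= max_s
```

A clip outside the range can still be tried; the check only flags clips that fall outside the recommended window.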