Overview
The GistMag Text-to-Speech API converts text into natural-sounding speech using advanced AI models. It supports multiple languages, voice cloning, streaming, and background music.
Features
- Basic TTS: Convert text to speech with a simple API call
- Voice Cloning: Clone any voice from a reference audio sample
- Streaming: Stream audio in real time as it’s generated
- Background Music: Add background music to your generated speech
Supported Languages
The TTS API supports 17+ languages, including:
- English (en)
- Spanish (es)
- French (fr)
- German (de)
- Italian (it)
- Portuguese (pt)
- Polish (pl)
- Turkish (tr)
- Russian (ru)
- Dutch (nl)
- Czech (cs)
- Arabic (ar)
- Chinese (zh-cn)
- Japanese (ja)
- Hungarian (hu)
- Korean (ko)
- Hindi (hi)
View All Languages
Get the complete list of supported languages
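The language codes listed above can be checked on the client side before a request is sent. The following is a minimal sketch, assuming the API accepts the codes exactly as written in this list; the names `SUPPORTED_LANGUAGES` and `normalize_language` are hypothetical helpers for illustration and are not part of the GistMag API, and the complete language list may contain codes not shown here.

```python
# Language codes copied from the list above. The complete list published
# by the API may contain additional entries (assumption: codes are lowercase).
SUPPORTED_LANGUAGES = {
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "pl": "Polish",
    "tr": "Turkish",
    "ru": "Russian",
    "nl": "Dutch",
    "cs": "Czech",
    "ar": "Arabic",
    "zh-cn": "Chinese",
    "ja": "Japanese",
    "hu": "Hungarian",
    "ko": "Korean",
    "hi": "Hindi",
}


def normalize_language(code: str) -> str:
    """Return the code in the form used by the list above.

    Hypothetical helper: trims whitespace, lowercases, and converts
    underscores to hyphens (so "zh_CN" becomes "zh-cn"). Raises
    ValueError for codes that are not in SUPPORTED_LANGUAGES.
    """
    normalized = code.strip().lower().replace("_", "-")
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return normalized
```

Validating locally like this only catches obvious mistakes early; the list returned by the API itself remains the authoritative source.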
Audio Formats
- Input: Text (plain string)
- Output: WAV (uncompressed) or MP3 (compressed) audio files
- Streaming: MP3 format for efficient streaming
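The format rules above can be summarized in a small client-side helper. This is a sketch under stated assumptions: `AudioFormat` and `pick_output_format` are hypothetical names, not part of the GistMag API, and the MIME types are the values commonly used for these formats rather than values confirmed by this documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioFormat:
    name: str
    extension: str
    mime_type: str  # commonly used value, not confirmed by the API docs
    compressed: bool


WAV = AudioFormat("WAV", ".wav", "audio/wav", compressed=False)
MP3 = AudioFormat("MP3", ".mp3", "audio/mpeg", compressed=True)


def pick_output_format(streaming: bool, prefer_compressed: bool = False) -> AudioFormat:
    """Choose an output format following the rules listed above.

    Streaming always uses MP3. For non-streaming output, WAV is returned
    unless the caller asks for a compressed file.
    """
    if streaming:
        return MP3
    return MP3 if prefer_compressed else WAV
```

For example, `pick_output_format(streaming=False)` selects the uncompressed WAV format, which suits further audio editing, while MP3 keeps downloads and streams smaller.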
Model
The API uses XTTS-v2, a state-of-the-art multilingual text-to-speech model that provides:
- High-quality, natural-sounding speech
- Voice cloning capabilities
- Support for multiple languages
- Fast generation times