Text-to-Speech Overview

Overview

The GistMag Text-to-Speech API converts text into natural-sounding speech using Google Cloud Text-to-Speech. It also provides speech-to-text transcription through the OpenAI Whisper API. The API supports multiple languages and voices, audio streaming, background music, and high-accuracy audio transcription.

Features

Basic TTS

Convert text to speech with a simple API call

Streaming

Stream audio in real time as it’s generated

Batch Processing

Process long text in batches with pauses

With Music

Generate speech with background music

Speech-to-Text

Transcribe audio files to text

Change Speed

Adjust playback speed of audio files

Add Music

Add background music to existing audio

Voices

Browse and select from Google Cloud voices

Languages

List all supported languages

Supported Languages

The TTS API supports multiple languages, including:

English (en)
Spanish (es)
French (fr)
German (de)
Italian (it)
Portuguese (pt)
Japanese (ja)
Korean (ko)
Chinese (zh)
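
As a small client-side convenience, the codes listed above can be kept in a lookup table so that a typo is caught before a request is sent. This is only a sketch: the helper name `language_name` is hypothetical, and the table covers just the languages shown on this page, not the full set returned by the Languages endpoint, so a missing entry does not prove the API rejects that code.

```python
from typing import Optional

# Language codes listed on this page; the API may support more.
LISTED_LANGUAGES = {
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "ja": "Japanese",
    "ko": "Korean",
    "zh": "Chinese",
}


def language_name(code: str) -> Optional[str]:
    """Return the display name for a listed code, or None if it is not listed here."""
    return LISTED_LANGUAGES.get(code.strip().lower())


if __name__ == "__main__":
    for candidate in ("en", "JA", "xx"):
        print(candidate, "->", language_name(candidate))
```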

Audio Formats

Input: Text (plain string)
Output: WAV (uncompressed) or MP3 (compressed) audio files
Streaming: MP3 format for efficient streaming
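
Because the output may be either WAV or MP3, a client may want to check what it actually received before choosing a file extension. The sketch below inspects the first bytes of the audio: WAV files start with a `RIFF` chunk tagged `WAVE`, and MP3 data starts with either an `ID3` tag or an MPEG frame sync. The function name `detect_audio_format` is hypothetical, and the frame-sync check is a heuristic that can also match other MPEG audio streams.

```python
def detect_audio_format(data: bytes) -> str:
    """Guess 'wav', 'mp3', or 'unknown' from the leading bytes of an audio payload."""
    # WAV: RIFF container whose form type (bytes 8-11) is WAVE.
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # MP3 with an ID3v2 metadata tag at the start.
    if data[:3] == b"ID3":
        return "mp3"
    # MP3 without a tag: frame sync is 11 set bits (0xFF, then top 3 bits set).
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


if __name__ == "__main__":
    sample = b"RIFF\x24\x00\x00\x00WAVEfmt "
    print(detect_audio_format(sample))
```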

Engines

Text-to-Speech

The API uses Google Cloud Text-to-Speech for TTS, which provides:

High-quality, natural-sounding speech
Multiple neural voices per language
Control over speaking rate, pitch, and volume
Fast, scalable generation with Google Cloud infrastructure

Speech-to-Text

The API uses the OpenAI Whisper API for STT, which provides:

High-accuracy transcription
Support for many audio formats (MP3, WAV, M4A, FLAC, OGG, etc.)
Multi-language support with automatic language detection
Robust handling of various accents and audio qualities

Quick Start

curl -X POST https://api.gistmag.co.uk/tts \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, this is a test of the text-to-speech API.",
    "language": "en",
    "api_key": "your_api_key_here"
  }'

The response will be an audio file that you can download or play directly.
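
The same request can be sketched in Python using only the standard library. The endpoint and the `text`, `language`, and `api_key` fields come from the curl example above. The function names, the 30-second timeout, and the output filename are illustrative choices, and the code assumes the response body is the raw audio file.

```python
import json
import urllib.request

TTS_URL = "https://api.gistmag.co.uk/tts"  # endpoint from the curl example


def build_tts_request(text: str, language: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    payload = {"text": text, "language": language, "api_key": api_key}
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, language: str, api_key: str, out_path: str) -> int:
    """Send the request, write the returned bytes to out_path, return the byte count."""
    request = build_tts_request(text, language, api_key)
    # Assumption: the response body is the audio file itself.
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)


if __name__ == "__main__":
    # Choose the file extension to match the format the API returns.
    size = synthesize_to_file(
        "Hello, this is a test of the text-to-speech API.",
        "en",
        "your_api_key_here",
        "speech_output",
    )
    print(f"Wrote {size} bytes")
```

Separating request construction from sending makes the payload easy to inspect or log without making a network call.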
