GET /tts/voices
List TTS Voices
curl --request GET \
  --url https://api.gistmag.co.uk/tts/voices
{
  "voices": [
    {
      "name": "<string>",
      "friendly_name": "<string>",
      "language_codes": [
        "<string>"
      ],
      "ssml_gender": "<string>",
      "natural_sample_rate_hertz": 123,
      "model": "<string>"
    }
  ]
}

Overview

Retrieve the list of available Text-to-Speech voices provided by Google Cloud, so you can pick a specific voice_name when calling the /tts endpoint.

Query Parameters

language
string
Optional language filter (e.g. en, es, fr). When provided, only voices that support this language are returned.

Example Request

curl -X GET "https://api.gistmag.co.uk/tts/voices?language=en"
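
The same request can be made from Python. The following is a minimal sketch using only the standard library; it assumes the endpoint needs no authentication (none is shown on this page), and the names build_voices_url and list_voices are illustrative helpers, not part of the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host taken from the example request on this page.
BASE_URL = "https://api.gistmag.co.uk"


def build_voices_url(language=None):
    """Return the /tts/voices URL, adding ?language=... only when a filter is given."""
    url = f"{BASE_URL}/tts/voices"
    if language:
        url += "?" + urlencode({"language": language})
    return url


def list_voices(language=None, timeout=10):
    """Fetch the voice list and return the array under "voices" (makes a network call)."""
    with urlopen(build_voices_url(language), timeout=timeout) as resp:
        return json.load(resp)["voices"]
```

Calling list_voices("en") would request the URL shown in the curl example; omitting the argument requests the unfiltered list.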

Response

voices
array
Array of voice objects
voices[].name
string
The unique voice identifier (use this value as voice_name in the /tts endpoint). This is the technical voice code (e.g., en-US-Neural2-F).
voices[].friendly_name
string
A human-readable name for the voice (e.g., US English Neural2 (Female)). Use this for display purposes in your UI.
voices[].language_codes
string[]
List of BCP-47 language codes supported by this voice (e.g. ["en-US"])
voices[].ssml_gender
string
Voice gender as defined by Google Cloud (MALE, FEMALE, NEUTRAL)
voices[].natural_sample_rate_hertz
number
Natural sample rate of the voice in Hz
voices[].model
string
The voice model type (e.g., Neural2, Neural, Studio, News, WaveNet, Standard). Neural2 and Neural models provide higher quality, more natural-sounding speech.

Example Response

{
  "voices": [
    {
      "name": "en-US-Neural2-F",
      "friendly_name": "US English Neural2 (Female)",
      "language_codes": ["en-US"],
      "ssml_gender": "FEMALE",
      "natural_sample_rate_hertz": 24000,
      "model": "Neural2"
    },
    {
      "name": "en-US-Wavenet-D",
      "friendly_name": "US English WaveNet (Male)",
      "language_codes": ["en-US"],
      "ssml_gender": "MALE",
      "natural_sample_rate_hertz": 24000,
      "model": "WaveNet"
    }
  ]
}
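
A response like this maps naturally onto a voice picker: show friendly_name to the user and keep name as the value to send to /tts. The sketch below assumes the response has already been parsed into a list of dictionaries; voice_options is a hypothetical helper name.

```python
def voice_options(voices, gender=None):
    """Return (friendly_name, name) pairs, optionally limited to one ssml_gender."""
    return [
        (v["friendly_name"], v["name"])
        for v in voices
        if gender is None or v["ssml_gender"] == gender
    ]


# The two voices from the example response above, trimmed to the fields used here.
sample_voices = [
    {
        "name": "en-US-Neural2-F",
        "friendly_name": "US English Neural2 (Female)",
        "ssml_gender": "FEMALE",
    },
    {
        "name": "en-US-Wavenet-D",
        "friendly_name": "US English WaveNet (Male)",
        "ssml_gender": "MALE",
    },
]

female_only = voice_options(sample_voices, gender="FEMALE")
```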

Understanding Voice Names

Voice names follow a pattern that encodes important information:

Voice Code Format

Voice codes follow the format: {language}-{locale}-{model}-{variant}
  • Language: Language code (e.g., en for English)
  • Locale: Region code (e.g., US, GB, AU for US English, British English, Australian English)
  • Model: Voice model type (e.g., Neural2, Neural, WaveNet, Standard)
  • Variant: A letter that distinguishes individual voices within a model (e.g., B, D, F). The letter does not reliably encode gender (en-US-Wavenet-D is a male voice); read voices[].ssml_gender instead.

Examples

  • en-US-Neural2-F - US English Neural2 (Female): High-quality neural voice with US accent, female gender
  • en-GB-Neural-D - British English Neural (Male): Neural voice with British accent, male gender
  • en-AU-Standard-B - Australian English Standard (Any): Standard quality voice with Australian accent
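
If you need the individual parts of a voice code, splitting on hyphens is usually enough. This is a sketch under the assumption that the code has exactly four hyphen-separated segments, as in the examples above; parse_voice_name is an illustrative name, and codes with a different shape are rejected rather than guessed at.

```python
def parse_voice_name(name):
    """Split a four-part voice code such as "en-US-Neural2-F" into labelled fields."""
    parts = name.split("-")
    if len(parts) != 4:
        raise ValueError(f"unexpected voice code format: {name!r}")
    language, locale, model, suffix = parts
    return {
        "language": language,  # e.g. "en"
        "locale": locale,      # e.g. "US"
        "model": model,        # e.g. "Neural2"
        "suffix": suffix,      # final letter of the code, e.g. "F"
    }


parsed = parse_voice_name("en-GB-Neural-D")
```

For filtering or display, the fields returned by the API (language_codes, model, ssml_gender) are a safer source than the parsed code.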

Voice Properties

Locales indicate the regional accent:
  • US - United States English
  • GB - British English
  • AU - Australian English
  • CA - Canadian English
  • IN - Indian English
  • IE - Irish English
  • NZ - New Zealand English
  • ZA - South African English
Models indicate voice quality:
  • Neural2 - Latest high-quality neural voices (recommended)
  • Neural - High-quality neural voices
  • WaveNet - Advanced WaveNet voices
  • Studio - Professional studio-quality voices
  • News - News broadcaster style voices
  • Standard - Standard quality voices (faster, lower cost)
Note: Some voices (specifically en-IN-Chirp, en-IN-Neural2, en-GB-Neural2, en-GB-Chirp, en-AU-Neural2, and en-AU-Chirp) are not available as they are incompatible with the current API version.
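
One way to act on the model list is to choose a default voice by model preference. The sketch below puts Neural2 first because it is marked as recommended above; the ordering of the remaining models is an assumption made for illustration, as are the names MODEL_PREFERENCE and pick_voice.

```python
# Neural2 first (recommended above); the rest of the order is an illustrative assumption.
MODEL_PREFERENCE = ["Neural2", "Neural", "Studio", "WaveNet", "News", "Standard"]


def pick_voice(voices, preference=MODEL_PREFERENCE):
    """Return the voice whose model ranks highest in `preference`, or None if none match."""
    rank = {model: position for position, model in enumerate(preference)}
    candidates = [v for v in voices if v.get("model") in rank]
    return min(candidates, key=lambda v: rank[v["model"]], default=None)


best = pick_voice([
    {"name": "en-US-Wavenet-D", "model": "WaveNet"},
    {"name": "en-US-Neural2-F", "model": "Neural2"},
])
```

Passing a different preference list, for example one that starts with Standard, lets cost-sensitive callers reuse the same helper.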

Usage with /tts

To use a specific voice, pass its name as the voice_name field in the /tts request body.
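
As a rough sketch, a request carrying the chosen voice could be assembled as follows. Only the voice_name field is documented on this page; the POST method, the JSON content type, and the "text" field name are assumptions, so check the /tts reference before relying on them. build_tts_request is a hypothetical helper.

```python
import json
from urllib.request import Request

# Host taken from the example request on this page.
TTS_URL = "https://api.gistmag.co.uk/tts"


def build_tts_request(text, voice_name):
    """Build (but do not send) a /tts request that selects a specific voice."""
    body = {
        "text": text,              # assumed field name, not confirmed by this page
        "voice_name": voice_name,  # the "name" value returned by /tts/voices
    }
    return Request(
        TTS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",             # assumed HTTP method
    )


request = build_tts_request("Hello there", "en-US-Neural2-F")
```

The request object can then be passed to urllib.request.urlopen to send it.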