Text-to-Speech (TTS)

Dynamically generate speech during a call using plain text or SSML.

Lifelike Voices

Convert text to natural-sounding speech powered by advanced AI models.

Multi-Language

Support for over 30 languages and dialects, including local African accents.

SSML Support

Fine-tune pronunciation, pauses, and emphasis using SSML tags.

Using TTS in Call Instructions

When a call connects, your actionUrl must respond with a JSON array of actions. To speak text to the caller, include a Say action in that array.

Basic Text-to-Speech

JSON

[
  {
    "action": "Say",
    "text": "Hello, thank you for calling Sendexa support.",
    "voice": "en-US-Standard-C",
    "language": "en-US"
  }
]
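The response above can also be generated from code. The following is a minimal sketch, not an official Sendexa SDK: it uses only the Python standard library, and it assumes the platform sends a POST request to your actionUrl and accepts a plain application/json response. The names say_action, call_instructions, ActionHandler, and serve are hypothetical helpers introduced for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def say_action(text, voice=None, language=None):
    """Build one Say action; optional fields are omitted when not given."""
    action = {"action": "Say", "text": text}
    if voice is not None:
        action["voice"] = voice
    if language is not None:
        action["language"] = language
    return action


def call_instructions():
    """Return the JSON array of actions as a string."""
    actions = [
        say_action(
            "Hello, thank you for calling Sendexa support.",
            voice="en-US-Standard-C",
            language="en-US",
        )
    ]
    return json.dumps(actions)


class ActionHandler(BaseHTTPRequestHandler):
    # Assumption: the actionUrl is called with POST. Add do_GET if GET is used.
    def do_POST(self):
        # Read and discard the request body so the connection closes cleanly.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)

        body = call_instructions().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the handler locally; the port number is arbitrary."""
    HTTPServer(("", port), ActionHandler).serve_forever()
```

Calling serve() starts a local server that answers every POST with the Say action. Building the payload with json.dumps rather than by hand avoids quoting mistakes once the text comes from user data.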

Using SSML for Control

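Wrap your text in <speak> tags to use Speech Synthesis Markup Language (SSML). Because the SSML sits inside a JSON string, escape double quotes in attribute values with a backslash, as in the break tag below.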

JSON

[
  {
    "action": "Say",
    "text": "<speak>Welcome! <break time=\"1s\"/> Please enter your PIN.</speak>",
    "voice": "en-GB-Standard-A"
  }
]
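Hand-escaping SSML inside JSON is easy to get wrong, so it can help to build the string in code. The sketch below is one possible approach under stated assumptions: ssml_with_pause and say_ssml are hypothetical helper names, and the pause value is treated as trusted input. It uses xml.sax.saxutils.escape so that characters such as & and < in the spoken text do not break the markup, and json.dumps to escape the attribute quotes.

```python
import json
from xml.sax.saxutils import escape


def ssml_with_pause(before, after, pause="1s"):
    """Wrap two text fragments in <speak>, separated by a <break> pause."""
    # escape() replaces &, < and > so the spoken text cannot break the markup.
    # The pause value is inserted as-is and is assumed to be trusted.
    return f'<speak>{escape(before)} <break time="{pause}"/> {escape(after)}</speak>'


def say_ssml(ssml, voice):
    """Build a Say action whose text field carries SSML."""
    return {"action": "Say", "text": ssml, "voice": voice}


ssml = ssml_with_pause("Welcome!", "Please enter your PIN.")
# json.dumps escapes the double quotes around 1s automatically.
payload = json.dumps([say_ssml(ssml, "en-GB-Standard-A")])
print(payload)
```

The printed payload carries the same action as the JSON example above, with the quotes in the break tag escaped by the serializer rather than by hand.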

Voice Customization

Set the "voice" field to a generic voice name such as "alice" or "man", or to a provider-specific voice ID such as "en-US-Journey-F", to control exactly how the speech sounds.
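One way to manage voices across languages is a small lookup table with a generic fallback. This is only a sketch: the table, pick_voice, and say are hypothetical, the provider IDs are the ones used in the examples on this page, and it assumes the generic "alice" voice is acceptable for languages that have no mapped ID, which you should confirm for your account.

```python
# Hypothetical lookup table; the IDs are the ones used in this page's examples.
VOICE_BY_LANGUAGE = {
    "en-US": "en-US-Standard-C",
    "en-GB": "en-GB-Standard-A",
}

# Generic voice name used when no provider ID is mapped (an assumption).
DEFAULT_VOICE = "alice"


def pick_voice(language):
    """Return the mapped provider voice ID, or the generic fallback."""
    return VOICE_BY_LANGUAGE.get(language, DEFAULT_VOICE)


def say(text, language):
    """Build a Say action with a voice chosen for the given language."""
    return {
        "action": "Say",
        "text": text,
        "voice": pick_voice(language),
        "language": language,
    }
```

Keeping the mapping in one place means a voice can be swapped without touching every call flow that uses it.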