Text-to-Speech (TTS)
Dynamically generate speech during a call using plain text or SSML.
Lifelike Voices
Convert text to natural-sounding speech powered by advanced AI models.
Multi-Language
Support for over 30 languages and dialects, including local African accents.
SSML Support
Fine-tune pronunciation, pauses, and emphasis using SSML tags.
Using TTS in Call Instructions
When a call connects to your actionUrl, you return a JSON array of actions. To speak text, use the Say action.
Basic Text-to-Speech
JSON
[
  {
    "action": "Say",
    "text": "Hello, thank you for calling Sendexa support.",
    "voice": "en-US-Standard-C",
    "language": "en-US"
  }
]
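The Say action above can be served from any web endpoint that returns JSON. The following is a minimal sketch using only the Python standard library, not an official Sendexa SDK. The names `build_say_action`, `render_actions`, and `ActionHandler` are hypothetical, and the HTTP method used to call your actionUrl is an assumption, so the handler answers both GET and POST.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_say_action(text, voice="en-US-Standard-C", language="en-US"):
    """Return one Say action using the field names from the example above."""
    return {"action": "Say", "text": text, "voice": voice, "language": language}


def render_actions(actions):
    """Serialize a list of actions to the JSON array returned to the caller."""
    return json.dumps(actions).encode("utf-8")


class ActionHandler(BaseHTTPRequestHandler):
    """Hypothetical actionUrl endpoint; the request method is an assumption."""

    def _respond(self):
        body = render_actions(
            [build_say_action("Hello, thank you for calling Sendexa support.")]
        )
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    # Answer either method, since the one used for actionUrl is not stated here.
    do_GET = _respond
    do_POST = _respond


def serve(port=8080):
    """Start the sketch server; the port is an arbitrary choice."""
    HTTPServer(("0.0.0.0", port), ActionHandler).serve_forever()
```

Calling `serve()` starts the endpoint locally. Building the action as a dictionary and serializing it with `json.dumps` avoids hand-escaping quotes in the spoken text.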
Using SSML for Control
Wrap your text in <speak> tags to use Speech Synthesis Markup Language (SSML).
JSON
[
  {
    "action": "Say",
    "text": "<speak>Welcome! <break time=\"1s\"/> Please enter your PIN.</speak>",
    "voice": "en-GB-Standard-A"
  }
]
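SSML is XML, so characters such as & and < in dynamic text need escaping before they are placed inside <speak> tags, and the attribute quotes need escaping again when the string is embedded in JSON. The sketch below shows one way to handle both steps in Python. The helper name `ssml_say` and its list-based input format are hypothetical conveniences, not part of the Sendexa API.

```python
import json
from xml.sax.saxutils import escape


def ssml_say(parts, voice="en-GB-Standard-A"):
    """Build a Say action whose text is an SSML document.

    Each item in `parts` is either a string (spoken text, XML-escaped here)
    or a number (a pause in seconds, rendered as a <break/> tag).
    """
    chunks = []
    for part in parts:
        if isinstance(part, (int, float)):
            chunks.append(f'<break time="{part:g}s"/>')
        else:
            chunks.append(escape(part))
    text = "<speak>" + " ".join(chunks) + "</speak>"
    return {"action": "Say", "text": text, "voice": voice}


action = ssml_say(["Welcome!", 1, "Please enter your PIN."])
# json.dumps escapes the double quotes inside the SSML attribute.
payload = json.dumps([action])
```

Here `action["text"]` matches the SSML string in the example above, and `payload` is the JSON array to return from your actionUrl.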
Voice Customization
You can pass "voice": "alice", "voice": "man", or specific provider IDs like "en-US-Journey-F" to control the exact sound.