Voice & Language

Summary: Configure Text-to-Speech voices, languages, and pronunciation to create natural-sounding agents.

Voice Configuration Overview

Language Configuration

| Parameter | Description | Example |
|---|---|---|
| name | Human-readable name | "English" |
| code | Language code for STT | "en-US" |
| voice | TTS voice identifier | "rime.spore" or "elevenlabs.josh:eleven_turbo_v2_5" |

Fillers (Natural Speech)

| Parameter | Description | Example |
|---|---|---|
| speech_fillers | Used during natural conversation pauses | ["Um", "Well", "So"] |
| function_fillers | Used while executing a function | ["Let me check...", "One moment..."] |

Adding a Language

Basic Configuration

from signalwire_agents import AgentBase


class MyAgent(AgentBase):
    def __init__(self):
        super().__init__(name="my-agent")

        # Basic language setup
        self.add_language(
            name="English",     # Display name
            code="en-US",       # Language code for STT
            voice="rime.spore"  # TTS voice
        )

Voice Format

The voice parameter uses the format engine.voice:model where model is optional:

# Simple voice (engine.voice)
self.add_language("English", "en-US", "rime.spore")

# With model (engine.voice:model)
self.add_language("English", "en-US", "elevenlabs.josh:eleven_turbo_v2_5")
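
To make the format concrete, here is a minimal sketch of how such a string could be split into its parts. The parse_voice helper and the VoiceSpec type are hypothetical and not part of the SDK. The sketch assumes the engine code never contains a dot and that the model, when present, follows the first colon after the voice.

```python
from typing import NamedTuple, Optional


class VoiceSpec(NamedTuple):
    engine: str
    voice: str
    model: Optional[str]


def parse_voice(value: str) -> VoiceSpec:
    """Split 'engine.voice' or 'engine.voice:model' into parts (illustrative only)."""
    # Everything before the first dot is treated as the engine code.
    engine, dot, rest = value.partition(".")
    if not dot or not engine or not rest:
        raise ValueError(f"expected 'engine.voice[:model]', got {value!r}")
    # An optional model follows the first colon in the remainder.
    voice, colon, model = rest.partition(":")
    if not voice or (colon and not model):
        raise ValueError(f"expected 'engine.voice[:model]', got {value!r}")
    return VoiceSpec(engine, voice, model or None)
```

A check like this can catch typos such as a missing engine prefix before the string ever reaches add_language.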

Available TTS Engines

| Provider | Engine Code | Example Voice | Reference |
|---|---|---|---|
| Amazon Polly | amazon | amazon.Joanna-Neural | Voice IDs |
| Cartesia | cartesia | cartesia.a167e0f3-df7e-4d52-a9c3-f949145efdab | Voice IDs |
| Deepgram | deepgram | deepgram.aura-asteria-en | Voice IDs |
| ElevenLabs | elevenlabs | elevenlabs.thomas | Voice IDs |
| Google Cloud | gcloud | gcloud.en-US-Casual-K | Voice IDs |
| Microsoft Azure | azure | azure.en-US-AvaNeural | Voice IDs |
| OpenAI | openai | openai.alloy | Voice IDs |
| Rime | rime | rime.luna:arcana | Voice IDs |
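
If voice strings are kept in configuration, a quick check against the engine codes listed above can flag mistakes early. This is a sketch only: KNOWN_ENGINES is copied from this table and unknown_engines is a hypothetical helper, so an engine missing from the set is not necessarily unsupported by the platform.

```python
# Engine codes copied from the "Available TTS Engines" table.
KNOWN_ENGINES = {
    "amazon", "cartesia", "deepgram", "elevenlabs",
    "gcloud", "azure", "openai", "rime",
}


def unknown_engines(voices):
    """Return the voice strings whose engine prefix is not in KNOWN_ENGINES."""
    # The engine code is whatever precedes the first dot.
    return [v for v in voices if v.partition(".")[0] not in KNOWN_ENGINES]
```

For example, a voice written as "polly.Joanna" would be reported, since the engine code for Amazon Polly in the table is amazon.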

Filler Phrases

Add natural pauses and filler words:

self.add_language(
    name="English",
    code="en-US",
    voice="rime.spore",
    speech_fillers=[
        "Um",
        "Well",
        "Let me think",
        "So"
    ],
    function_fillers=[
        "Let me check that for you",
        "One moment please",
        "I'm looking that up now",
        "Bear with me"
    ]
)

Speech fillers: Used during natural conversation pauses

Function fillers: Used while the AI is executing a function
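
When filler phrases come from a config file or other external input, it can help to normalize them before passing them to add_language. The sketch below uses a hypothetical clean_fillers helper (not part of the SDK) that trims whitespace and drops empty or duplicate phrases while keeping the original order.

```python
def clean_fillers(phrases):
    """Trim phrases and drop empties and case-insensitive duplicates, keeping order."""
    seen = set()
    cleaned = []
    for phrase in phrases:
        text = phrase.strip()
        key = text.casefold()  # compare without regard to case
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned
```

The result can be passed directly, for example speech_fillers=clean_fillers(raw_phrases), where raw_phrases is whatever list your configuration provides.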

Multi-Language Support

Use code="multi" for automatic language detection and matching:

from signalwire_agents import AgentBase


class MultilingualAgent(AgentBase):
    def __init__(self):
        super().__init__(name="multilingual-agent")

        # Multi-language support (auto-detects and matches caller's language)
        self.add_language(
            name="Multilingual",
            code="multi",
            voice="rime.spore"
        )

        self.prompt_add_section(
            "Language",
            "Automatically detect and match the caller's language without "
            "prompting or asking them to verify. Respond naturally in whatever "
            "language they speak."
        )

The multi code supports: English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch.

Note: Speech recognition hints do not work when using code="multi". If you need hints for specific terms, use individual language codes instead.
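
The two constraints above (the supported language list and the hints limitation) can be folded into a small decision helper. This is a sketch under stated assumptions: can_use_multi is a hypothetical function, and MULTI_LANGUAGES simply mirrors the list given in this section.

```python
# Languages this section lists as covered by code="multi".
MULTI_LANGUAGES = {
    "English", "Spanish", "French", "German", "Hindi",
    "Russian", "Portuguese", "Japanese", "Italian", "Dutch",
}


def can_use_multi(languages, needs_hints):
    """True when every language is covered by 'multi' and no recognition hints are needed."""
    return not needs_hints and set(languages) <= MULTI_LANGUAGES
```

When the helper returns False, configure each language with its own code instead.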

For more control over individual languages with custom fillers:

from signalwire_agents import AgentBase


class CustomMultilingualAgent(AgentBase):
    def __init__(self):
        super().__init__(name="custom-multilingual")

        # English (primary)
        self.add_language(
            name="English",
            code="en-US",
            voice="rime.spore",
            speech_fillers=["Um", "Well", "So"],
            function_fillers=["Let me check that"]
        )

        # Spanish
        self.add_language(
            name="Spanish",
            code="es-MX",
            voice="rime.luna",
            speech_fillers=["Eh", "Pues", "Bueno"],
            function_fillers=["Déjame verificar", "Un momento"]
        )

        # French
        self.add_language(
            name="French",
            code="fr-FR",
            voice="rime.claire",
            speech_fillers=["Euh", "Alors", "Bon"],
            function_fillers=["Laissez-moi vérifier", "Un instant"]
        )

        self.prompt_add_section(
            "Language",
            "Automatically detect and match the caller's language without "
            "prompting or asking them to verify."
        )

Pronunciation Rules

Fix pronunciation of specific words:

from signalwire_agents import AgentBase


class AgentWithPronunciation(AgentBase):
    def __init__(self):
        super().__init__(name="pronunciation-agent")
        self.add_language("English", "en-US", "rime.spore")

        # Fix brand names
        self.add_pronunciation(
            replace="ACME",
            with_text="Ack-me"
        )

        # Fix technical terms
        self.add_pronunciation(
            replace="SQL",
            with_text="sequel"
        )

        # Case-insensitive matching
        self.add_pronunciation(
            replace="api",
            with_text="A P I",
            ignore_case=True
        )

        # Fix names
        self.add_pronunciation(
            replace="Nguyen",
            with_text="win"
        )

Set Multiple Pronunciations

# Set all pronunciations at once
self.set_pronunciations([
    {"replace": "ACME", "with": "Ack-me"},
    {"replace": "SQL", "with": "sequel"},
    {"replace": "API", "with": "A P I", "ignore_case": True},
    {"replace": "CEO", "with": "C E O"},
    {"replace": "ASAP", "with": "A sap"}
])
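
To review a rule list before deploying it, the rules can be applied to sample text locally. The sketch below is an approximation for review only: preview_pronunciations is a hypothetical helper, and it assumes plain substring matching, which may differ from how the platform actually applies the rules.

```python
import re


def preview_pronunciations(text, rules):
    """Apply replace/with rules to text locally (an approximation for review only)."""
    for rule in rules:
        flags = re.IGNORECASE if rule.get("ignore_case") else 0
        pattern = re.compile(re.escape(rule["replace"]), flags)
        # A function replacement keeps any backslashes in "with" literal.
        text = pattern.sub(lambda _m, repl=rule["with"]: repl, text)
    return text
```

Running sample sentences through it makes overlapping or overly broad rules easier to spot, for instance a short pattern that also matches inside longer words.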

Voice Selection Guide

Choosing the right TTS engine and voice significantly impacts caller experience. Consider these factors:

Use Case Recommendations

| Use Case | Recommended Voice Style |
|---|---|
| Customer Service | Warm, friendly (rime.spore) |
| Technical Support | Clear, professional (rime.marsh) |
| Sales | Energetic, persuasive (elevenlabs voices) |
| Healthcare | Calm, reassuring |
| Legal/Finance | Formal, authoritative |

TTS Engine Comparison

| Engine | Latency | Quality | Cost | Best For |
|---|---|---|---|---|
| Rime | Very fast | Good | Low | Production, low-latency needs |
| ElevenLabs | Medium | Excellent | Higher | Premium experiences, emotion |
| Google Cloud | Medium | Very good | Medium | Multilingual, SSML features |
| Amazon Polly | Fast | Good | Low | AWS integration, Neural voices |
| OpenAI | Medium | Excellent | Medium | Natural conversation style |
| Azure | Medium | Very good | Medium | Microsoft ecosystem |
| Deepgram | Fast | Good | Medium | Speech-focused applications |
| Cartesia | Fast | Good | Medium | Specialized voices |

Choosing an Engine

Prioritize latency (Rime, Polly, Deepgram):

  • Interactive conversations where quick response matters
  • High-volume production systems
  • Cost-sensitive deployments

Prioritize quality (ElevenLabs, OpenAI):

  • Premium customer experiences
  • Brand-sensitive applications
  • When voice quality directly impacts business outcomes

Prioritize features (Google Cloud, Azure):

  • Need SSML for fine-grained control
  • Complex multilingual requirements
  • Specific enterprise integrations
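
The groupings above can be captured as a small lookup keyed by priority, which is handy when the choice is driven by configuration. This is a sketch: shortlist_engines is a hypothetical helper, and the values are the engine codes from the engine table (amazon is the code for Amazon Polly).

```python
# Engine codes grouped by the priority they serve, per the guidance above.
ENGINES_BY_PRIORITY = {
    "latency": ["rime", "amazon", "deepgram"],
    "quality": ["elevenlabs", "openai"],
    "features": ["gcloud", "azure"],
}


def shortlist_engines(priority):
    """Return candidate engine codes for a priority, or raise on an unknown one."""
    try:
        return list(ENGINES_BY_PRIORITY[priority])
    except KeyError:
        valid = ", ".join(sorted(ENGINES_BY_PRIORITY))
        raise ValueError(f"unknown priority {priority!r}; expected one of: {valid}")
```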

Testing and Evaluation Process

Before selecting a voice for production:

  1. Create test content with domain-specific terms, company names, and typical phrases
  2. Test multiple candidates from your shortlisted engines
  3. Evaluate each voice:
    • Pronunciation accuracy (especially brand names)
    • Natural pacing and rhythm
    • Emotional appropriateness
    • Handling of numbers, dates, prices
  4. Test with real users if possible, such as internal team members or beta callers
  5. Measure latency in your deployment environment
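
For the latency step, a small timing harness is usually enough. The sketch below assumes you supply synthesize, a hypothetical callable that takes a phrase and blocks until audio is produced by whatever means you use to call your TTS engine; nothing here is part of the SDK.

```python
import statistics
import time


def measure_latency(synthesize, phrases, runs=3):
    """Return the median wall-clock seconds per phrase for a synthesize(text) callable."""
    results = {}
    for phrase in phrases:
        samples = []
        for _ in range(runs):
            start = time.perf_counter()
            synthesize(phrase)
            samples.append(time.perf_counter() - start)
        # The median is less sensitive to a single slow run than the mean.
        results[phrase] = statistics.median(samples)
    return results
```

Run it from the same network location as your deployment, with the test content from step 1, so the numbers reflect what callers will experience.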

Voice Personality Considerations

Match voice to brand:

  • Formal brands → authoritative, measured voices
  • Friendly brands → warm, conversational voices
  • Tech brands → clear, modern-sounding voices

Consider your audience:

  • Older demographics may prefer clearer, slower voices
  • Technical audiences tolerate more complex terminology
  • Regional preferences may favor certain accents

Test edge cases:

  • Long monologues (product descriptions)
  • Lists and numbers (order details, account numbers)
  • Emotional content (apologies, celebrations)

Dynamic Voice Selection

Change voice based on context:

from signalwire_agents import AgentBase


class DynamicVoiceAgent(AgentBase):
    DEPARTMENT_VOICES = {
        "support": {"voice": "rime.spore", "name": "Alex"},
        "sales": {"voice": "rime.marsh", "name": "Jordan"},
        "billing": {"voice": "rime.coral", "name": "Morgan"}
    }

    def __init__(self):
        super().__init__(name="dynamic-voice")

    def on_swml_request(self, request_data=None, callback_path=None, request=None):
        # Determine department from called number
        call_data = (request_data or {}).get("call", {})
        called_num = call_data.get("to", "")

        if "555-1000" in called_num:
            dept = "support"
        elif "555-2000" in called_num:
            dept = "sales"
        else:
            dept = "billing"

        config = self.DEPARTMENT_VOICES[dept]

        self.add_language("English", "en-US", config["voice"])

        self.prompt_add_section(
            "Role",
            f"You are {config['name']}, a {dept} representative."
        )
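
The substring check on the called number only matches when the number contains a hyphen in exactly that position. If numbers may arrive in another form (for example with no punctuation), one option is to compare digits only. This is a sketch: department_for and DEPARTMENT_SUFFIXES are hypothetical names, and the suffixes are the example numbers from the class above.

```python
# Digit-only suffixes for the example numbers, mapped to departments.
DEPARTMENT_SUFFIXES = {
    "5551000": "support",
    "5552000": "sales",
}


def department_for(called_number, default="billing"):
    """Map a called number to a department by comparing its trailing digits."""
    # Strip "+", spaces, hyphens, and parentheses so formatting does not matter.
    digits = "".join(ch for ch in called_number if ch.isdigit())
    for suffix, dept in DEPARTMENT_SUFFIXES.items():
        if digits.endswith(suffix):
            return dept
    return default
```

Keeping the routing in a plain function also makes it easy to unit test without constructing an agent.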

Language Codes Reference

Supported language codes:

| Language | Codes |
|---|---|
| Multilingual | multi (English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, Dutch) |
| Bulgarian | bg |
| Czech | cs |
| Danish | da, da-DK |
| Dutch | nl |
| English | en, en-US, en-AU, en-GB, en-IN, en-NZ |
| Finnish | fi |
| French | fr, fr-CA |
| German | de |
| Hindi | hi |
| Hungarian | hu |
| Indonesian | id |
| Italian | it |
| Japanese | ja |
| Korean | ko, ko-KR |
| Norwegian | no |
| Polish | pl |
| Portuguese | pt, pt-BR, pt-PT |
| Russian | ru |
| Spanish | es, es-419 |
| Swedish | sv, sv-SE |
| Turkish | tr |
| Ukrainian | uk |
| Vietnamese | vi |
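
A configured code can be compared against this table before deployment. The sketch below is advisory only: check_code is a hypothetical helper, SUPPORTED_CODES is copied from the table, and a code missing from the table (such as a regional variant) may still be accepted by the platform.

```python
# Codes copied from the "Language Codes Reference" table.
SUPPORTED_CODES = {
    "multi", "bg", "cs", "da", "da-DK", "nl",
    "en", "en-US", "en-AU", "en-GB", "en-IN", "en-NZ",
    "fi", "fr", "fr-CA", "de", "hi", "hu", "id", "it", "ja",
    "ko", "ko-KR", "no", "pl", "pt", "pt-BR", "pt-PT",
    "ru", "es", "es-419", "sv", "sv-SE", "tr", "uk", "vi",
}


def check_code(code):
    """Return 'listed', 'base-only', or 'unknown' for a language code."""
    if code in SUPPORTED_CODES:
        return "listed"
    # The regional variant is not listed, but its base language is.
    if code.split("-")[0] in SUPPORTED_CODES:
        return "base-only"
    return "unknown"
```

A "base-only" result is a prompt to verify the regional code with the platform rather than an error.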

Complete Voice Configuration Example

from signalwire_agents import AgentBase


class FullyConfiguredVoiceAgent(AgentBase):
    def __init__(self):
        super().__init__(name="voice-configured")

        # Primary language with all options
        self.add_language(
            name="English",
            code="en-US",
            voice="rime.spore",
            speech_fillers=[
                "Um",
                "Well",
                "Let me see",
                "So"
            ],
            function_fillers=[
                "Let me look that up for you",
                "One moment while I check",
                "I'm searching for that now",
                "Just a second"
            ]
        )

        # Secondary language
        self.add_language(
            name="Spanish",
            code="es-MX",
            voice="rime.luna",
            speech_fillers=["Pues", "Bueno"],
            function_fillers=["Un momento", "Déjame ver"]
        )

        # Pronunciation fixes
        self.set_pronunciations([
            {"replace": "ACME", "with": "Ack-me"},
            {"replace": "www", "with": "dub dub dub"},
            {"replace": ".com", "with": "dot com"},
            {"replace": "@", "with": "at"}
        ])

        self.prompt_add_section(
            "Role",
            "You are a friendly customer service agent."
        )