Voice AI

Custom Word Pronunciation for AI Voice (Brand Names)
Add support for a custom word pronunciation dictionary that allows users to define how specific words (such as brand names) should be pronounced by the AI voice agent, similar to the functionality offered by HeyGen and ElevenLabs.

Problem Statement:
Currently, voice agents rely on default Text-to-Speech (TTS) pronunciation rules. This causes incorrect pronunciation of brand names, proprietary terms, and industry-specific words. For example, our brand name Soccerfy must be pronounced “Soc-ker-fye”, but TTS engines often pronounce it incorrectly (e.g., “Soc-ker-fee”). This leads to:
- Brand inconsistency
- Reduced professionalism
- Confusion for callers
- Poor customer experience

There is currently no reliable way to enforce consistent pronunciation across all voice interactions.

Proposed Solution:
Introduce a Custom Pronunciation Dictionary that allows users to:
- Define a custom word or phrase
- Specify the correct pronunciation using:
  - Phonetic spelling (e.g., “Soc-ker-fye”)
  - IPA or SSML phoneme notation (optional)
- Apply the pronunciation globally across:
  - Voice agents
  - Call scripts
  - Dynamic AI responses

This dictionary would override the default TTS behavior whenever the custom word appears.

Example Use Case:
Custom Word: Soccerfy
Display Text: Soccerfy
Pronunciation: soc-ker-fye

Whenever the AI agent speaks “Soccerfy,” it should consistently pronounce it as “Soc-ker-fye” without requiring manual script changes or workarounds.

Comparable Implementations:
- HeyGen: Custom pronunciation entries for brand and proper nouns
- ElevenLabs: Pronunciation dictionaries with phoneme-level control

These platforms demonstrate that this feature is technically feasible and significantly improves voice consistency.
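To make the requested override concrete, here is a rough sketch of how a dictionary lookup could rewrite agent text before it reaches the TTS engine. It is not how this platform works internally. It assumes the engine accepts SSML and honors the standard `<sub alias="...">` and `<phoneme alphabet="ipa" ph="...">` elements, which varies by provider. The names `PRONUNCIATIONS` and `apply_pronunciations` are made up for illustration.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical dictionary: display text -> pronunciation entry.
# "alias" is a phonetic respelling; an optional "ipa" key would hold IPA.
PRONUNCIATIONS = {
    "Soccerfy": {"alias": "soc-ker-fye"},
}


def apply_pronunciations(text, entries):
    """Return SSML in which every dictionary word carries its override."""
    if not entries:
        return "<speak>" + escape(text) + "</speak>"

    # Longest entries first so a phrase wins over a shorter word inside it.
    words = sorted(entries, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    lookup = {w.lower(): entries[w] for w in entries}

    parts = []
    pos = 0
    for match in pattern.finditer(text):
        # Escape the untouched text between matches so it stays valid XML.
        parts.append(escape(text[pos:match.start()]))
        entry = lookup[match.group(0).lower()]
        shown = escape(match.group(0))
        if "ipa" in entry:
            # Phoneme-level control when an IPA transcription is supplied.
            parts.append(
                f'<phoneme alphabet="ipa" ph={quoteattr(entry["ipa"])}>'
                f"{shown}</phoneme>"
            )
        else:
            # Otherwise substitute the phonetic respelling.
            parts.append(f'<sub alias={quoteattr(entry["alias"])}>{shown}</sub>')
        pos = match.end()
    parts.append(escape(text[pos:]))
    return "<speak>" + "".join(parts) + "</speak>"


print(apply_pronunciations("Welcome to Soccerfy!", PRONUNCIATIONS))
```

Because the substitution runs on the final text, the same dictionary would cover static call scripts and dynamic AI responses alike, which is the "apply globally" behavior requested above.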
Business Impact:
- Improves brand trust and professionalism
- Ensures consistent pronunciation across all AI agents
- Reduces manual scripting workarounds
- Enhances enterprise readiness for branded voice deployments
- Critical for agencies and businesses using AI voice at scale

Suggested Priority: High. This is essential for production-grade voice agents and branded AI deployments.

Optional Enhancements (Nice to Have):
- Per-agent pronunciation dictionaries
- Per-language pronunciation rules
- UI preview / test pronunciation button
- Bulk upload of custom words