Definition
Speech-to-text (STT), also called automatic speech recognition (ASR), is the technology that converts spoken words into written text. In a voice agent, STT transcribes what the caller says so the system can understand and act on it.
Why it matters
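STT accuracy determines whether the agent understands the caller. Every later step works from the transcript, so a misheard word can lead to the wrong response. This matters most in noisy environments and with accented speech, where recognition errors are more likely.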
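One common way to put a number on STT accuracy is word error rate (WER): the count of word substitutions, insertions, and deletions needed to turn the transcript into what was actually said, divided by the number of words actually said. The sketch below is a minimal illustration, not any provider's implementation. The function name `word_error_rate` and the sample phrases are hypothetical, and it assumes simple lowercase, whitespace-based word splitting.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    reference:  what the caller actually said (human-verified transcript)
    hypothesis: what the STT system produced
    Assumption: words are compared after lowercasing and splitting on whitespace.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from transcript
                curr[j - 1] + 1,     # extra word inserted by the transcript
                prev[j - 1] + cost,  # match, or one word swapped for another
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    said = "cancel my appointment for tuesday"
    heard = "cancel my apartment for tuesday"
    print(word_error_rate(said, heard))  # 0.2
```

In this example a single misheard word out of five gives a WER of 0.2, yet it changes what the caller is asking for, which is why even small error rates can matter in a voice agent.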