Google Cloud Speech API
Convert speech to text with Google AI-powered transcription
About Google Cloud Speech API
Google Cloud Speech API transforms audio into text using advanced AI models. It supports over 125 languages, real-time streaming, and enterprise-grade security. Customize transcription with domain-specific models, speaker diarization, and noise robustness for accurate results.
FAQ
Language is specified within a recognition request's languageCodes parameter. For more information, see the how-to guides about performing speech recognition.
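As a rough illustration of where languageCodes sits in a request, the sketch below builds a JSON body using only the Python standard library. The field names config, autoDecodingConfig, and content are assumed from the v2 REST request shape, and build_recognize_body is a hypothetical helper, not part of any Google client library. A real request may also need a model and valid credentials.

```python
import base64
import json


def build_recognize_body(audio_bytes, language_codes):
    """Build a JSON-serializable recognize request body.

    Field names are assumed from the v2 REST API; check the official
    reference before relying on them.
    """
    if not language_codes:
        raise ValueError("at least one language code is required")
    return {
        "config": {
            # Assumption: an empty object asks the service to detect the encoding.
            "autoDecodingConfig": {},
            # The language is set here, as a list of BCP-47 codes.
            "languageCodes": list(language_codes),
        },
        # Inline audio is sent base64-encoded in a JSON body.
        "content": base64.b64encode(audio_bytes).decode("ascii"),
    }


body = build_recognize_body(b"\x00\x01", ["en-US"])
print(json.dumps(body, indent=2))
```

The helper only assembles the payload; sending it requires an authenticated HTTP call or the official client library.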
Cloud Speech-to-Text offers multiple recognition models, each tuned to a different type of audio. Some languages also have models optimized for specific audio types, such as telephony and telephony_short.
If you're new to Google Cloud, you can create an account to evaluate how Cloud Speech-to-Text performs in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
Only the language codes in Google's published list of supported languages are officially maintained and monitored by Google. Using other language codes can result in breaking changes.
Supported features include automatic punctuation, speaker diarization, model adaptation, word-level confidence (Preview), profanity filter, spoken punctuation (Preview), and spoken emoji (Preview).
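Most of these features are toggled per request. The sketch below assembles a features object with field names assumed from the v2 REST API (enableAutomaticPunctuation, diarizationConfig, and so on); build_features is a hypothetical helper, and the speaker counts are placeholder values. Model adaptation is configured separately from this object and is left out here.

```python
def build_features(punctuation=True, word_confidence=False,
                   profanity_filter=False, spoken_punctuation=False,
                   spoken_emojis=False, max_speakers=None):
    """Return a features dict for a recognition config.

    Field names are assumed from the v2 REST API and should be verified
    against the official reference.
    """
    features = {
        "enableAutomaticPunctuation": punctuation,
        "enableWordConfidence": word_confidence,
        "profanityFilter": profanity_filter,
        "enableSpokenPunctuation": spoken_punctuation,
        "enableSpokenEmojis": spoken_emojis,
    }
    # Diarization is enabled by supplying a config rather than a boolean flag.
    if max_speakers is not None:
        if max_speakers < 1:
            raise ValueError("max_speakers must be at least 1")
        features["diarizationConfig"] = {
            "minSpeakerCount": 1,
            "maxSpeakerCount": max_speakers,
        }
    return features


features = build_features(punctuation=True, max_speakers=2)
print(features)
```

Not every feature is available for every language and model, so a combination that builds cleanly here may still be rejected by the service.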
You can filter the supported languages by region, supported feature, or language. The available regions include asia-northeast1, asia-south1, asia-southeast1, eu, europe-west2, europe-west3, europe-west4, global, northamerica-northeast1, us, and us-central1.
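The region chosen affects both the resource path and, for non-global locations, the API host. The sketch below validates a location against the list above and derives both values. The "<location>-speech.googleapis.com" host pattern and the "_" default recognizer ID are assumptions based on Google's usual conventions, and both function names are hypothetical.

```python
# Regions taken from the list above.
REGIONS = {
    "asia-northeast1", "asia-south1", "asia-southeast1", "eu",
    "europe-west2", "europe-west3", "europe-west4", "global",
    "northamerica-northeast1", "us", "us-central1",
}


def endpoint_for(location):
    """Return the API host for a location (assumed naming pattern)."""
    if location not in REGIONS:
        raise ValueError(f"unknown location: {location}")
    if location == "global":
        return "speech.googleapis.com"
    return f"{location}-speech.googleapis.com"


def recognizer_path(project, location, recognizer="_"):
    """Return the recognizer resource name used in a request URL."""
    if location not in REGIONS:
        raise ValueError(f"unknown location: {location}")
    return f"projects/{project}/locations/{location}/recognizers/{recognizer}"


print(endpoint_for("us-central1"))
print(recognizer_path("my-project", "us-central1"))
```

Here "my-project" is a placeholder project ID. Keeping the host and the path derived from the same location value avoids sending a regional resource name to the global endpoint.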