What is Grok STT?
Grok STT (speech-to-text) is a large-scale automatic speech recognition API developed by xAI, the AI lab founded by Elon Musk. It launched on April 18, 2026. On xAI's own phone-call entity recognition benchmark it reports a 5.0% error rate (vs. ElevenLabs 12.0%, Deepgram 13.5%, AssemblyAI 21.3%), and it is built on the same audio stack that powers Grok Voice, Tesla in-car voice, and Starlink customer support. Read the full technical breakdown of Grok AI audio transcription or the xAI Grok STT & TTS API developer guide.
ScribeForge is the first dedicated web service to offer Grok transcription online — no need to build your own API integration, sign up for xAI access, or manage API keys. See the quick answer page for a 30-second overview.
How to transcribe audio with Grok online
- Go to the ScribeForge homepage.
- Drag your audio file (supported formats) onto the upload area or click to browse.
- Click Transcribe and wait a few seconds.
- Copy your transcript or download it as a .txt file.
You can try ScribeForge with no account or credit card: upload any audio file and get an instant 200-character preview of the transcript. You get 2 free previews per day.
Grok STT vs Whisper vs Deepgram
| Model | Speed | Phone-call entity error rate (xAI bench) | Languages | Timestamps | Price per hour (USD) |
|---|---|---|---|---|---|
| Grok STT (xAI) | Fast | 5.0% | 25+ | Word-level | $0.10 batch / $0.20 stream |
| ElevenLabs Scribe | Fast | 12.0% | — | Yes | ~$0.40 |
| Deepgram Nova-2 | Very fast | 13.5% | 35 | Word-level | ~$0.26 |
| AssemblyAI | Fast | 21.3% | — | Yes | ~$0.65 |
| Whisper large-v3 | Medium | n/a (different bench) | 99 | Yes | ~$0.36 |
Grok STT leads xAI's official phone-call entity recognition benchmark and has the lowest listed price per hour of the services in the table above. It performs particularly well on English conversational audio with background noise, making it a compelling choice for meeting recordings, podcasts, and voice memos. For full side-by-side benchmarks, read Grok STT vs Whisper vs Deepgram in 2026.
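To see what the per-hour prices mean for a real workload, here is a minimal Python sketch that multiplies the table's listed prices by a number of audio hours. The prices are copied from the table above (the "~" entries are approximate), and the function name `estimate_cost` is just an illustrative choice, so treat the output as a rough estimate rather than a quote.

```python
# Prices per hour of audio in USD, copied from the comparison table above.
# Entries marked "~" in the table are approximate, so results are estimates.
PRICE_PER_HOUR = {
    "Grok STT (batch)": 0.10,
    "Grok STT (stream)": 0.20,
    "Deepgram Nova-2": 0.26,
    "Whisper large-v3": 0.36,
    "ElevenLabs Scribe": 0.40,
    "AssemblyAI": 0.65,
}


def estimate_cost(hours: float) -> dict[str, float]:
    """Return the estimated cost in USD per service, rounded to cents."""
    return {name: round(price * hours, 2) for name, price in PRICE_PER_HOUR.items()}


if __name__ == "__main__":
    # Example: 50 hours of recordings, cheapest service first.
    for name, cost in sorted(estimate_cost(50).items(), key=lambda kv: kv[1]):
        print(f"{name:20s} ${cost:6.2f}")
```

These figures are the providers' per-hour API prices as listed in the table, not ScribeForge's own credit or subscription pricing.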
Frequently asked questions
Is Grok transcription free?
The preview is free. ScribeForge lets you try Grok transcription with no credit card and gives you an instant 200-character preview of any audio file. For full transcripts, buy a 50-credit pack ($9) or subscribe monthly ($19/mo).
What languages does Grok STT support?
Grok STT supports 25+ languages including English, Spanish, French, German, Italian, Portuguese, Japanese, Korean, Chinese, Arabic, Hindi, Russian, Turkish, Polish, Dutch, Indonesian, Vietnamese, Swahili, Hebrew, Thai and more. Language is detected automatically and Grok handles seamless language switching mid-stream.
How long can my audio file be?
Files up to 100 MB are supported. For MP3 at 128 kbps, that's roughly 100 minutes of audio. Uncompressed WAV runs much shorter: about 9 minutes at CD quality (16-bit, 44.1 kHz stereo). If your file is larger, compress it to MP3 or split it into smaller parts before uploading.
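If you want to check whether a recording will fit before uploading, the arithmetic is simple enough to sketch in Python. This example assumes a constant bitrate and decimal units (1 MB = 1,000,000 bytes); how ScribeForge counts megabytes is not stated here, so treat the results as approximate. The function names are illustrative.

```python
def max_minutes(size_mb: float, bitrate_kbps: float) -> float:
    """Minutes of audio that fit in size_mb megabytes at a constant bitrate.

    Assumes 1 MB = 1,000,000 bytes and 1 kbps = 1,000 bits per second.
    """
    total_bits = size_mb * 1_000_000 * 8
    seconds = total_bits / (bitrate_kbps * 1_000)
    return seconds / 60


def wav_bitrate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Bitrate of uncompressed PCM audio (ignores the small WAV header)."""
    return sample_rate_hz * bit_depth * channels / 1_000


if __name__ == "__main__":
    print(f"MP3 128 kbps:       {max_minutes(100, 128):.0f} min")
    cd_wav = wav_bitrate_kbps(44_100, 16, 2)   # CD-quality stereo WAV
    print(f"WAV 44.1 kHz/16/2:  {max_minutes(100, cd_wav):.1f} min")
    mono_wav = wav_bitrate_kbps(16_000, 16, 1)  # typical speech recording
    print(f"WAV 16 kHz/16/1:    {max_minutes(100, mono_wav):.0f} min")
```

Variable-bitrate MP3s will land somewhat above or below these numbers depending on the content.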
Is my audio private?
Yes. Your audio file is sent to xAI's API for processing and deleted from our servers immediately after transcription. We do not store or log the content of your audio. See our privacy policy.
Can I use this via API?
Yes — with a paid license key you can call POST /api/transcribe directly from your own application. See the API docs on the homepage, or read the full xAI Grok STT & TTS API guide for working Python code.
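As a rough illustration of what such a call could look like, here is a Python sketch that builds a multipart upload using only the standard library. Only the `POST /api/transcribe` path comes from this page. The base URL (`https://scribeforge.example`), the `Authorization: Bearer` header, and the `file` form field name are assumptions made for the example, so check the API docs for the real values before relying on it.

```python
import urllib.request
import uuid


def build_transcribe_request(base_url: str, license_key: str,
                             audio: bytes, filename: str) -> urllib.request.Request:
    """Build (but do not send) a multipart POST to /api/transcribe.

    Assumed, not confirmed by the docs: Bearer auth with the license key
    and a form field named "file".
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        audio,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/transcribe",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {license_key}",  # assumed auth scheme
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


if __name__ == "__main__":
    # Placeholder host and key; nothing is sent over the network here.
    req = build_transcribe_request(
        "https://scribeforge.example", "YOUR-LICENSE-KEY", b"\x00\x01", "memo.mp3"
    )
    print(req.get_method(), req.full_url)
```

To actually send the request you would pass it to `urllib.request.urlopen(req)` and read the response body. The response format is not described on this page, so inspect it before parsing.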