Grok Transcription Online: Audio to Text with xAI STT

Upload any audio file and get an accurate transcript in seconds, powered by xAI's Grok speech-to-text model. Free to try, no account required.

25+ languages

Grok STT automatically detects the language and delivers accurate results across major world languages, with seamless mid-stream language switching.

Timestamps & speakers

Get segment-level timestamps, plus speaker labels where speakers can be distinguished in the audio.

All common formats

MP3, WAV, FLAC, M4A, OGG, OPUS, WEBM, AAC, AIFF. Up to 100 MB per file.

No audio stored

Your file is deleted immediately after processing. We never keep a copy of your audio.

What is Grok STT?

Grok STT (speech-to-text) is a large-scale automatic speech recognition API developed by xAI, the AI lab founded by Elon Musk. It launched on April 18, 2026. On its own phone-call entity recognition benchmark, xAI reports a 5.0% error rate for Grok STT (vs. ElevenLabs 12.0%, Deepgram 13.5%, AssemblyAI 21.3%). The model is built on the same audio stack that powers Grok Voice, Tesla in-car voice and Starlink customer support. Read the full technical breakdown of Grok AI audio transcription or the xAI Grok STT & TTS API developer guide.

ScribeForge is the first dedicated web service to offer Grok transcription online — no need to build your own API integration, sign up for xAI access, or manage API keys. See the quick answer page for a 30-second overview.

How to transcribe audio with Grok online

  1. Go to the ScribeForge homepage.
  2. Drag your audio file (in any supported format) onto the upload area, or click to browse.
  3. Click Transcribe and wait a few seconds.
  4. Copy your transcript or download it as a .txt file.

You can try ScribeForge with no account or credit card: upload any supported audio file and get an instant 200-character preview, up to 2 free previews per day.

Grok STT vs Whisper vs Deepgram

Model             | Speed     | Phone-call WER (xAI bench) | Languages | Timestamps | Price/hour
Grok STT (xAI)    | Fast      | 5.0%                       | 25+       | Word-level | $0.10 batch / $0.20 stream
ElevenLabs Scribe | Fast      | 12.0%                      |           | Yes        | ~$0.40
Deepgram Nova-2   | Very fast | 13.5%                      | 35        | Word-level | ~$0.26
AssemblyAI        | Fast      | 21.3%                      |           | Yes        | ~$0.65
Whisper large-v3  | Medium    | n/a (different bench)      | 99        | Yes        | ~$0.36

Grok STT leads xAI's official phone-call entity recognition benchmark and has the lowest listed price per hour of the services compared above. It performs particularly well on English conversational audio with background noise, which makes it a compelling choice for meeting recordings, podcasts, and voice memos. For full side-by-side benchmarks, read Grok STT vs Whisper vs Deepgram in 2026.

Frequently asked questions

Is Grok transcription free?

Yes — ScribeForge lets you try Grok transcription free with no credit card. You get an instant 200-character preview of any audio file. For full transcripts, buy a 50-credit pack ($9) or subscribe monthly ($19/mo).

What languages does Grok STT support?

Grok STT supports 25+ languages including English, Spanish, French, German, Italian, Portuguese, Japanese, Korean, Chinese, Arabic, Hindi, Russian, Turkish, Polish, Dutch, Indonesian, Vietnamese, Swahili, Hebrew, Thai and more. Language is detected automatically and Grok handles seamless language switching mid-stream.

How long can my audio file be?

Files up to 100 MB are supported. For MP3 at 128 kbps, that is roughly 100 minutes of audio. Uncompressed WAV reaches the limit much sooner (under 10 minutes at CD quality). If your file is larger, compress it to MP3 or split it into parts before uploading.
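
The rough figures above follow from simple bitrate arithmetic. The sketch below assumes 1 MB = 1,000,000 bytes and a constant bitrate, and ignores container overhead; the CD-quality WAV case (16-bit stereo at 44.1 kHz) is an illustrative assumption, not a figure from ScribeForge.

```python
def max_minutes(size_mb: float, bitrate_kbps: float) -> float:
    """Minutes of audio that fit in size_mb megabytes at a constant bitrate."""
    size_bits = size_mb * 1_000_000 * 8          # assumes decimal megabytes
    seconds = size_bits / (bitrate_kbps * 1000)  # kbps -> bits per second
    return seconds / 60


LIMIT_MB = 100  # per-file upload limit stated above

# MP3 encoded at a constant 128 kbps
mp3_128 = max_minutes(LIMIT_MB, 128)

# Uncompressed PCM WAV (assumed CD quality):
# 44,100 samples/s * 16 bits * 2 channels = 1411.2 kbps
wav_cd_kbps = 44_100 * 16 * 2 / 1000
wav_cd = max_minutes(LIMIT_MB, wav_cd_kbps)

print(f"MP3 at 128 kbps: about {mp3_128:.0f} minutes")
print(f"WAV at CD quality: about {wav_cd:.1f} minutes")
```

Lower bitrates or mono recordings stretch the limit further, so a long voice memo re-encoded as a low-bitrate mono MP3 will usually fit without splitting.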

Is my audio private?

Yes. Your audio file is sent to xAI's API for processing and deleted from our servers immediately after transcription. We do not store or log the content of your audio. See our privacy policy.

Can I use this via API?

Yes — with a paid license key you can call POST /api/transcribe directly from your own application. See the API docs on the homepage, or read the full xAI Grok STT & TTS API guide for working Python code.
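
As a rough illustration, here is a minimal sketch of building such a request in Python with only the standard library. Only the POST /api/transcribe path, the format list and the 100 MB limit come from this page. The host (a placeholder), the multipart field name "file", and the Bearer-style Authorization header are assumptions made for illustration, so check the API docs for the real values.

```python
import mimetypes
import os
import urllib.request
import uuid

BASE_URL = "https://example.invalid"  # ASSUMPTION: placeholder host, not the real one
ENDPOINT = "/api/transcribe"          # path named on this page
FILE_FIELD = "file"                   # ASSUMPTION: multipart form field name
MAX_BYTES = 100 * 1_000_000           # 100 MB limit, assuming decimal megabytes
ALLOWED = {".mp3", ".wav", ".flac", ".m4a", ".ogg", ".opus", ".webm", ".aac", ".aiff"}


def build_request(filename: str, audio: bytes, license_key: str) -> urllib.request.Request:
    """Validate the file locally, then build (but do not send) the upload request."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED:
        raise ValueError(f"unsupported format: {ext or filename}")
    if len(audio) > MAX_BYTES:
        raise ValueError("file exceeds the 100 MB limit")

    # Hand-rolled multipart/form-data body with a single file part.
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{FILE_FIELD}"; '
        f'filename="{os.path.basename(filename)}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()

    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=head + audio + tail,
        method="POST",
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "Authorization": f"Bearer {license_key}",  # ASSUMPTION: auth scheme
        },
    )


# Dummy bytes stand in for real audio; nothing is sent over the network here.
req = build_request("memo.mp3", b"\x00" * 16, "YOUR_LICENSE_KEY")
print(req.get_method(), req.full_url)
# To send it for real: urllib.request.urlopen(req).read()
```

Checking the extension and size locally before uploading avoids a wasted round trip for files the service would reject anyway.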

Learn more about Grok audio

Quick answer: Can Grok (xAI) transcribe audio?
Audio formats: Supported audio formats for Grok STT
Comparison: Grok STT vs Whisper vs Deepgram
Developer guide: xAI Grok STT & TTS API: Python guide