Grok STT Online

Free Grok speech-to-text in your browser — no xAI account, no API key, no code. Powered by xAI scribe_v2.

Open Grok STT tool → No account  ·  No API key  ·  2 free uses/day
Model: xAI scribe_v2
Languages: 25+
Formats: 12 (MP3, WAV, M4A…)
Max file size: 25 MB
Timestamps: Phrase-level
Speaker labels: Auto-detect
Free tier: 2/day per IP
xAI WER (phone): 5.0%

What is Grok STT?

Grok Speech-to-Text is xAI's audio transcription API, built on the same voice stack that powers Grok Voice, Tesla in-car voice, and Starlink customer support. It was released on April 18, 2026 as a standalone API; requests go to POST https://api.x.ai/v1/stt.

It is not available through the Grok chat interface, which does not accept audio files. ScribeForge is a browser wrapper that calls the API on your behalf, so you can use Grok STT without writing any code or creating an xAI account.
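For readers curious what a direct call to the endpoint might look like, here is a minimal sketch that builds (but does not send) the request. Only the URL and the POST method come from this page. The multipart field name `file`, the Bearer-token header, and the helper name `build_stt_request` are assumptions for illustration and are not confirmed here.

```python
import io
import uuid
import urllib.request

STT_URL = "https://api.x.ai/v1/stt"  # endpoint as stated on this page


def build_stt_request(audio_bytes: bytes, filename: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for the STT endpoint.

    Assumptions, not confirmed by this page: the upload field is named
    "file" and the API key is sent as a Bearer token.
    """
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    # Single form part carrying the raw audio bytes.
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    body.write(b"Content-Type: application/octet-stream\r\n\r\n")
    body.write(audio_bytes)
    body.write(f"\r\n--{boundary}--\r\n".encode())

    return urllib.request.Request(
        STT_URL,
        data=body.getvalue(),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The response format is not described on this page, so parsing is left out.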

Supported formats

| Format | Extension | Common source |
| --- | --- | --- |
| MP3 | .mp3 | Podcasts, downloaded audio, voice recorders |
| WAV | .wav | Studio recordings, legal dictation |
| M4A | .m4a | iPhone voice memos, Zoom/Teams audio |
| MP4 | .mp4 | Screen recordings, video calls |
| OGG / Opus | .ogg / .opus | WhatsApp voice notes, browser recordings |
| FLAC | .flac | Lossless/archival audio |
| AAC | .aac | Android recordings, streaming captures |
| MKV | .mkv | Game clips, Loom recordings |
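A quick pre-upload check against the table above and the 25 MB limit could be sketched as follows. The extension set covers only the formats listed in the table. The function name `check_upload` is hypothetical, and reading "25 MB" as 25 × 1024 × 1024 bytes is an assumption, since the page does not say which definition applies.

```python
from pathlib import Path

# Extensions taken from the table above.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".wav", ".m4a", ".mp4", ".ogg", ".opus", ".flac", ".aac", ".mkv",
}

# Assumption: "25 MB" is interpreted as 25 * 1024 * 1024 bytes.
MAX_BYTES = 25 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()  # case-insensitive extension match
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_BYTES}")
    return problems
```

For example, `check_upload("memo.M4A", 1_000_000)` returns an empty list, while a `.txt` file or a file over the limit yields one message each.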

How to use Grok STT online — 3 steps

  1. Go to scribeforge.tech — works in any browser, desktop or mobile.
  2. Drop your audio file onto the upload zone. Any format from the table above, up to 25 MB.
  3. Get the transcript. Grok STT processes the file in 10–30 seconds and returns the full text with timestamps and speaker labels. Copy it or download it as .txt.
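To illustrate what a .txt export with timestamps and speaker labels could look like, here is a small sketch. The `Segment` shape, the speaker naming, and the `[HH:MM:SS] Speaker: text` layout are all hypothetical. The page does not document the actual response or export format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One phrase of a transcript (hypothetical shape, for illustration only)."""
    start_seconds: float
    speaker: str
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a non-negative offset in seconds as HH:MM:SS, dropping fractions."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_txt(segments: list[Segment]) -> str:
    """Join segments into plain text, one phrase per line."""
    return "\n".join(
        f"[{format_timestamp(s.start_seconds)}] {s.speaker}: {s.text}"
        for s in segments
    )
```

Calling `render_txt` on two segments produces two lines, each prefixed with the phrase start time and the speaker label.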

Grok STT vs other online STT tools

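| Tool | Engine | Free tier | API key required | Speaker labels |
| --- | --- | --- | --- | --- |
| ScribeForge | Grok scribe_v2 | 2/day | No | Yes |
| Whisper.net | OpenAI Whisper | Limited | No | No |
| Deepgram Console | Nova-2 | $200 credit | Yes | Yes |
| AssemblyAI Playground | Universal-2 | 5h/month | Yes | Yes |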

Common questions

Does Grok STT work for non-English audio?

Yes. Grok STT auto-detects the language from among 25+ supported languages, including Spanish, French, German, Italian, Portuguese, Japanese, Korean, Chinese (Mandarin), and Arabic. No language selector is required.

How accurate is Grok speech-to-text?

On xAI's phone-call benchmark, Grok scribe_v2 scores a 5.0% word error rate (WER; lower is better), versus 12.0% for ElevenLabs, 13.5% for Deepgram, and 21.3% for AssemblyAI. For general speech (podcasts, meetings), accuracy is comparable to Whisper large-v3.
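For context on what a WER figure measures: it is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of reference words. The sketch below shows the standard calculation using a word-level edit distance. It only lowercases and splits on whitespace, whereas published benchmarks usually apply their own text normalization, so it will not reproduce the figures above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

With the reference "the cat sat on the mat" and the transcript "the cat sat on mat", one of six reference words is missing, so the WER is 1/6, or about 16.7%. A 5.0% WER corresponds to roughly five such errors per 100 reference words.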

Is my audio stored after transcription?

No. The file is forwarded to xAI's API for transcription and deleted immediately afterward. ScribeForge does not store audio or transcript text beyond your browser session.

What happens after 2 free uses?

After 2 free daily transcriptions, ScribeForge shows a paywall. A one-time pack (50 transcriptions) or a monthly unlimited plan is available. The daily limit resets at midnight UTC.
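A limit of 2 uses per day per IP that resets at midnight UTC can be modeled by counting uses under a key made of the IP address and the UTC calendar date. The sketch below is illustrative only and is not ScribeForge's implementation. The class name `DailyQuota` and the in-memory dictionary are hypothetical choices.

```python
from datetime import datetime, timezone

FREE_USES_PER_DAY = 2  # free-tier limit stated on this page


class DailyQuota:
    """In-memory counter keyed by (IP, UTC date). Illustrative only."""

    def __init__(self, limit: int = FREE_USES_PER_DAY) -> None:
        self.limit = limit
        self._counts: dict[tuple[str, str], int] = {}

    def try_consume(self, ip: str, now: datetime) -> bool:
        """Record one use and return True, or return False if the limit is reached.

        `now` should be timezone-aware; it is converted to UTC so that the
        counter rolls over at midnight UTC regardless of the caller's zone.
        """
        day = now.astimezone(timezone.utc).date().isoformat()
        key = (ip, day)
        used = self._counts.get(key, 0)
        if used >= self.limit:
            return False
        self._counts[key] = used + 1
        return True
```

Because the date is part of the key, the first request after 00:00 UTC starts a fresh count without any explicit reset job. A real service would also need to expire old keys and persist counts across restarts, which this sketch omits.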

Use Grok STT online — free, no account needed.


No account · No API key · No credit card · 2 free/day
