Use case · Updated April 30, 2026

How to Transcribe Audio with Grok — No API Key, Free

Use the xAI Grok STT engine in your browser without an account, API key, or code. Drop your file and read the transcript in about 30 seconds.

No API key needed. ScribeForge is a browser wrapper around the xAI Grok STT API. You upload your audio, Grok transcribes it, and you get the text back — with timestamps and speaker labels. Two uses free every day, no signup.

Drop your audio file and transcribe it now.

Transcribe with Grok — free →

No account · No API key · 2 free uses/day per IP

Why you can use Grok STT without an API key

The Grok Speech-to-Text API launched on April 18, 2026. Using it directly requires an xAI account, API access, and writing multipart upload code. ScribeForge handles all of that — you get the same Grok STT output in a browser drag-and-drop interface.
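For comparison, here is a rough sketch of the multipart upload code the direct route involves, using only the Python standard library. The endpoint and the scribe_v2 model name are the ones quoted on this page; the form field names ("file", "model") and the Bearer authorization header are assumptions for illustration, not confirmed xAI parameters. The sketch only builds the request and does not send it.

```python
import urllib.request
import uuid


def build_multipart(fields, file_field, filename, file_bytes,
                    content_type="application/octet-stream"):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}\r\n".encode())
        parts.append(
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        parts.append(f"{value}\r\n".encode())
    parts.append(f"--{boundary}\r\n".encode())
    parts.append(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'.encode())
    parts.append(f"Content-Type: {content_type}\r\n\r\n".encode())
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_stt_request(audio_bytes, filename, api_key):
    """Build (but do not send) a POST request for the STT endpoint."""
    # ASSUMPTION: the field names "model" and "file" are hypothetical.
    body, ctype = build_multipart(
        {"model": "scribe_v2"}, "file", filename, audio_bytes)
    return urllib.request.Request(
        "https://api.x.ai/v1/stt",
        data=body,
        # ASSUMPTION: Bearer-token auth is a guess at the auth scheme.
        headers={"Content-Type": ctype,
                 "Authorization": f"Bearer {api_key}"},
        method="POST",
    )
```

Sending the request would be a matter of passing it to urllib.request.urlopen and decoding the response. ScribeForge does the equivalent for you on every upload.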

Under the hood, ScribeForge calls POST https://api.x.ai/v1/stt with the scribe_v2 model, parses the word-level timestamps, groups them into readable segments, and shows them in your browser. The audio is processed and then immediately discarded — nothing is stored.
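The grouping step could look something like the sketch below. It is not ScribeForge's actual code: the shape of each word entry (text, start, end, speaker) and the 0.8-second pause threshold are assumptions chosen for illustration. A new segment starts whenever the speaker changes or the pause between words exceeds the threshold.

```python
def group_words(words, max_gap=0.8):
    """Group word-level entries into segments.

    Each word is a dict with "text", "start", "end" (seconds) and
    "speaker" keys. This shape is hypothetical, not the documented
    Grok STT response format.
    """
    segments = []
    for w in words:
        last = segments[-1] if segments else None
        same_speaker = last is not None and last["speaker"] == w["speaker"]
        if same_speaker and w["start"] - last["end"] <= max_gap:
            # Same speaker, short pause: extend the current segment.
            last["text"] += " " + w["text"]
            last["end"] = w["end"]
        else:
            # Speaker change or long pause: start a new segment.
            segments.append({
                "speaker": w["speaker"],
                "start": w["start"],
                "end": w["end"],
                "text": w["text"],
            })
    return segments


def format_segment(seg):
    """Render one segment as '[MM:SS] Speaker N: text'."""
    total = int(seg["start"])
    return f"[{total // 60:02d}:{total % 60:02d}] Speaker {seg['speaker']}: {seg['text']}"
```

A pause-based rule keeps segments aligned with natural phrasing; a fixed word count per segment would be simpler but tends to cut sentences in awkward places.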

Supported audio formats

| Format | Extension | Typical source |
|---|---|---|
| MP3 | .mp3 | Podcasts, voice recordings, downloaded audio |
| WAV | .wav | Studio recordings, interviews, dictation |
| M4A | .m4a | iPhone voice memos, Zoom/Teams audio exports |
| MP4 | .mp4 | Screen recordings, YouTube downloads |
| OGG / Opus | .ogg / .opus | WhatsApp voice notes, browser recordings |
| FLAC | .flac | Lossless audio, archival recordings |
| AAC | .aac | Android voice notes, streaming captures |
| MKV | .mkv | Video calls, game recordings |

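All 12 formats accepted by Grok STT are supported; the table above lists the most common ones. File size limit: 25 MB per upload, roughly 25 minutes of MP3 at a constant 128 kbps. Duration at a given size depends on bitrate, not on whether the file is mono or stereo.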
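To see how the 25 MB limit translates into minutes at other bitrates, the arithmetic can be sketched as follows. It assumes a constant bitrate and counts 1 MB as 1,000,000 bytes; container overhead and variable-bitrate encoding will shift the real figures slightly.

```python
def max_duration_seconds(size_limit_bytes, bitrate_kbps):
    """Longest constant-bitrate audio that fits within size_limit_bytes."""
    # bytes -> bits, divided by bits per second
    return size_limit_bytes * 8 / (bitrate_kbps * 1000)


LIMIT_BYTES = 25_000_000  # ASSUMPTION: 25 MB taken as 25 * 10^6 bytes

for kbps in (64, 128, 192, 320):
    minutes = max_duration_seconds(LIMIT_BYTES, kbps) / 60
    print(f"{kbps:>3} kbps: about {minutes:.1f} minutes")
```

At 128 kbps this gives a little over 26 minutes, so the 25-minute figure leaves some headroom.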

Step-by-step: transcribe audio with Grok in 3 steps

  1. Open scribeforge.tech in any browser — desktop or mobile, no install.
  2. Drop your audio file onto the upload zone, or click to browse. Any format from the table above.
  3. Read the transcript. Grok STT returns the full text in 10–30 seconds, with phrase-level timestamps and per-speaker labels. Copy or download as .txt.

What Grok STT returns

Common questions

Do I need an xAI account to transcribe?

No. ScribeForge provides access to Grok STT without requiring you to sign up for an xAI account or manage an API key. Free tier: 2 transcriptions per day per IP. For unlimited use, a paid plan is available.

Is this the same as talking to Grok in the chat interface?

No. The Grok chat interface does not accept audio uploads — it is a text and image interface. The Grok STT API (POST /v1/stt) is a separate product launched April 18, 2026. ScribeForge uses that API.

How accurate is Grok STT?

On its own phone-call benchmark, xAI reports a 5.0% word error rate for Grok STT, versus 12.0% for ElevenLabs and 13.5% for Deepgram. For general speech (podcasts, meetings, voice memos), accuracy is comparable to Whisper large-v3.

Is my audio stored anywhere?

No. Audio is forwarded to xAI's Grok STT API for processing and deleted immediately after the transcript is returned. ScribeForge does not store audio or transcripts beyond your browser tab.

What if my file is over 25 MB?

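Split it with ffmpeg: ffmpeg -i input.mp3 -f segment -segment_time 1500 -c copy chunk%03d.mp3. This cuts the file into 25-minute (1,500-second) chunks without re-encoding. At 128 kbps each chunk comes to about 24 MB; for higher-bitrate files, lower -segment_time so every chunk stays under 25 MB. Transcribe the chunks individually.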
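If you would rather not work out the segment length by hand, a small helper can derive it from the file's bitrate. This is a sketch under stated assumptions: the bitrate is constant and known, 25 MB is counted as 25,000,000 bytes, and the 10% safety margin is an arbitrary choice. The function name segment_command is made up for this example; the ffmpeg flags are the same ones used in the command above.

```python
import math
import os


def segment_command(src, bitrate_kbps, limit_bytes=25_000_000, margin=0.9):
    """Build an ffmpeg command that splits src into chunks under the limit.

    Assumes constant bitrate; margin leaves headroom for container overhead.
    """
    seconds = math.floor(limit_bytes * margin * 8 / (bitrate_kbps * 1000))
    ext = os.path.splitext(src)[1]  # keep the original extension, e.g. ".mp3"
    return [
        "ffmpeg", "-i", src,
        "-f", "segment",
        "-segment_time", str(seconds),
        "-c", "copy",  # copy the stream, no re-encoding
        f"chunk%03d{ext}",
    ]


print(" ".join(segment_command("input.mp3", 320)))
```

The returned list can be passed to subprocess.run(cmd, check=True) on a machine with ffmpeg installed. For a 320 kbps file the helper picks 562-second chunks, well below the 1,500 seconds that suit a 128 kbps file.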

Ready to transcribe your audio with Grok?

Open ScribeForge — free →

No account · No API key · No credit card

Related guides