Free · No account needed

Transcribe audio with Grok, instantly.

Drop any audio file and get an accurate transcript — with timestamps and speaker labels — in seconds. ScribeForge uses xAI Grok to transcribe audio to text with state-of-the-art accuracy across 25+ languages. No account needed, no data retained.

xAI Grok STT · 25+ languages · Audio deleted immediately · Stripe-secured
Quick answer · Updated April 28, 2026

Can Grok (xAI) transcribe audio?

Yes. xAI launched the Grok Speech-to-Text API on April 18, 2026 — it transcribes audio in 25+ languages with word-level timestamps and speaker diarization. ScribeForge runs the Grok STT API in your browser: drop an audio file, get a transcript in seconds. No code, no account.

↓ Try free below · Read the full answer → · Supported formats guide →
Developer Guides

Grok Audio Transcription & TTS Guides

Explore our developer guides on xAI Grok STT, TTS voices, and audio transcription — including comparisons with Whisper and Deepgram, format tips, and API walkthroughs.

  • STT Comparison: Grok STT vs Whisper vs Deepgram in 2026
  • Developer Guide: xAI Grok STT & TTS API: Developer Guide
  • xAI API: xAI Grok TTS Voices: Eve, Ara, Rex, Sal, Leo
  • Audio Formats: Grok STT: Supported Audio Formats

Drop audio here, or tap to browse.
MP3 · WAV · FLAC · M4A · OGG · OPUS · WEBM · AAC · AIFF · MP4 · max 25 MB

Simple pricing

No account. Your key is ready the moment you pay.

Free
$0
Instant preview, always.
  • 200-character preview per file
  • Up to 25 MB audio
  • 25+ languages, auto-detected
  • Speaker labels + timestamps
  • No account needed
Monthly
$19 /mo
200 transcriptions/day. Cancel anytime.
  • Up to 200 transcriptions per day
  • Up to 25 MB per file
  • Timestamps + speaker labels
  • No per-use credit tracking
  • Cancel anytime
~$0.63/day

How it works

1. Drop your file
Any supported audio format up to 25 MB. No account needed.

2. Grok transcribes
xAI Grok STT processes your audio with high accuracy across 25+ languages.

3. Copy or download
Full transcript with timestamps and speaker labels. Ready in seconds.

4. Upgrade when ready
Buy a credit pack or subscribe. Key delivered instantly by email.

FAQ

What audio formats are supported?

MP3, WAV, FLAC, M4A, OGG, OPUS, WEBM, AAC, AIFF and MP4. Maximum 25 MB per file. See the complete Grok audio formats guide for quality tips and conversion instructions.
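
If you want to check a file before uploading, the format and size rules above can be sketched as a small pre-flight check. This is an illustration only, not ScribeForge's actual validation: the function name `check_upload` is hypothetical, the check looks only at the file extension, and it assumes "25 MB" means 25 × 1024 × 1024 bytes, which the page does not specify.

```python
from pathlib import Path

# Extensions taken from the supported-formats list above.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".wav", ".flac", ".m4a", ".ogg",
    ".opus", ".webm", ".aac", ".aiff", ".mp4",
}

# Assumption: "25 MB" means 25 * 1024 * 1024 bytes; the page does not say.
MAX_BYTES = 25 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        size_mb = size_bytes / (1024 * 1024)
        problems.append(f"file is {size_mb:.1f} MB; the limit is 25 MB")
    return problems
```

For example, `check_upload("interview.MP3", 10_000_000)` returns an empty list, while a `.txt` file or a file over the limit returns a message describing the problem.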

Do you store my audio?

No. Audio is processed in memory and discarded immediately after transcription. Nothing is stored.

What happens to my license key?

It is stored only in your browser's local storage. No account is needed, and the email at checkout is optional.

Do credits expire?

No. The 50-credit pack never expires. Monthly plans renew each month and can be cancelled anytime.

What AI model powers this?

xAI's Grok Speech-to-Text API — the same audio stack powering Grok Voice, Tesla in-car voice and Starlink customer support. Read the independent Grok STT vs Whisper vs Deepgram benchmark.

Does it detect multiple speakers?

Yes. Grok STT returns speaker labels and timestamps per segment automatically. See how timestamped transcripts work.
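
To show what a per-segment transcript can look like once rendered, here is a minimal sketch that turns segments into timestamped, speaker-labelled lines. The `Segment` shape and its field names are hypothetical and chosen for illustration; the real Grok STT response format may differ.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical shape: the real Grok STT response fields may differ.
    start: float   # seconds from the start of the audio
    speaker: str   # speaker label, e.g. "Speaker 1"
    text: str      # transcribed text for this segment


def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS, dropping fractions of a second."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_transcript(segments: list[Segment]) -> str:
    """Produce one line per segment in the form: [timestamp] speaker: text."""
    return "\n".join(
        f"[{format_timestamp(s.start)}] {s.speaker}: {s.text}" for s in segments
    )
```

Given two segments starting at 0.0 and 4.2 seconds, `render_transcript` produces one line beginning `[00:00:00]` and one beginning `[00:00:04]`, each followed by the speaker label and the text.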

What languages are supported?

25+ languages including English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Arabic, Hindi and more. Detected automatically.