Powered by xAI Grok

Audio Transcription & Text-to-Speech

Drop in an audio file and get an accurate transcript in seconds, or type text and hear it in one of 5 Grok voices. 2 free uses/day, with no account and no credit card.

🎙 Transcribe audio
🔊 Text to speech
Drop audio here or click to browse

MP3, WAV, FLAC, M4A, OGG, OPUS, WEBM, AAC · up to 25MB · transcript ready in seconds

Voices: Eve (warm), Ara (calm), Rex (bold), Sal (clear), Leo (deep)

License key (50 credits or monthly)

Simple, honest pricing

No account. Your license key is delivered instantly at checkout. Paste it into the license key box and start immediately.

Free

$0
  • 2 transcriptions / day
  • 2 TTS generations / day
  • Up to 25MB audio
  • 500 chars for TTS
  • No account needed

Monthly

$19/mo
  • Unlimited use (capped at 200/day)
  • 5000 chars for TTS
  • All voices unlocked
  • Priority processing
  • Cancel anytime

~$0.63/day for unlimited · less than a coffee
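
The limits of the two tiers can be read as a small configuration table. The sketch below is illustrative only: the `Plan` class and `check_tts_request` helper are hypothetical names, not part of the service, and it assumes text-to-speech uses are counted on their own (the page does not say whether transcription and TTS share a counter).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    daily_tts: int       # max TTS generations per day
    tts_max_chars: int   # max characters per TTS request


# Numbers come from the pricing tiers; the structure itself is an assumption.
FREE = Plan("free", daily_tts=2, tts_max_chars=500)
MONTHLY = Plan("monthly", daily_tts=200, tts_max_chars=5000)


def check_tts_request(plan: Plan, uses_today: int, text: str) -> Optional[str]:
    """Return an error message if the request exceeds the plan's limits, else None."""
    if uses_today >= plan.daily_tts:
        return f"Daily limit of {plan.daily_tts} TTS generations reached"
    if len(text) > plan.tts_max_chars:
        return f"Text exceeds {plan.tts_max_chars} characters"
    return None
```

For example, `check_tts_request(FREE, 2, "hello")` returns a limit message, while the same call with `MONTHLY` returns `None`.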

How it works

1. Drop your file

Upload an audio file in any supported format, up to 25MB. No account or email required.

2. Grok transcribes

xAI's Grok STT model processes the audio with high accuracy.

3. Copy or download

Get your transcript with speaker segments and timestamps.

4. Pay when ready

Buy a credit pack or subscribe monthly if you need more uses.
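
Step 3 mentions speaker segments and timestamps. The actual output format is not documented on this page, so the following is only a sketch of how such a transcript might be represented and rendered as text. The `Segment` fields and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    speaker: str   # hypothetical label, e.g. "Speaker 1"
    start: float   # segment start, in seconds
    end: float     # segment end, in seconds
    text: str


def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as MM:SS, dropping fractions of a second."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"


def render_transcript(segments: List[Segment]) -> str:
    """Render one line per segment: [MM:SS] speaker: text."""
    return "\n".join(
        f"[{format_timestamp(s.start)}] {s.speaker}: {s.text}" for s in segments
    )
```

A segment starting at 75.4 seconds would be rendered with the timestamp `01:15`.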

FAQ

What audio formats are supported?

MP3, WAV, FLAC, M4A, OGG, OPUS, WEBM, AAC, AIFF and MP4 video (audio extracted). Maximum 25MB per file.
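
A pre-upload check against this format list and size limit could look roughly like the sketch below. It assumes that formats are identified by file extension and that "25MB" means 25 × 1024 × 1024 bytes; neither is stated on this page, and `validate_upload` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional

# Assumption: "25MB" is interpreted as 25 MiB.
MAX_BYTES = 25 * 1024 * 1024

# Extensions derived from the supported-format list above.
ALLOWED_EXTENSIONS = {
    ".mp3", ".wav", ".flac", ".m4a", ".ogg",
    ".opus", ".webm", ".aac", ".aiff", ".mp4",
}


def validate_upload(filename: str, size_bytes: int) -> Optional[str]:
    """Return an error message for an unsupported or oversized file, else None."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        return f"Unsupported format: {ext or '(none)'}"
    if size_bytes > MAX_BYTES:
        return "File exceeds 25MB"
    return None
```

Checking by extension is only a first filter; a real service would likely also inspect the file contents.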

Do you store my audio?

No. Audio is processed in memory and the temporary file is deleted immediately after transcription. We never persist your audio files.
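
One common way to get this "delete immediately" behavior is to wrap the transcription call in a `try`/`finally` block, so the temporary file is removed even if transcription fails. This is a generic sketch, not the service's actual code; `transcribe_and_discard` and the `transcribe` callback are hypothetical.

```python
import os
import tempfile
from typing import Callable


def transcribe_and_discard(audio: bytes, transcribe: Callable[[str], str]) -> str:
    """Write audio to a temp file, transcribe it, and always delete the file."""
    fd, path = tempfile.mkstemp(suffix=".audio")
    try:
        # Close the file before handing its path to the transcriber.
        with os.fdopen(fd, "wb") as f:
            f.write(audio)
        return transcribe(path)
    finally:
        # Runs on success and on error, so the audio is never left on disk.
        os.remove(path)
```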

What happens to my license key?

Your license key is delivered instantly at checkout and stored only in your browser's local storage. No account is required, and email is optional: you can provide one at checkout to receive a backup copy of your key.

Do credits expire?

No. The 50-credit pack never expires. Monthly subscriptions renew each month and you can cancel anytime.

What AI model powers this?

Speech-to-text is powered by xAI's Grok STT model. Text-to-speech uses xAI's Grok TTS model with 5 natural voices.

How accurate is the transcription?

Grok STT is competitive with the best models on the market, supporting 50+ languages. Accuracy varies by audio quality and accent — clear recordings typically yield >95% accuracy.