If you are evaluating xAI Grok Speech-to-Text, there are really two pricing layers to understand: the raw xAI API cost, and the packaged ScribeForge browser workflow that lets you use the same model without touching an API key.
Quick answer: xAI charges $0.10/hour for batch STT and $0.20/hour for real-time streaming. The xAI batch API accepts up to 500 MB per file. ScribeForge repackages Grok STT as 2 free previews/day, $9 once for 50 transcripts, or $19/month with a 200/day cap, and keeps the workflow entirely in the browser.
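The underlying xAI pricing is simple: $0.10 per hour of audio for batch transcription and $0.20 per hour of audio for real-time streaming.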
That is infrastructure pricing. It works well if you are a developer, already have xAI credentials, and want to integrate Grok STT directly into a product, script, or backend job.
| Layer | Price model | Best for |
|---|---|---|
| xAI batch API | $0.10/hour of audio | Backend jobs, custom tools, scripts |
| xAI streaming API | $0.20/hour of audio | Real-time apps, voice agents, live capture |
| ScribeForge | 2 free previews/day, $9/50, $19 monthly | No-code browser workflow, occasional and prosumer use |
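To make the raw API rates concrete, here is a minimal Python sketch that estimates cost from audio duration. It uses only the two rates quoted in this article and assumes billing scales linearly with duration, with no minimum charge or rounding; that assumption is not confirmed here, so check xAI's current pricing page before relying on it. The function name `xai_cost_usd` is illustrative and not part of any SDK.

```python
# Rates quoted in this article (USD per hour of audio).
XAI_BATCH_USD_PER_HOUR = 0.10
XAI_STREAMING_USD_PER_HOUR = 0.20


def xai_cost_usd(audio_minutes: float, streaming: bool = False) -> float:
    """Estimate raw xAI STT cost for a given amount of audio.

    Assumption: cost is strictly proportional to duration
    (no minimum charge, no per-request fee, no rounding up).
    """
    rate = XAI_STREAMING_USD_PER_HOUR if streaming else XAI_BATCH_USD_PER_HOUR
    return round(audio_minutes / 60 * rate, 4)


if __name__ == "__main__":
    for minutes in (30, 90, 600):
        batch = xai_cost_usd(minutes)
        live = xai_cost_usd(minutes, streaming=True)
        print(f"{minutes:>4} min  batch ${batch:.2f}  streaming ${live:.2f}")
```

Under these assumptions, even ten hours of batch audio costs about a dollar, which is why the comparison with ScribeForge is more about workflow than raw usage cost.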
The raw model and the browser product do not have the same constraints.
xAI batch API: up to 500 MB per batch request. Better if you control your own upload flow and can handle retries, auth, and parsing.
ScribeForge browser upload: up to 100 MB per file. Better if you want to drop a file and get usable text immediately.
ScribeForge free previews: 2 per day, no account or card. Good for checking transcript quality before paying for full output.
For common speech audio, 100 MB is already enough for many real jobs. A 128 kbps MP3 fits roughly 100 minutes of spoken audio inside that cap. For larger files, compress to MP3 or split the file with ffmpeg first. If you need the full raw ceiling, the xAI API remains the more flexible route.
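The size arithmetic and the ffmpeg preparation step can be sketched in Python as follows. This is a rough estimate, not a guarantee: it assumes constant-bitrate audio, decimal megabytes (1 MB = 1,000,000 bytes), and ignores container overhead. The helper names and file names are hypothetical; the ffmpeg options used (`-vn`, `-ac`, `-b:a`, `-f segment`, `-segment_time`, `-c copy`) are standard ffmpeg flags, and the functions only build the command lists rather than running them.

```python
SCRIBEFORGE_LIMIT_MB = 100  # browser upload cap quoted in this article


def minutes_that_fit(bitrate_kbps: float, limit_mb: float = SCRIBEFORGE_LIMIT_MB) -> float:
    """Minutes of constant-bitrate audio that fit in limit_mb.

    Assumes 1 MB = 1,000,000 bytes and no container overhead.
    """
    limit_bits = limit_mb * 1_000_000 * 8
    seconds = limit_bits / (bitrate_kbps * 1000)
    return seconds / 60


def compress_cmd(src: str, dst: str, bitrate_kbps: int = 64) -> list[str]:
    """ffmpeg arguments to re-encode as mono audio at a lower bitrate."""
    return ["ffmpeg", "-i", src, "-vn", "-ac", "1", "-b:a", f"{bitrate_kbps}k", dst]


def split_cmd(src: str, pattern: str, segment_seconds: int) -> list[str]:
    """ffmpeg arguments to cut the input into fixed-length pieces without re-encoding."""
    return [
        "ffmpeg", "-i", src,
        "-f", "segment", "-segment_time", str(segment_seconds),
        "-c", "copy", pattern,
    ]


if __name__ == "__main__":
    for kbps in (64, 128, 192):
        print(f"{kbps} kbps: about {minutes_that_fit(kbps):.0f} min in {SCRIBEFORGE_LIMIT_MB} MB")
    print(" ".join(compress_cmd("interview.wav", "interview.mp3")))
    print(" ".join(split_cmd("interview.mp3", "part_%03d.mp3", 3600)))
```

To actually run one of the commands, pass the list to `subprocess.run(cmd, check=True)` on a machine with ffmpeg installed. Dropping to 64 kbps mono roughly doubles the duration that fits under the cap, which is usually acceptable for speech.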
ScribeForge is not metered per minute. It is packaged around fast browser usage and simple buying.
| Plan | Price | What you get |
|---|---|---|
| Free | $0 | 2 free previews/day, up to 100 MB, no account or API key |
| 50 Credits | $9 one-time | 50 full transcripts, credits never expire |
| Monthly | $19/mo | Up to 200 transcripts/day, cancel anytime |
The commercial difference is not just the sticker price. With ScribeForge, you are paying for the packaging: browser upload, transcript view, timestamps, search, jump-to-time workflow, and no signup flow.
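For readers who want to put a number on that packaging premium, the following Python sketch compares the credit pack with raw batch usage. It assumes one credit covers one full transcript regardless of length and that the raw API bills proportionally to duration; both are simplifications based on the prices quoted in this article, and the function names are illustrative only.

```python
XAI_BATCH_USD_PER_HOUR = 0.10   # raw batch rate quoted in this article
CREDIT_PACK_USD = 9.00          # ScribeForge one-time pack
CREDIT_PACK_TRANSCRIPTS = 50
MONTHLY_USD = 19.00             # ScribeForge monthly plan


def price_per_transcript() -> float:
    """Effective ScribeForge price per transcript on the credit pack."""
    return round(CREDIT_PACK_USD / CREDIT_PACK_TRANSCRIPTS, 2)


def breakeven_minutes(per_transcript_usd: float) -> float:
    """Audio length at which raw batch usage costs the same as one credit."""
    return round(per_transcript_usd / XAI_BATCH_USD_PER_HOUR * 60, 1)


def monthly_plan_in_api_hours() -> float:
    """Hours of raw batch audio the monthly fee would buy at the batch rate."""
    return round(MONTHLY_USD / XAI_BATCH_USD_PER_HOUR, 1)


if __name__ == "__main__":
    per = price_per_transcript()
    print(f"Credit pack: ${per:.2f} per transcript")
    print(f"Break-even audio length: {breakeven_minutes(per)} minutes")
    print(f"$19/month equals about {monthly_plan_in_api_hours()} hours of raw batch audio")
```

Under these assumptions, a credit costs more than raw usage for any file shorter than about 108 minutes, so the pack only makes sense if the browser workflow and the saved integration time are worth the difference to you.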
| Question | xAI API direct | ScribeForge |
|---|---|---|
| Need your own API key? | Yes | No |
| Need code or backend setup? | Usually yes | No |
| Max single upload | 500 MB | 100 MB |
| Best for | Developers and product teams | Founders, operators, occasional users, fast validation |
| Pricing shape | Usage-based | Simple fixed packages |
| Transcript workflow in browser | You build it | Included |
If your real need is "I want the transcript now and I do not want to wire a speech API," ScribeForge is the better product shape. If your real need is "I am building this into my own system," use the raw API.
Use the raw xAI API if you need streaming, backend automation, your own storage rules, or the full 500 MB batch ceiling.
Use ScribeForge if you want Grok STT without API setup, with timestamps and the transcript workflow already done for you.
Start with the free previews if you want to evaluate Grok STT quality first and decide later whether to integrate the raw API.
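The selection criteria above can be condensed into a small routing function. This is only a sketch of the decision logic described in this article: the function name, parameters, and return strings are hypothetical, and the two size limits are the ones quoted here.

```python
XAI_BATCH_LIMIT_MB = 500      # raw batch ceiling quoted in this article
SCRIBEFORGE_LIMIT_MB = 100    # browser upload cap quoted in this article


def recommend(
    file_size_mb: float,
    needs_streaming: bool = False,
    needs_backend_automation: bool = False,
) -> str:
    """Suggest a route for one transcription job based on the article's criteria."""
    if needs_streaming:
        # Real-time use is only available through the streaming API.
        return "xAI streaming API"
    if file_size_mb > XAI_BATCH_LIMIT_MB:
        # Too large for either route as a single upload.
        return "compress or split the file first"
    if needs_backend_automation or file_size_mb > SCRIBEFORGE_LIMIT_MB:
        return "xAI batch API"
    return "ScribeForge"


if __name__ == "__main__":
    print(recommend(40))                                  # small one-off file
    print(recommend(250))                                 # over the browser cap
    print(recommend(40, needs_streaming=True))            # live capture
    print(recommend(800, needs_backend_automation=True))  # over both caps
```

A file between 100 MB and 500 MB could also be compressed to fit the browser cap; the sketch simply routes it to the batch API because that avoids the extra step.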
Try Grok STT in the browser first. If the transcript is good enough, then decide whether you need the raw API at all.
xAI Grok STT costs $0.10 per hour of audio for batch transcription and $0.20 per hour for real-time streaming.
The xAI batch API supports up to 500 MB per request. ScribeForge currently supports up to 100 MB per browser upload.
Yes, you can use Grok STT without an API key. ScribeForge lets you run it in the browser without handling xAI auth, API keys, or multipart upload code yourself.
Yes, there is a free option. ScribeForge offers 2 free previews per day with no account required.
Use the raw API if you need custom integration, streaming, or larger uploads. Use ScribeForge if you mainly need fast transcript output in the browser.