Use case · Updated April 29, 2026

How to Transcribe a Discord Call — Free

Discord doesn't record voice channels natively. Here's the cleanest workflow: Craig bot for recording, ScribeForge for transcription. End-to-end, free.

Yes. Any Discord voice call can be transcribed, but Discord itself doesn't record. The standard combination is Craig (a free recording bot) for capture, then ScribeForge for transcription. Craig captures multi-track audio, one file per speaker, and ScribeForge transcribes each track in 10–30 seconds. The result is a cleaner transcript workflow with less reliance on automatic speaker attribution.

Heads up — recording laws vary by jurisdiction. In two-party-consent regions (most of the EU, several US states), every participant must agree before recording. Announce the recording when Craig joins.

Why Craig + ScribeForge beats single-file recording

The naive approach is to record with OBS Studio or a system-audio recorder, which gives you one mixed audio file with all speakers overlapping. Grok STT can try to separate the speakers with labels, but review gets harder on noisy multi-speaker mixes.

Craig records each speaker on a separate track. You get one FLAC per Discord username. Transcribe each FLAC individually, then merge by timestamp at the end. That workflow is usually easier to review than relying on automatic speaker attribution in a mixed file.

The 5-step workflow

  1. Add Craig: visit craig.chat and invite the bot to your server. Admin permissions required. The free tier is enough for casual calls.
  2. Start recording: while in your voice channel, run Craig's /join slash command (older guides show the legacy text command :craig:, join). Craig joins as a participant and starts capturing. Announce the recording to all participants.
  3. End recording: run /stop (legacy form :craig:, leave) or end the call. Craig DMs the recording host a download link.
  4. Pick "Multiple FLAC tracks": one file per speaker, lossless. Each filename includes the Discord username, so you already know who is who.
  5. Transcribe each track: drop each .flac into ScribeForge. The result is a clean per-speaker transcript. Combine the transcripts by timestamp at the end.
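
Because each track's filename carries the Discord username, the transcript filenames used later (speaker-{username}.json) can be derived mechanically. The sketch below is a minimal illustration, not Craig's documented behavior: it assumes a hypothetical track naming pattern of "<track number>-<username>.flac", and the helper name transcript_name is made up for this example. Check your own download and adjust the pattern if it differs.

```python
import re
from pathlib import Path

# ASSUMPTION: tracks are named "<track number>-<username>.flac", e.g. "1-alice.flac".
# This pattern is illustrative; verify it against your own Craig download.
TRACK_PATTERN = re.compile(r"^(?P<index>\d+)-(?P<username>.+)$")


def transcript_name(flac_path: str) -> str:
    """Map a per-speaker track filename to speaker-<username>.json."""
    stem = Path(flac_path).stem  # "1-alice.flac" -> "1-alice"
    match = TRACK_PATTERN.match(stem)
    # Fall back to the whole stem when the assumed pattern does not match.
    username = match.group("username") if match else stem
    return f"speaker-{username}.json"
```

For example, transcript_name("1-alice.flac") gives "speaker-alice.json", which matches the naming used when combining transcripts.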

Compress FLAC before upload (100 MB cap)

FLAC is lossless and large. A 1-hour call produces 30–100 MB per speaker, sometimes more. Convert to OGG Opus at 32 kbps mono first if the file is near or over the cap:

ffmpeg -i speaker.flac -c:a libopus -b:a 32k -ac 1 speaker.ogg

32 kbps Opus mono is usually enough for speech transcription while keeping files much smaller. The result is typically 5–10× smaller and fits more easily under the cap.
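
With several speakers it can be convenient to script the conversion. The following is a minimal sketch that wraps the same ffmpeg arguments shown above and only converts files that are close to the cap. The helper names, the 90% "near the cap" threshold, and the reading of "100 MB" as decimal megabytes are assumptions for illustration; ffmpeg must be installed and on your PATH for the conversion step to run.

```python
import subprocess
from pathlib import Path

# ASSUMPTION: the 100 MB upload cap is read as decimal megabytes.
CAP_BYTES = 100 * 1000 * 1000


def opus_command(src: Path, dst: Path, kbps: int = 32) -> list[str]:
    """Build the ffmpeg invocation from the text: Opus codec, mono, target bitrate."""
    return ["ffmpeg", "-i", str(src), "-c:a", "libopus",
            "-b:a", f"{kbps}k", "-ac", "1", str(dst)]


def estimated_opus_mb(duration_s: float, kbps: int = 32) -> float:
    """Rough size at the target bitrate: kbit/s * seconds / 8 = kB, / 1000 = MB."""
    return kbps * duration_s / 8 / 1000


def compress_if_needed(src: Path, threshold: float = 0.9) -> Path:
    """Return the file to upload, converting only when src is near or over the cap.

    threshold=0.9 (hypothetical) treats anything above 90% of the cap as "near".
    """
    if src.stat().st_size < CAP_BYTES * threshold:
        return src  # small enough, upload the FLAC as-is
    dst = src.with_suffix(".ogg")
    subprocess.run(opus_command(src, dst), check=True)
    return dst
```

By this estimate, a one-hour track at 32 kbps comes to roughly 14.4 MB. Actual Opus output varies with the encoder's bitrate mode and how much of the track is silence.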

Combining per-speaker transcripts

Each ScribeForge transcript includes phrase-level timestamps (start, end, text). For a unified call transcript:

  1. Save each transcript as speaker-{username}.json (download includes timestamps).
  2. Merge into a single timeline by start-time, prefixing each line with the speaker name.
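  3. A short Python script handles the merge:
import glob
import json

events = []
for fn in sorted(glob.glob("speaker-*.json")):
    # "speaker-alice.json" -> "alice"
    speaker = fn.removeprefix("speaker-").removesuffix(".json")
    with open(fn, encoding="utf-8") as f:
        segments = json.load(f)["segments"]
    for seg in segments:
        events.append((seg["start"], speaker, seg["text"]))

# Sort by start time so the speakers interleave in call order.
events.sort()
for t, who, text in events:
    print(f"[{t:7.2f}] {who}: {text}")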
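
To see the merge on concrete data, here is a small self-contained sketch of the same idea that works on in-memory transcripts and prints clock-style timestamps, which are easier to scan on long calls. The function names and the sample segments are invented for illustration; the segment shape (start, end, text) follows the description above.

```python
def clock(seconds: float) -> str:
    """Format seconds as H:MM:SS."""
    total = int(seconds)
    return f"{total // 3600}:{total % 3600 // 60:02d}:{total % 60:02d}"


def merge_transcripts(transcripts: dict[str, list[dict]]) -> list[str]:
    """Interleave per-speaker segments into one timeline ordered by start time."""
    events = [
        (seg["start"], speaker, seg["text"])
        for speaker, segments in transcripts.items()
        for seg in segments
    ]
    events.sort(key=lambda e: e[0])  # stable sort: ties keep input order
    return [f"[{clock(start)}] {who}: {text}" for start, who, text in events]


# Made-up sample data in the (start, end, text) shape.
sample = {
    "alice": [
        {"start": 0.0, "end": 2.1, "text": "Can everyone hear me?"},
        {"start": 65.4, "end": 67.0, "text": "Great, let's start."},
    ],
    "bob": [
        {"start": 2.5, "end": 3.2, "text": "Yep, loud and clear."},
    ],
}

for line in merge_transcripts(sample):
    print(line)
# [0:00:00] alice: Can everyone hear me?
# [0:00:02] bob: Yep, loud and clear.
# [0:01:05] alice: Great, let's start.
```

Sorting on the start time alone keeps segments that begin at the same instant in their input order instead of falling back to alphabetical speaker order.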

Common questions

Are there alternatives to Craig?

OBS Studio captures system audio plus your mic to a single file. That loses per-speaker separation, so diarization becomes Grok's job (it works, just less cleanly). Browser-based Discord clients can also be captured with browser extensions like "Voice Recorder", with the same caveats.

What's the legal answer on consent?

In two-party-consent jurisdictions (most of the EU, US states like California, Florida, Pennsylvania, …), all participants must consent. In one-party-consent jurisdictions, the consent of a single participant, typically the person recording, is enough. Best practice in either case: announce the recording when Craig joins. ScribeForge does not record; it only transcribes whatever audio you provide.

How accurate is Grok on Discord audio?

Discord ships voice over Opus, which is usually workable for transcription. Per-speaker tracks from Craig are generally easier to review than a mixed track, while overlapping multi-speaker audio is harder regardless of model. For important material, review the transcript before relying on it.

Is the audio stored on ScribeForge?

No. The audio is sent to xAI's Grok STT API for processing and deleted immediately. The transcript exists only in your browser session.

Drop a Discord call recording — get a clean transcript.

Transcribe Discord call free →

No account · No credit card · 2 free uses/day per IP
