Audio Transcription with Timestamps — Free Online Tool (2026)

By ScribeForge · April 20, 2026 · 5 min read

A plain text transcript is useful. A transcript with timestamps is far more useful: you can jump directly to any moment in the audio, create a linked table of contents, generate subtitles, and extract quotes with a verifiable source time. ScribeForge returns phrase-level timestamps for every transcription, free.

What the output looks like

ScribeForge returns segments — short phrases grouped by natural speech pauses — each with a start and end time in seconds:

[0.18 → 2.03] Good morning everyone, thanks for joining.
[2.45 → 5.90] Today we're going to cover the Q1 results
[6.10 → 9.75] and discuss what they mean for our roadmap this quarter.
[10.20 → 14.30] Let's start with revenue. We hit $2.4 million,
[14.35 → 17.60] which is 12% above our forecast.

Segments are split on natural pauses (silence longer than 0.4 seconds) or at sentence-ending punctuation. This keeps the timing granular enough to be useful without fragmenting phrases mid-sentence.
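As a rough illustration of that splitting rule, here is a minimal Python sketch that groups word-level timestamps into phrase segments. It is not ScribeForge's actual implementation: the function names are hypothetical, the word dictionaries are assumed to carry "text", "start", and "end" keys, and the real service may apply additional rules (such as a maximum segment length) that are not modeled here.

```python
PAUSE_THRESHOLD = 0.4            # seconds of silence that closes a segment (from the rule above)
SENTENCE_END = (".", "?", "!")   # assumed set of sentence-ending punctuation


def _close(words):
    """Collapse a run of words into one segment dict."""
    return {
        "start": words[0]["start"],
        "end": words[-1]["end"],
        "text": " ".join(w["text"] for w in words),
    }


def group_words(words, pause=PAUSE_THRESHOLD):
    """Group word dicts into segments, splitting on long pauses or sentence ends."""
    segments = []
    current = []
    for word in words:
        if current:
            gap = word["start"] - current[-1]["end"]
            ended_sentence = current[-1]["text"].endswith(SENTENCE_END)
            if gap > pause or ended_sentence:
                segments.append(_close(current))
                current = []
        current.append(word)
    if current:
        segments.append(_close(current))
    return segments
```

Each returned segment keeps the start time of its first word and the end time of its last word, which mirrors the [start → end] pairs in the sample output.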

How to get timestamped transcripts

1. Go to scribeforge.tech. No account is needed.
2. Upload your audio file (MP3, WAV, M4A, OGG, FLAC, WEBM, or MP4, up to 100 MB).
3. Click Transcribe. Results include the full text plus timestamped segments.

The segments are shown in the UI and, if you use the API, included in the JSON response.

Use cases for timestamped transcripts

Interview transcription

When you're transcribing a journalistic or research interview, timestamps let you find the exact quote later. Instead of scrubbing through audio, search the text for a keyword and jump to that second. Great for fact-checking and attribution.
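The keyword lookup described above can be sketched in a few lines of Python. This assumes you have the segments as a list of dictionaries with "start", "end", and "text" keys; that shape and the function name find_quote are illustrative assumptions, not a documented ScribeForge format.

```python
def find_quote(segments, keyword):
    """Return (start_seconds, text) for every segment containing the keyword."""
    needle = keyword.casefold()  # case-insensitive match
    return [
        (seg["start"], seg["text"])
        for seg in segments
        if needle in seg["text"].casefold()
    ]


segments = [
    {"start": 2.45, "end": 5.90, "text": "Today we're going to cover the Q1 results"},
    {"start": 10.20, "end": 14.30, "text": "Let's start with revenue. We hit $2.4 million,"},
]

for start, text in find_quote(segments, "revenue"):
    print(f"{start:.2f}s: {text}")
```

The returned start time is the position to seek to in your audio player when verifying the quote.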

Podcast show notes

Use timestamps to create a chapter list for your episode, with entries such as "00:04:32 - Guest explains their background" or "00:21:07 - Main argument on climate policy." Listeners can jump to the part they care about, and the chapter text can also help with SEO because search engines can index it.
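Since segment times are given in seconds, a chapter list needs them converted to HH:MM:SS. A minimal Python sketch follows; the function names and the (start, title) pair format are assumptions for illustration, and the chapter titles are still something you write yourself.

```python
def format_timestamp(seconds):
    """Convert a time in seconds to an HH:MM:SS string, dropping fractions."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def chapter_list(chapters):
    """Render (start_seconds, title) pairs as one chapter entry per line."""
    return "\n".join(
        f"{format_timestamp(start)} - {title}" for start, title in chapters
    )


print(chapter_list([
    (272.0, "Guest explains their background"),
    (1267.0, "Main argument on climate policy"),
]))
# 00:04:32 - Guest explains their background
# 00:21:07 - Main argument on climate policy
```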

Video subtitle creation

The segment timestamps are close enough to subtitle timecodes that you can adapt them into an SRT file manually or with a script. Each segment becomes one subtitle entry. Exact timing may need minor adjustment, but you start from a draft that is roughly 90% of the way there instead of from zero.
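A script for that conversion could look like the following Python sketch. It assumes segments are dictionaries with "start", "end", and "text" keys (an illustrative shape, not a documented ScribeForge format) and writes the standard SRT layout: a counter, a "HH:MM:SS,mmm --> HH:MM:SS,mmm" timecode line, the text, and a blank line.

```python
def srt_time(seconds):
    """Format seconds as an SRT timecode: HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Turn each segment into one numbered SRT subtitle entry."""
    entries = []
    for index, seg in enumerate(segments, start=1):
        timecode = f"{srt_time(seg['start'])} --> {srt_time(seg['end'])}"
        entries.append(f"{index}\n{timecode}\n{seg['text']}\n")
    return "\n".join(entries)


segments = [
    {"start": 0.18, "end": 2.03, "text": "Good morning everyone, thanks for joining."},
    {"start": 2.45, "end": 5.90, "text": "Today we're going to cover the Q1 results"},
]
print(to_srt(segments))
```

Long segments may still need to be split by hand, since this sketch does not wrap or shorten subtitle lines.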

Meeting minutes

For recorded meetings, timestamps pin down when something was said. Combined with segment text that names the speaker, they let you record who said what and when, for example: "At 14:32, the product team confirmed the API deadline."

Legal and compliance transcription

Depositions, recorded calls, and arbitration recordings need verifiable timestamps for evidentiary purposes. The segment timestamps provide a reference that maps directly back to the original audio file's position.

Word-level timestamps via the API

The web interface shows phrase-level timestamps. The underlying xAI Grok STT API (which ScribeForge uses) also returns word-level timestamps — every single word with its own start and end time. These are available if you call the API directly:

"words": [
  {"text": "Good",     "start": 0.18, "end": 0.44},
  {"text": "morning",  "start": 0.44, "end": 0.81},
  {"text": "everyone", "start": 0.81, "end": 1.40},
  ...
]

Word-level timestamps are useful for karaoke-style highlighting, precise subtitle sync, or when you need to cut the audio at exact word boundaries. See the developer guide for full API documentation.
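For karaoke-style highlighting, the core operation is finding which word is being spoken at the current playback time. Here is a minimal Python sketch using the "text", "start", and "end" fields from the sample response; the function name word_at is hypothetical, and the sketch assumes the words are sorted by start time and do not overlap.

```python
from bisect import bisect_right


def word_at(words, t):
    """Return the text of the word spoken at time t, or None during silence."""
    starts = [w["start"] for w in words]
    i = bisect_right(starts, t) - 1  # last word that starts at or before t
    if i >= 0 and t < words[i]["end"]:
        return words[i]["text"]
    return None


words = [
    {"text": "Good", "start": 0.18, "end": 0.44},
    {"text": "morning", "start": 0.44, "end": 0.81},
    {"text": "everyone", "start": 0.81, "end": 1.40},
]

print(word_at(words, 0.50))  # morning
print(word_at(words, 1.50))  # None
```

A player would call this on every time update and highlight the returned word. For long transcripts, building the list of start times once rather than on every call avoids repeated work.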

How accurate are the timestamps?

The timestamps come directly from the word-level alignment in xAI's Grok STT, which is trained on forced-alignment data. In practice, phrase-level timestamps are accurate to within 0.2 seconds. Word-level timestamps are less reliable on fast speech or heavy accents but still fall within 0.5 seconds in most cases.

Comparison with other tools

Tool	Timestamp level	Free tier	Account required
ScribeForge	Phrase + word (API)	Free preview	No
Otter.ai	Word	600 min/month	Yes
Whisper (local)	Word + segment	Unlimited	No (runs locally; GPU recommended)
Descript	Word	1 hr/month	Yes
Rev	Word	No	Yes (paid only)