// the find
davabase/whisper_real_time
Real time transcription with OpenAI Whisper.
A single-file demo that does real-time speech transcription by continuously recording audio chunks and feeding them to OpenAI's Whisper model. It's a proof of concept, not a library — useful for understanding the approach, not for building on top of.
The core trick — buffering raw audio bytes across recording windows to avoid choppy sentence splits — is genuinely smart for a demo. Public domain license means zero friction for any use. Simple enough to read in five minutes and actually understand what's happening. Decent starting point if you want to roll your own without a framework.
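To make the buffering idea concrete, here is a minimal sketch of it, not the repository's actual code. The names `PhraseBuffer`, `update_transcript`, and the 3-second `phrase_timeout` are hypothetical, and `transcribe` is a stand-in for whatever model call you use. Each new chunk is appended to the bytes already collected for the current phrase and the whole buffer is transcribed again, so the last transcript line is revised rather than split at each window boundary. A long enough gap starts a fresh phrase.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class PhraseBuffer:
    """Accumulates raw audio bytes until a gap longer than phrase_timeout."""

    phrase_timeout: float = 3.0  # seconds of gap that ends a phrase (assumed value)
    _buffer: bytearray = field(default_factory=bytearray)
    _last_chunk_time: Optional[float] = None

    def feed(self, chunk: bytes, now: float) -> bool:
        """Add a chunk; return True if this chunk started a new phrase."""
        new_phrase = (
            self._last_chunk_time is None
            or now - self._last_chunk_time > self.phrase_timeout
        )
        if new_phrase:
            # The previous phrase is finished, so drop its audio.
            self._buffer.clear()
        self._buffer.extend(chunk)
        self._last_chunk_time = now
        return new_phrase

    def audio(self) -> bytes:
        """All bytes collected for the current phrase."""
        return bytes(self._buffer)


def update_transcript(
    lines: List[str],
    buf: PhraseBuffer,
    chunk: bytes,
    now: float,
    transcribe: Callable[[bytes], str],  # hypothetical stand-in for the model call
) -> List[str]:
    """Re-transcribe the whole phrase and append or overwrite the last line."""
    new_phrase = buf.feed(chunk, now)
    text = transcribe(buf.audio())
    if new_phrase or not lines:
        lines.append(text)
    else:
        lines[-1] = text
    return lines
```

The cost of this design is that every chunk triggers a re-transcription of the whole phrase, so work grows with phrase length. That is tolerable for a demo with short phrases, less so for long uninterrupted speech.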
One Python file, no tests, no packaging, no releases: this is a weekend hack that hasn't been touched seriously in years. Local Whisper inference is slow on CPU and the demo doesn't address this; on anything without a GPU you'll get noticeable lag that defeats the 'real time' label. There is no VAD (voice activity detection), so it transcribes silence and background noise constantly. The 2938 stars are mostly from people who starred it the week it came out; the project has no real maintenance story.
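For a sense of what even a crude gate against silence looks like, here is a sketch of an energy threshold check. This is not real voice activity detection and is not part of the demo. It assumes 16-bit little-endian mono PCM, and the function names and the threshold of 500 are illustrative guesses that would need tuning per microphone.

```python
import array
import math
import sys


def rms_int16(pcm: bytes) -> float:
    """Root-mean-square amplitude of 16-bit little-endian PCM audio."""
    usable = len(pcm) - (len(pcm) % 2)  # ignore a trailing odd byte
    samples = array.array("h")
    samples.frombytes(pcm[:usable])
    if sys.byteorder == "big":
        samples.byteswap()  # input is assumed little-endian
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_probably_speech(pcm: bytes, threshold: float = 500.0) -> bool:
    """Crude gate: True when the chunk is louder than the assumed threshold."""
    return rms_int16(pcm) >= threshold
```

Skipping chunks that fail this check before they reach the model avoids transcribing dead air, but steady background noise above the threshold will still get through. A trained VAD model handles that case better.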