andrey_vasenin1

This is how to generate perfect transcripts for audio podcasts for free
Today I tried a Python package called Whisper to transcribe a 30-minute German podcast from SPIEGEL.
The package is developed by OpenAI, the same company that developed ChatGPT.
To use it, I downloaded the audio, read an article on Whisper, installed a few packages, copied the lines of code from the article (https://towardsdatascience.com/transcribe-audio-files-with-openais-whisper-e973ae348aa7) and launched the script. The script downloads the neural network to your computer and runs it offline on your own hardware. It took just 1.5 hours to transcribe the 30-minute audio on my laptop with an Intel Core i5-1235U, which is about three times the length of the recording. I am very satisfied with the result. What sets this package apart from the other free speech-to-text converters I know of is that it adds punctuation.
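For anyone who wants to try the same thing, here is a minimal sketch of what such a script can look like. It is not necessarily the code from the linked article. It assumes the `openai-whisper` package and `ffmpeg` are installed; the file name `podcast.mp3`, the `medium` model size, and the helper names `format_timestamp`, `format_segments` and `transcribe` are my own placeholders, not something from the post.

```python
def format_timestamp(seconds):
    """Render a time offset in seconds as HH:MM:SS."""
    total = int(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_segments(segments):
    """Turn Whisper-style segments (dicts with 'start' and 'text') into timestamped lines."""
    lines = []
    for seg in segments:
        lines.append(f"[{format_timestamp(seg['start'])}] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe(audio_path, model_name="medium", language="de"):
    """Run Whisper locally; the model weights are downloaded on first use."""
    # Imported here so the helpers above work without the package installed.
    import whisper  # pip install openai-whisper (also needs ffmpeg on PATH)

    model = whisper.load_model(model_name)  # model size is an assumption
    return model.transcribe(audio_path, language=language)


if __name__ == "__main__":
    result = transcribe("podcast.mp3")  # hypothetical file name
    print(result["text"])  # full transcript with punctuation
    print(format_segments(result["segments"]))  # one line per segment
```

Smaller models such as `base` or `small` should run faster on a laptop CPU, probably at some cost in accuracy.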
Below is an excerpt from the beginning of the text I got:
Unser Gehirn hat keine Speicherfunktion für eine Momentaufnahme. Den Moment erleben wir wirklich nur, wenn wir uns darauf konzentrieren. Sonst ist er aus unserem Leben, aus unserer Biografie ein für allemal verschwunden. Und das ist ja tragisch. Ideen für ein besseres Leben haben wir alle. Aber wie setzen wir sie im Alltag um? In diesem Podcast treffen wir jede Woche Menschen, die uns verraten, wie es klappen kann. Willkommen zu smarter leben. Ich bin Lenne Kafka und diesmal spreche ich mit Volker. Hallo, ich bin Volker Kitz. Ich habe ein Buch über Konzentration geschrieben und bin dafür in ein Schweigeseminar in den Himalaya gereist. Wissen Sie noch, was ich eben gerade im Intro erzählt habe? Oder was da für Musik drunter lief?
SUMMARY OF THE BELOW DISCUSSION
- whisper.cpp (the same Whisper model, reimplemented in C/C++ to run faster on CPU). The comment by bamboozled has detailed instructions. Link: https://github.com/ggerganov/whisper.cpp
- freesubtitles.ai (a free, easy-to-use website with a simple interface that runs Whisper on its server for you) Link: https://freesubtitles.ai/
- revoldiv.com (suggested in the comments: "It's free, just upload video or audio to it. No account required. No trial limitations as of yet.") https://revoldiv.com/
- Whisper now has an API, so there is no need to run the models on your own hardware (suggested in the comments). https://openai.com/blog/introducing-chatgpt-and-whisper-apis
- How to run Whisper in a free virtual machine through Google Colab (shared in the comments) https://bytexd.com/how-to-use-whisper-a-free-speech-to-text-ai-tool-by-openai/
- MacWhisper (partially free) https://goodsnooze.gumroad.com/l/macwhisper
- Conformer-1 from AssemblyAI. Free to use, nothing to install, no need to sign up. Might be even better at transcribing than Whisper. You can either give it a YouTube link or upload an audio file. https://www.assemblyai.com/playground/source
- A bunch of other tools https://www.futuretools.io/?tags-n5zn=speech-to-text
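For the Whisper API option in the list above, a call could look roughly like the sketch below. It assumes the official `openai` Python package (version 1 or later) and an `OPENAI_API_KEY` environment variable. The 25 MB upload limit is what I understand the API to enforce, so treat it as an assumption. The file name `podcast.mp3` and the helper names `is_within_upload_limit` and `transcribe_via_api` are placeholders of mine.

```python
import os

# Assumed upload limit of the hosted API; check the current documentation.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def is_within_upload_limit(size_bytes, limit_bytes=MAX_UPLOAD_BYTES):
    """Return True if a file of this size fits under the assumed upload limit."""
    return size_bytes <= limit_bytes


def transcribe_via_api(audio_path, language="de"):
    """Send an audio file to the hosted Whisper model and return the transcript text."""
    if not is_within_upload_limit(os.path.getsize(audio_path)):
        raise ValueError("File is too large; split or compress it first.")

    # Imported here so the size check above works without the package installed.
    from openai import OpenAI  # pip install openai

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(audio_path, "rb") as audio_file:
        response = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            language=language,
        )
    return response.text


if __name__ == "__main__":
    print(transcribe_via_api("podcast.mp3"))  # hypothetical file name
```

Unlike running the model locally, the hosted API is billed per minute of audio.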
GitHub - openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision