Hi everyone!

A few days ago I released Whishper, a new version of a project I’ve been working on for about a year now.

It’s a self-hosted audio transcription suite: you can transcribe audio to text, generate subtitles, translate subtitles, and edit them, all from one UI and 100% locally (it even works offline).

I hope you like it! Check out the website for self-hosting instructions: https://whishper.net

  • webghost0101@sopuli.xyz · 10 months ago

    Does this need to connect to OpenAI, or does it function fully independently? It’s for offline use.

    • pluja@lemmy.world (OP) · 10 months ago

      No, it’s completely independent, it does not rely on any third-party APIs or anything else. It can function entirely offline once the models have been downloaded.

    • pluja@lemmy.world (OP) · 10 months ago

      Whishper uses faster-whisper in the backend.

      Simply put, it is a complete UI for Faster-Whisper with extra features like subtitle translation, editing, download options, etc.
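For anyone curious what using the Faster-Whisper backend looks like, here is a minimal transcription sketch. This is illustrative only, not Whishper's actual code; the function name and defaults are assumptions, while the `WhisperModel`/`transcribe` calls follow the faster-whisper library's documented API:

```python
def transcribe_file(audio_path: str, model_size: str = "small") -> list[str]:
    """Transcribe an audio file with faster-whisper.

    The import is deferred so the dependency is only required when the
    function is actually called (pip install faster-whisper).
    """
    from faster_whisper import WhisperModel

    # int8 on CPU keeps memory use low; use device="cuda" if a GPU is available
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(audio_path, beam_size=5)
    # Each segment carries start/end timestamps, which is what makes
    # subtitle generation straightforward
    return [f"[{seg.start:.2f} -> {seg.end:.2f}] {seg.text.strip()}"
            for seg in segments]
```

The timestamped segments are also what a UI layer can expose for editing, as Whishper does.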

  • ares35@kbin.social · 10 months ago

    how does whisper do transcribing technical content, like for lawyers, doctors, engineers and whatnot? or speakers with heavy accents?

  • Axiochus@lemmy.world · 10 months ago

    Oh, awesome! Does it do speaker detection? That’s been one of my main gripes with Whisper.

    • pluja@lemmy.world (OP) · edited · 10 months ago

      Unfortunately, not yet. Whisper itself is not able to do that. There are currently few viable solutions for integration, and I’m looking at this one, but all the solutions I know about need a GPU.

      • jherazob@kbin.social · 10 months ago

        VERY understandable, requiring a GPU would limit its application and spread. I hope a good GPU-less solution is found eventually.

  • micha@lemmy.sdf.org · 10 months ago

    Congratulations on the launch and thanks for making this open-source! Not sure if this supports searching through all transcriptions yet, but that’s what I’d find really helpful. E.g. search for a keyword in all podcast episodes.

    • pluja@lemmy.world (OP) · 10 months ago

      That’s a great idea! I’ll attempt to implement that feature when I find some time to work on it.
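A minimal version of that search could be as simple as scanning saved transcription files for a keyword. The directory layout and `.txt` extension below are hypothetical (Whishper doesn't ship this feature yet); it's just a sketch of the idea:

```python
from pathlib import Path


def search_transcriptions(root: str, keyword: str) -> list[tuple[str, str]]:
    """Case-insensitive keyword search across all .txt transcriptions
    under `root`; returns (filename, matching line) pairs."""
    hits = []
    for path in sorted(Path(root).rglob("*.txt")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if keyword.lower() in line.lower():
                hits.append((path.name, line.strip()))
    return hits
```

For a large podcast archive, a real implementation would more likely index transcriptions in a database or a full-text search engine rather than re-reading every file per query.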

  • Morethanevil@lemmy.fedifriends.social · 10 months ago

    I saw your project on Codeberg before, back when it was called Whisper+. After the switch to Whisper+ it stopped working for me: I uploaded a file and the transcription never started. The old version worked. I haven’t tried Whisper+ again in months.

    Maybe I’ll give it another try. Can I use bind mounts, or are there special permissions? Anyway, thanks for your work.

    • pluja@lemmy.world (OP) · 10 months ago

      Whisper+ had some problems; that’s why I rewrote everything. This new version should fix almost everything (though there may be some bugs I haven’t found yet).

      If you take a look at the docker-compose file, you’ll see it is already using bind mounts. The only special permission needed is for the LibreTranslate models folder, which runs as non-root with user 1032.
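As an illustration, the relevant part of such a compose file could look like the fragment below. The image tag, container path, and host directory are placeholders, not Whishper's actual configuration (check the real docker-compose file on the website); only the uid 1032 detail comes from the discussion above:

```yaml
services:
  libretranslate:
    image: libretranslate/libretranslate:latest
    # Runs as non-root; the bind-mounted models dir must be writable by uid 1032
    user: "1032:1032"
    volumes:
      # Bind mount so downloaded translation models persist across restarts
      - ./libretranslate/models:/home/libretranslate/models
```

Before the first start you would then make the host directory writable for that uid, e.g. `chown -R 1032:1032 ./libretranslate/models`.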

  • orizuru@lemmy.sdf.org · 10 months ago

    Congrats, and thank you for releasing this!

    Maybe there’s a couple of personal projects I could use it for…