Hi everyone!

A few days ago I released Whishper, a new version of a project I’ve been working on for about a year now.

It’s a self-hosted audio transcription suite: you can transcribe audio to text, generate subtitles, translate subtitles, and edit them, all from one UI and 100% locally (it even works offline).

I hope you like it! Check out the website for self-hosting instructions: https://whishper.net

  • webghost0101@sopuli.xyz · 10 months ago

    Does this need to connect to OpenAI, or does it function fully independently? It’s for offline use.

    • pluja@lemmy.world (OP) · 10 months ago

      No, it’s completely independent, it does not rely on any third-party APIs or anything else. It can function entirely offline once the models have been downloaded.

    • pluja@lemmy.world (OP) · 10 months ago

      Whishper uses faster-whisper in the backend.

      Simply put, it is a complete UI for Faster-Whisper with extra features like subtitle translation, editing, download options, etc.
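For anyone curious what using the Faster-Whisper backend looks like, here is a minimal transcription sketch. This is illustrative only, not Whishper's actual code; the function name and defaults are assumptions, while the `WhisperModel`/`transcribe` calls follow the faster-whisper library's documented API:

```python
def transcribe_file(audio_path: str, model_size: str = "small") -> list[str]:
    """Transcribe an audio file with faster-whisper.

    The import is deferred so the dependency is only required when the
    function is actually called (pip install faster-whisper).
    """
    from faster_whisper import WhisperModel

    # int8 on CPU keeps memory use low; use device="cuda" if a GPU is available
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(audio_path, beam_size=5)
    # Each segment carries start/end timestamps, which is what makes
    # subtitle generation straightforward
    return [f"[{seg.start:.2f} -> {seg.end:.2f}] {seg.text.strip()}"
            for seg in segments]
```

The timestamped segments are also what a UI layer can expose for editing, as Whishper does.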

  • ares35@kbin.social · 10 months ago

    how does whisper do transcribing technical content, like for lawyers, doctors, engineers and whatnot? or speakers with heavy accents?

  • Axiochus@lemmy.world · 10 months ago

    Oh, awesome! Does it do speaker detection? That’s been one of my main gripes with Whisper.

    • pluja@lemmy.world (OP) · edited · 10 months ago

      Unfortunately, not yet. Whisper itself is not able to do that. There are currently few viable solutions for integration, and I’m looking at this one, but all the solutions I know about need a GPU.

      • jherazob@kbin.social · 10 months ago

        VERY understandable, requiring a GPU would limit its application and spread. I hope a good GPU-less solution is found eventually.

  • micha@lemmy.sdf.org · 10 months ago

    Congratulations on the launch and thanks for making this open-source! Not sure if this supports searching through all transcriptions yet, but that’s what I’d find really helpful. E.g. search for a keyword in all podcast episodes.

    • pluja@lemmy.world (OP) · 10 months ago

      That’s a great idea! I’ll attempt to implement that feature when I find some time to work on it.
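A minimal version of that search could be as simple as scanning saved transcription files for a keyword. The directory layout and `.txt` extension below are hypothetical (Whishper doesn't ship this feature yet); it's just a sketch of the idea:

```python
from pathlib import Path


def search_transcriptions(root: str, keyword: str) -> list[tuple[str, str]]:
    """Case-insensitive keyword search across all .txt transcriptions
    under `root`; returns (filename, matching line) pairs."""
    hits = []
    for path in sorted(Path(root).rglob("*.txt")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if keyword.lower() in line.lower():
                hits.append((path.name, line.strip()))
    return hits
```

For a large podcast archive, a real implementation would more likely index transcriptions in a database or a full-text search engine rather than re-reading every file per query.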

  • Morethanevil@lemmy.fedifriends.social · 10 months ago

    I saw your project on Codeberg before, back when it was called Whisper+. After the switch to Whisper+ it stopped working for me: I uploaded a file and the transcription never started. The old version worked. I haven’t tried Whisper+ again in months.

    Maybe I’ll give it another try. Can I use bind mounts, or are there special permissions? Anyway, thanks for your work.

    • pluja@lemmy.world (OP) · 10 months ago

      Whisper+ had some problems; that’s why I rewrote everything. This new version should fix almost everything (though there may be some bugs I haven’t found yet).

      If you take a look at the docker-compose file, you’ll see it is already using bind mounts. The only special permission needed is for the LibreTranslate models folder, which runs as non-root with user 1032.
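As an illustration, the relevant part of such a compose file could look like the fragment below. The image tag, container path, and host directory are placeholders, not Whishper's actual configuration (check the real docker-compose file on the website); only the uid 1032 detail comes from the discussion above:

```yaml
services:
  libretranslate:
    image: libretranslate/libretranslate:latest
    # Runs as non-root; the bind-mounted models dir must be writable by uid 1032
    user: "1032:1032"
    volumes:
      # Bind mount so downloaded translation models persist across restarts
      - ./libretranslate/models:/home/libretranslate/models
```

Before the first start you would then make the host directory writable for that uid, e.g. `chown -R 1032:1032 ./libretranslate/models`.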

  • orizuru@lemmy.sdf.org · 10 months ago

    Congrats, and thank you for releasing this!

    Maybe there’s a couple of personal projects I could use it for…