Do you host your own ML / AI / LLM? What do you use, and what do you use it for?

  • SuspiciousCarrot78@aussie.zone (OP)
    10 days ago

    How ancient is ancient? TTS and STT are much lighter than an LLM… you might have more capability than you think, especially if you’re doing batch processing like that.

    • hexagonwin@lemmy.today
      10 days ago

      A Haswell Xeon E5-1650 machine. I remember running Llama 7B in llama.cpp in like 2023 and it was quite sluggish. Guess I should try Whisper at some point…

      • SuspiciousCarrot78@aussie.zone (OP)
        10 days ago

        Ha. You were doing CPU inference on a Haswell-era chip. Been there, done that.

        OTOH, whisper.cpp is heavily optimised for CPU inference.

        Plus, you’re doing batch transcription, not real-time, so being slow doesn’t actually matter.

        Fire off Whisper small or medium overnight and wake up to searchable text.
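
A rough sketch of what that overnight run could look like, in case it helps. Assumptions on my part: whisper.cpp is already built and its `whisper-cli` binary is on your PATH (older builds call it `main`), the audio is already in WAV, and the model file is something like `ggml-small.bin`. The folder layout, the function names and the thread count of 6 are made up for illustration, so adjust to taste.

```python
import subprocess
import sys
from pathlib import Path


def build_jobs(audio_dir, model_path, cli="whisper-cli", threads=6):
    """Return one whisper.cpp command per .wav file that has no transcript yet."""
    jobs = []
    for wav in sorted(Path(audio_dir).glob("*.wav")):
        out_base = wav.parent / wav.stem            # whisper.cpp appends .txt itself
        transcript = wav.parent / (wav.stem + ".txt")
        if transcript.exists():
            continue                                # already done, skip on re-runs
        jobs.append([
            cli,
            "-m", str(model_path),                  # model file (assumed name/location)
            "-f", str(wav),                         # input audio
            "-t", str(threads),                     # CPU threads (assumed value)
            "-otxt",                                # write a plain-text transcript
            "-of", str(out_base),                   # output path without extension
        ])
    return jobs


def run_jobs(jobs):
    """Run the commands one after another; report failures and keep going."""
    for cmd in jobs:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"failed: {' '.join(cmd)}", file=sys.stderr)


if __name__ == "__main__":
    # usage: python transcribe_batch.py <audio_dir> <model_path>
    run_jobs(build_jobs(sys.argv[1], sys.argv[2]))
```

Because it skips anything that already has a `.txt` next to it, you can kill it in the morning and restart it the next night without redoing work. After that, plain `grep -ri` over the folder is your search.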

        PS: if you want a good, fast little LLM, something like Qwen 3.5 2B will work well on the Xeon.