Running local models is good now

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 3 hours ago

Running local models is good now

pimat@feddit.org · 2 hours ago

Stopped reading at 64GB of RAM in a M2 Mac… That’s not a real world example to start.

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 1 hour ago

You can run the Gemma 4 and Qwen3.5 MoE models with as little as 12 GB of VRAM at 30-40 tps (Q4/Q5), and they both blow GPT-4o and DeepSeek R1 out of the water. But 64gb RAM is also not really out of scale with the cost of a shop tool in other trades. If you’re a professional that’s confident in a positive return on the investment, or just a hobbyist with the luxury budget for a “shop” that cost is well within consumer market. That’s not everybody, of course, but it’s not some inconceivable fantasy.

The key point is that local models continue to get more efficient and usable. You need high end consumer grade hardware today, but given how fast improvements are happening, it’s entirely likely that you’ll be able to get the same capability on even smaller hardware in a few months.