The circle of life

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 1 day ago

dragnucs@lemmy.ml · 1 day ago

One has to try running an LLM at home to know how expensive they are. I tried running one myself and figured out I need more than 128 GiB of RAM and thousands of dollars in graphics cards. I figured out that $5 per month for OpenAI is cheaper, and also understood that they are burning money by providing a free service and a $20 subscription.

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 1 day ago

You should be able to get very decent performance with 128 GB of VRAM running Qwen 3.6 with something like ATLAS (https://github.com/itigges22/ATLAS), especially if you run MTP (https://huggingface.co/unsloth/Qwen3.6-27B-MTP-GGUF).

A friend of mine gets something like 50 tokens a second with it, and output quality is quite decent.

dragnucs@lemmy.ml · 1 day ago

How does it compare to the largest DeepSeek and Claude Opus 4.6? I got used to blazing-fast speed and accurate results. I’m not buying a server and 128 GB of RAM just to run a model similar to GPT-4.

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 1 day ago

ATLAS has some benchmarks in the repo, and it’s comparable to Opus 4.6. You don’t even need 128 GB for that: an 8-bit quantized model will run in around 32 GB and still perform quite well.
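
For a rough sanity check on that memory figure, here is a back-of-the-envelope sketch. It assumes a 27B-parameter model (taken from the name of the GGUF repo linked earlier in the thread) and counts the weights only. The helper name `weights_gb` is made up for illustration, and the KV cache and runtime overhead come on top of this number.

```python
def weights_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and runtime overhead, which add more on top.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Assumption: 27 billion parameters, as suggested by the repo name.
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weights_gb(27, bits):.1f} GB for weights")
```

At 8 bits that comes to about 27 GB for the weights, so a total of around 32 GB is plausible once context and overhead are added.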

OwOarchist@pawb.social · 1 day ago

Yeah, but depending on your location (and usage), you might be burning more than $5/mo in electricity to run that shit. Not to mention the costs of buying all that hardware … especially at current inflated rates.

If you have to buy 128 GB of RAM in 2026, it’s going to be a long time before you come out ahead vs. paying $20/mo for some AI subscription.
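
To put rough numbers on that, here is a minimal break-even sketch. Only the $20/mo subscription price comes from this thread. The hardware cost, power draw, hours of use, and electricity price are made-up placeholders, so plug in your own. The function name `breakeven_months` is hypothetical too.

```python
from typing import Optional


def breakeven_months(
    hardware_cost: float,
    subscription_per_month: float,
    watts: float,
    hours_per_day: float,
    price_per_kwh: float,
) -> Optional[float]:
    """Months until local hardware pays for itself versus a subscription.

    Returns None when electricity alone costs at least as much as the
    subscription, because then the hardware never pays for itself.
    """
    # Monthly electricity cost, assuming a 30-day month.
    electricity_per_month = watts / 1000 * hours_per_day * 30 * price_per_kwh
    monthly_saving = subscription_per_month - electricity_per_month
    if monthly_saving <= 0:
        return None
    return hardware_cost / monthly_saving


# Placeholder inputs (assumptions, not measurements):
# $2000 of hardware, 400 W under load, 4 h/day, $0.15 per kWh.
months = breakeven_months(2000, 20, 400, 4, 0.15)
print("never" if months is None else f"~{months:.0f} months")
```

With those placeholder inputs, electricity is about $7.20/mo and the break-even point lands at roughly 156 months, or around 13 years. Run the same box around the clock at $0.30 per kWh and the function returns None, since the power bill alone exceeds the subscription.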

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 1 day ago

Yeah, that’s true. Depending on the electricity costs, you could be better off with a subscription, especially with DeepSeek, which is incredibly cheap now.