Tell me the name of the person who needs an AI for setting up timers. It is useless as ****!
The irony of voice assistants like Siri and OkGoogle being more efficient before they introduced LLMs.
Okay, so, in case the headline is confusing anyone else, it’s literal. Like, you know how there are those cringe-ass Alexa ads about how it does AI language processing and assistant shit? Yeah, ChatGPT can’t, I guess.
It seems out of scope for the tool, IMHO.
The public fundamentally misunderstands this tech because salesmen lied to them. An LLM is not AI. It just says the most likely thing based on what is most common in its training data for that scenario. It can’t do math or problem-solve. It can only tell you what the most likely answer would be. It can’t actually perform functions. It’s like Family Feud, where it says what the most people surveyed said.
I explain it as asking 100 people to Google something and taking the most common answer.
Yeah, that’s basically exactly what family feud does.
Yep but instead of “name something a woman keeps in her purse” it’s “write my legal document” or “is it ok to lick a lamp socket”
Great question! The answer to all three of your queries is “yes.” Would you like me to search for the nearest lamp socket?
Some of them will “do math”, but not with the LLM predictor; they have a math engine, and the predictor decides when to use it. What’s great is that when it outputs results, it’s not clear if it engaged the math engine or just guessed.
when it outputs results, it’s not clear if it engaged the math engine or just guessed
That depends on the harness though. In the plain model output it will be clear if a tool call happened, and it depends on the application UI around it whether that’s directly shown to the user, or if you only see the LLM’s final response based on it.
In all the UIs I have seen, not even one will tell you that it called the math engine. Maybe it does happen with “thinking” models, but I never tried.
Edit: I tried with DeepSeek. I don’t have enough math knowledge to do crazy stuff, so I did an addition lmao

Quick note on terminology: there’s no such thing as a “math engine”. Most models have the ability to run custom computer code in some way as one of the “tools” they have available, and that’s what’s used if a model decides to offload the calculation rather than answer directly.
This is what that looks like in Claude Code:
[screenshot: Claude Code session answering three math questions]
Notice the lines starting with a green dot and the text Bash(python3...). Those are the model calling the “Bash” tool to run Python code to answer the second and third questions. The first question it answered (correctly, btw) without doing any tool call - that’s just the LLM itself getting it right in a straight shot, similar to DeepSeek in your example. Current models are actually good enough to generally get this kind of simple math correct on their own. I still wouldn’t want to rely on that, but I’m not surprised it got it right without any tool calls.

So I tested my more complex calculations against DeepSeek, and it seems like (at least in the web UI) it doesn’t have any access to a math or code-running tool. It just starts working through it in verbose text, basically explaining to itself how to do manual addition like you learn in school, and then doing that. Incredibly wasteful, but it did actually arrive at the correct answers.
Gemini is the only web-based AI app I thought to test right now that seems to have access to a code running tool, here’s what that looks like:
[screenshot: Gemini response with a “Show code” button in the top right]
It’s hidden by default, but you can click on “Show code” in the top right to see what it did.
This is what I mean when I say the harness matters. The models are all pretty similar, but the app you’re using to interact with them determines what tools are even made available to the LLM in the first place, and whether/how you’re shown when it calls those tools.
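If anyone’s curious what that harness loop actually does, here’s a minimal sketch in Python. To be clear, this is not any real vendor’s API - call_model and run_python are stand-ins I made up for the model endpoint and the code-running tool - but the control flow is the gist of it:

```python
import subprocess

# Hypothetical stand-in for a real model endpoint: returns either a
# final answer or a request to call a tool. A real harness would call
# an LLM API here.
def call_model(messages):
    last = messages[-1]["content"]
    if "calculate" in last.lower():
        # The "model" decides to offload the math to a tool.
        return {"tool": "run_python", "input": "print(123456789 * 987654321)"}
    return {"answer": last}

# The actual tool: run the code the model asked for, capture the output.
def run_python(code):
    result = subprocess.run(["python3", "-c", code],
                            capture_output=True, text=True)
    return result.stdout.strip()

def harness(user_message, show_tool_calls=True):
    messages = [{"role": "user", "content": user_message}]
    while True:
        reply = call_model(messages)
        if "tool" in reply:
            output = run_python(reply["input"])
            if show_tool_calls:  # the app decides whether you see this
                print(f"[tool call] run_python -> {output}")
            messages.append({"role": "tool", "content": output})
        else:
            return reply["answer"]

print(harness("Please calculate 123456789 * 987654321"))
```

The show_tool_calls flag is the part this whole subthread is about: the loop always knows when a tool was called, but the app decides whether to surface that to you.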
Is a human much different? We too require tons of training and we too are prone to stupid mistakes.
Fundamentally yes and no. The original commenter could’ve saved his breath; if people wanted to be educated on AI, they have plenty of resources to do so, but instead they choose to remain ill informed. The difference is that humans are capable of critical thinking and conceptual connection. We are just as prone to mistakes as AI, we just have a much higher aptitude for mistakes lol. Hence the goal not being to make a perfect AI - it’s the much more achievable goal of making AIs that beat us in specific fields, then beating us in all fields.
It’s missing features obviously (think neuroplasticity) but is that how AI differs from human intelligence, or simply a lack in the current generation?
It seems to be a flaw on both the hardware and software side of things. Hardware-wise, we have yet to make chips that achieve the processing density of human brain matter. Also, heat generation becomes an issue as you try to scale smaller systems up. Software-wise, we know our current neural networks don’t scale up well, so we seem to be waiting on some more foundational research for more efficient algorithms. My suspicion is that we’re not really going to get true General Superintelligence until we start manufacturing chips that incorporate living neurons; it just really seems cheaper to use already existing computing systems than to design your own architecture.
I know Lemmy hates AI with a fiery passion (and I too hate it for various reasons), but the ability to make this sort of prediction far more stably than whatever else came before in natural language processing (fancy term of the day for those who haven’t heard of it), however inefficiently it is built and run, is useful if you can nudge it enough in a certain direction. It can’t do functional things reliably, but if you constrain it to only parse human language and extract very specific information, show it in a machine-parsable way, and then use that as input for something you can program, you’ve essentially built something that feels like it can understand you in human language for a handful of tasks and carry out those tasks (even if the carrying-out part isn’t actually done by an LLM). So pedantically it’s not AI, but most people not in tech don’t know or care about the difference. To them it’s all magic all the way down, the same way computers are just supposed to magically do what they’re thinking of. That’s not changed.
My point though, and this isn’t targeting you specifically, dear OC, is that we can circlejerk all we want here, but echoing this oversimplification of what LLMs can do is pretty irrelevant to the bigger discourse. Call these companies out on their practices! Their hypocrisy! Their indifference to the collapse of our biosphere, to human suffering, to leaving the most vulnerable hanging high and dry!
Tech is a tool, and if our best argument is calling a tool useless when it’s demonstrably useful in specific ways, we’re only making a fool of ourselves, turning people away from us and discouraging others from listening to us.
But if your goal is to feel good by letting one out, please be my guest.
Peace
The only way to know if LLM output is accurate is to know what an accurate output should look like, and if you know that, you don’t need an LLM. If you don’t know what an accurate output should look like, an LLM is equally likely to confidently lie to you as it is to help you, making you dumber the more you use it. The only other situation is if you know what an accurate output should look like, but you want an inaccurate one, which is a bad thing to encourage.
“Demonstrably useful” is a lie. It’s a blatant and obvious lie. LLMs are so actively detrimental to their users, and society as a whole, that calling them useless is being generous. And even if they were the most beneficial thing on the planet, there is still no reason to use the billionaire’s toxic Nazi plagiarism machine.
The only way to know if LLM output is accurate is to know what an accurate output should look like, and if you know that, you don’t need an LLM
I empathize with your overall standpoint, but that’s just plain wrong. There are a lot of problems where verifying an answer is much easier for a human (or non-LLM computer program) than coming up with a correct answer.
Anything that involves language manipulation, for example. I’ll have a much easier time checking a translation from English to German for accuracy than doing the full translation myself, assuming the model gets most of it correct and I don’t have to rewrite anything major (which is generally the case with current models).

Or letting an LLM proof-read a text I wrote - I can’t be sure it got everything, but the things it does find are trivial for me to verify, and will often include things that slipped past me and three other people who proof-read the same text.

Less useful, but still applicable to the premise: producing a set of words that rhyme with a given one. Coming up with new ones after the first couple that pop into your head gets pretty hard, but checking if new candidates actually do rhyme is trivially easy - see the little sketch below.
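To make the rhyming one concrete, since verification there is literally mechanical - here’s a tiny sketch using the pronouncing package (a wrapper around the CMU pronouncing dictionary; assumes pip install pronouncing):

```python
# Verifying rhyme candidates is trivially mechanical: look them up in
# the CMU pronouncing dictionary and keep only the actual rhymes.
# (Assumes: pip install pronouncing)
import pronouncing

candidates = ["fable", "cable", "apple", "stable"]  # e.g. LLM output
valid = [w for w in candidates if w in pronouncing.rhymes("table")]
print(valid)  # "apple" gets filtered out - it doesn't rhyme
```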
Moving on from language-stuff, finding security issues in software is a huge one - finding those is often extremely hard, but verifying them is mostly pretty straightforward if the report is well prepared. Models are just now getting good enough to reliably produce good security reports for actual issues.
Answering questions about a big codebase, where the actual value doesn’t lie in the specific response the model gives, but in pointing me to the correct places in the code where I can check for myself.
Producing code or entire programs is a bit more debatable and it depends heavily on the goal and the skill level of the operator whether complete verification is actually easier than doing it yourself.
Just a couple of examples. As I said, I get where you’re coming from, but completely denying any kind of utility does not help your cause at all; it just makes you look like an absolutist who doesn’t know what they’re talking about.
If you know enough to verify a translation as accurate, or you have the tools to figure out an accurate translation through dictionaries or some such, then you know enough to do the translation yourself. If you don’t, then I cannot trust your translation.
And if you can’t trust the output to be comprehensive or correct, then why would you trust something like system security to an LLM? Any security analyst who deserves their job would never take that risk. You don’t cut those corners.
Quick reminder: rhyming dictionaries exist. LLMs solved a solved problem, but worse.
Once again, even if the billionaire’s toxic Nazi plagiarism machine was useful, it is so morally repugnant that it should never be used, which makes it functionally useless. This is an absolute statement, but trying to “um actually” that makes you look like either a boot-licker, a pollutant, a Nazi, a plagiarist, an idiot, or some combination of those.
I would rather look like an absolutist. How about you?
If you know enough to verify a translation as accurate, or you have the tools to figure out an accurate translation through dictionaries or some such, then you know enough to do the translation yourself.
Correct. But it’s going to take me a lot more work and time, possibly to the point of not being feasible and probably even matching the energy cost of using the LLM over the entirety of the task.
why would you trust something like system security to an LLM?
I wouldn’t. I don’t know where you got that. Adding LLM-based analysis to your toolkit to spot important issues that otherwise might not have been found is just that: an addition. Not replacing anything. And it is demonstrably useful for that at this point, there’s just no denying that.
Once again, even if the billionaire’s toxic Nazi plagiarism machine was useful, it is so morally repugnant that it should never be used, which makes it functionally useless.
My point is that if you are this confidently wrong about the capabilities of LLM-based tools, then why should I believe you to be any less wrong about the moral and ethical issues you’re raising? It looks like you’re either completely misinformed or deliberately fighting a strawman for a part of your argument, so it gives anyone on the other side an easy excuse to just not engage with the rest of it and just dismiss it entirely. That’s what I’m trying to get across here.
Surely, the energy cost to verify the translation would be the same as translating it? If you’re struggling that much, why are you translating it at all? I cannot trust your translation.
If you tell an LLM to generate reports, it will, regardless of the actual quality of the environment. It doesn’t know what’s secure and what isn’t. All you’ve shown is that it can convince the kinds of security analysts whose system is so insecure as to have a LOT of good reports that their system is more secure than it is. Which is useless at best, detrimental at worst.
It’s useless for translation. It’s useless for security analysis. It’s useless for rhyming (I notice you didn’t mention that one). You’re trying so hard to prove how useful it is, and your failure demonstrates how useless it is.
You can’t condemn confident wrongness and defend LLMs. And you can’t defend the billionaire’s toxic Nazi plagiarism machine while questioning someone else’s morals. You can’t cherry-pick my argument and claim I’m the one fighting a strawman. …Well, not if you’re arguing in good faith.
Look, I’m not trying to argue against your moral stance. I’m neither saying it’s wrong nor that it’s outweighed by any usefulness, real or not. What I’m trying is get you to see that your claims about uselessness are undermining your moral argument, which would be a hell of a lot stronger if you were not hell-bent on denying any kind of utility! Because in the eyes of people that do perceive LLMs as useful (which is exactly the kind of people that need to hear about the moral issues), that just makes you seem out of touch and not worth listening to.
It’s useless for security analysis.
Have you looked at any of the four links I provided? You might be working on old data here because it’s a very recent development, but a lot of high profile open source maintainers are saying that AI-generated security reports are now generally pretty good and not slop anymore. They’re fixing actual bugs because of it, and more than ever. How can you call that useless?
Surely, the energy cost to verify the translation would be the same as translating it?
Uh, no? Have you ever translated something? Verifying a translation happens mostly at attentive reading speed; double it for probably reading it twice overall to focus on content and grammar separately, plus some overhead for correcting the occasional flaw and checking one or two things that I’m unsure about off the top of my head. So for the sake of argument, let’s say three times slower than just reading normally. I don’t know about you, but three times slower than reading is still a lot faster than I would be able to produce a translation from scratch - weighing different word options against each other, figuring out how to get some flow into the reading experience, etc.

If I’m translating into a language that I’m fluent but not native in, that takes even longer, because the ratio between my passive and active vocabulary is worse. I can read (and thus verify) English at a much more sophisticated level than I’m able to talk or write, because the words and native idioms just don’t come to me as naturally, or sometimes not at all without a lot of mental effort and a thesaurus. LLMs are just plain better at writing English than I have any hope of achieving in my lifetime, and I can still fully understand and easily verify the factual, orthographic and grammatical correctness of what they’re outputting. Those two things are not mutually exclusive.
It’s useless for rhyming (I notice you didn’t mention that one)
Yeah, because I’m focusing on the more relevant things. I disagree that it’s completely useless for rhyming, but it is a much weaker and more contrived point than the others, and going into that discussion would just derail things more for no added value. Also, funny that you call me out for that, when you just fully ignored two use cases I mentioned in my initial comment (LLM proofreading texts, and answering questions about unfamiliar code bases). Those have a lot of legitimate utility for someone who’s not aware of or doesn’t care about the moral issues. And once again, that’s my point here - those people will not listen if they perceive you as talking about a fictional world where LLMs are completely useless, which fails to match up with their experience.
Are you saying that you need to have perfect technical knowledge of AI to know if a person that promotes it is immoral? It looks like a non sequitur to me.
Sometimes I use AI even if I know the answer, because I am a lazy person, and holy shit, I can confirm that it lies a lot and tells wrong shit
We already have tools that can give us incorrect answers in natural human language.
And they post their videos to youtube for free.
Odd, because Home Assistant can use a locally run LLM to do so?
Because they probably work as agents, so they don’t count themselves. They use another app to do it. ChatGPT could probably also do that if integrated properly with your phone software.
Wow, the only thing Siri is generally competent at.
I miss Siri, just asked it to open the pod bay doors, had a few laughs and moved on with my life.
My first thought as well, lol.
Lol. Why don’t they ask the AI how to program an AI?
They should just vibe code the feature. They’ll have it done in an afternoon, right?
Yo dawg I heard you like AI so we put an AI in your AI so it can AI while you AI!
Even if it could, it would be an order of magnitude more inefficient in terms of convenience than the stopwatch we already have on our phones.
“Hey ChatGPT, do the thing I could have done in 3-4 clicks on my clock app.”
Not to mention the sheer wastefulness in terms of energy. A MINECRAFT REDSTONE MACHINE TIMER WOULD BE MORE EFFICIENT. (Not to mention that, unlike SOTA LLMs, it can run offline on a phone)
You are correct but I think you are missing the point.
Remember, from the perspective of all AI companies (OpenAI probably more than most), AI is this monster tech that will surely replace all workers and even your Grandma, as it can bake better cookies.
This is yet another display of how lacking AI is in a simple, everyday task… but more importantly, it is a gigantic demonstration of how AI is completely blind to its own weaknesses, which is what makes it really, really dangerous when used as prescribed by OpenAI and the others.
This situation is basically the same as when the brand-new $700 iPhones (back when that was eye-wateringly expensive for a phone) could not run an alarm in the mornings and Apple’s answer was something like “why are you using your Cadillac phone as a cheap alarm?”… it should fucking wake me up with a massage for that cost!
Oh yeah, that’s definitely a problem too. And part of the problem is that AI companies seem to be blind to the LLMs’ weaknesses as well.
minecraft is turing-complete, so, like, you can do a whole lot more than just be a timer.
Absolutely. I was thinking of getting back into minecraft Redstone but I’d rather do it in a non-Microsoft alternative. Not to mention at least a dozen other projects on my backlog
Yeah, I’d be playing minecraft, too, especially with my nephew, but uhm. that whole microslop thing ruins it for me.
We do enjoy space engineers, though. (lots of mining, lots of building. Nephew loves it when Klang accepts our sacrifices.) It’s a bit more involved than minecraft, though.
(but niftishly, there’s programable blocks that will let you write c# code and… do things.) (space engineers 2 is in early access if you have the hardware.)
Check out Luanti and Voxelibre!
Will do, thanks for the recommendations!
An open-source voxel game creation platform. Play one of our many games solo or together. Mod a game as you see fit, or make your own.
VoxeLibre is a survival sandbox game for Luanti. Survive, gather, hunt, mine for ores, build, explore, and do much more. Inspired by Minecraft, pushing beyond.
https://content.luanti.org/packages/wuzzy/mineclone2/
Mesecons! They’re yellow, they’re conductive, and they’ll add a whole new dimension to Luanti’s gameplay.
Mesecons implements a ton of items related to digital circuitry, such as wires, buttons, lights, and even programmable controllers. Among other things, there are also pistons, solar panels, pressure plates, and note blocks.
Mesecons has a similar goal to Redstone in Minecraft, but works in its own way, with different rules and mechanics.
You might like Logic World
It looks interesting, but I’m looking for something with more… world in it. Something to use my logic circuits with (count storage items, simple store, item mail system…)
People have coded Minecraft in Minecraft.
It’s the Elon strategy. Works just right when the most powerful country in the world is full of people who can’t read at a 6th-grade level and a bunch of psychopaths.
You would already be doing a great service to the world if you produced a really well-tuned search engine / information digger with LLMs, but no, you had to periodically hype it as AGI because it can memorize entire textbooks with some accuracy. You did this to yourselves, and if you fall, it will be because of these expectations which are not met.
Everyone’s getting their knickers in a twist over nothing here.
Of course an AI can track time, if it’s given access to a timer MCP server.
Can we track time without tools, just in our heads? Certainly not very accurately. We can, however, track it reasonably accurately if given access to a quartz stopwatch (typically ±15 s/year).
A language model is based around language and reasoning by words/symbols. It’s not a surprise it doesn’t have timing capability.
What Altman SHOULD be embarrassed about is that the model lies about its capabilities. That implies that the context is still not right - it should be adequately trained and given context to prevent the lying. That implies a much more worrying issue - and something that Anthropic handles far better, IMHO (when asked if it can track time, it says “no, not on my own”, and then proceeds to build a JavaScript timer that it offers up to track time).
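For scale, the kind of timer tool I mean is genuinely tiny. Here’s a sketch of one as an MCP server, using the FastMCP helper from the official MCP Python SDK - the tool names and the “timer” label are just my own invention:

```python
# Minimal timer MCP server sketch, using the FastMCP helper from the
# official MCP Python SDK (pip install mcp). Tool names are made up.
import time
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("timer")
timers: dict[str, float] = {}

@mcp.tool()
def start_timer(name: str) -> str:
    """Start (or restart) a named timer."""
    timers[name] = time.monotonic()
    return f"Timer '{name}' started."

@mcp.tool()
def read_timer(name: str) -> str:
    """Report elapsed seconds for a named timer."""
    if name not in timers:
        return f"No timer named '{name}'."
    return f"{time.monotonic() - timers[name]:.1f} seconds elapsed."

if __name__ == "__main__":
    mcp.run()  # serves the tools over stdio for the model's harness
```

Give a model that, and suddenly it “can” track time - the capability lives in the harness, not in the weights.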
It could simply save a timestamp of the “begin timer” message and compare it to the timestamp of the “end” message. It’s not that complicated, and writing a script and executing it is overkill… It just needs access to a calculator skill.
Yes, it handles it better, but it’s still a dumb approach and waste of energy.
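To illustrate, the timestamp approach I mean is about this much code (purely a sketch; the message handling is made up):

```python
# Sketch of the timestamp approach: record when the "begin" message
# arrives, subtract on "end". No script generation, no code execution.
import time

started_at = None

def on_message(text: str) -> str:
    global started_at
    if text == "begin timer":
        started_at = time.time()
        return "Timer started."
    if text == "end timer" and started_at is not None:
        return f"{time.time() - started_at:.1f} seconds elapsed."
    return "No timer running."

print(on_message("begin timer"))
# ... conversation happens ...
print(on_message("end timer"))
```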
Aren’t we saying exactly the same? Give it an MCP server or a native skill that CAN track time.
I don’t use them but I follow the news about them loosely. The reason for this is epistemic humility. Claude has a pretty good idea of what its capabilities are and where the ceiling is. Chatgpt has no clue what its limits are so it believes it can do everything. Basically chatgpt has a lot of info and no idea where the gaps live and Claude has a fair idea when to search or use some external function to handle something. Gemini has less than Claude but more than chatgpt. Grok has little to no epistemic humility, but it did manage to accurately portray Musk as a world champion piss drinker, something none of the others were able to do.
I say that, but it’s been a few months since I looked. That could have changed because shit moves fast. By the looks of what it’s trying to do with the timer chatgpt has less than it used to. Possibly because of the way the model is trained to be helpful and confident.
well messages are clearly not stateless (otherwise there would be no context), but in general yes the issue is not the lack of capability, it’s the complete unawareness of it and the insistence on lying about it.
THIS time it is ridiculously obvious but what if it does this after checking a very large data set where there would be no (good) way to verify its answer?
This is why AI, in its current form, is basically useless. If you cannot trust it NOT to lie, and must/should verify everything yourself, you might as well skip the useless step of asking.
To call AI useless is quite a strong statement.
There’s a million places to use it!
The problem is that the market thinks there’s a billion places to use it. And right now we’re funding 999 million places that shouldn’t be using AI but have the funding to do that dumb thing, so we can figure out the one million places where it makes fantastic sense.
He’s going to ask the US Congress for a bailout with taxpayers’ money when this all fails, and Congress is most likely going to give it to him, because this one company is a huge part of the US economy.
I don’t think so, and I’m on the Ed Zitron train of thought why not.
The financial instruments got a bailout in ’08 because the economy itself would stop functioning. That’s different from stocks dropping. Also, there’s like nothing to bail out? OpenAI and their ilk are just sucking down capital and returning nothing. Even if they get one bailout, they’d need a continuous stream of unlimited money forever? I don’t think it’ll happen.
I hope I’m right, cuz damn that shit is cancerous
If Trump is still in charge when the bubble pops, he’ll do everything he can to bail them out. Altman knows how to flatter people, and he’s doing that constantly with Trump. A significant part of Trump’s base is silicon valley techbros who will lose their shirts if the bubble collapses. They had enough sway to get their guy installed as the VP. Getting a bailout will be easy for them. If they get poor, they won’t be able to fund the MAGA movement.
Even if Trump isn’t in charge anymore. Businesses that have fired a lot of employees and replaced what they did with LLM slop will say their businesses will be ruined if the bubble suddenly pops, so they’ll frame it as the economy collapsing if the LLM bubble is allowed to pop. Not to mention they’ll claim it’s a national security matter because if American LLMs disappear the only ones left will be Chinese ones, and that would be a threat to national security. The fact that the military is extensively using LLMs in their bombing of Iran shows how integrated they now are into the way the military does things, and you can’t ask the military to just go back to how things were done 5 years ago!
I expect that when the LLM bubble starts to pop, there will be enormous bailouts from the government, adding tens of trillions to the US debt. That’s a long-term thing and will be someone else’s problem.
I think a potential OpenAI “bailout” should go something like this:
- The investors get their money back.
- They have to sign a pact that they must not invest into AI anymore for a given amount of years (20+ minimum).
- Massive regulatory overhaul to make sure stuff like this never happens. Also undo Dodge v. Ford Motor Co.
- Scam Altman and the others go to life in prison.
… why should the investors get their money back? They invested ludicrous amounts of money into a technology with obvious limitations from the start with the intention of using that technology to replace many people’s jobs. Losing that money will be a better lesson than some probably unenforceable “pact”.
They have to sign a pact that they must not invest into AI anymore for a given amount of years (20+ minimum).
Problem is, this might hurt actual AI research to punish a scam that has absolutely nothing to do with AI other than having co-opted the name for marketing purposes.
(Any investment in actual AI research is doomed for decades anyway when this bubble pops, but this would cause even more harm than the bubble has already caused.)
(Also, any form of research is probably ruined for decades anyway due to LLM-induced brain rot and having to sift through all the slop to try to recover any remaining fragments of actually usable knowledge, but, again, let’s not make it even worse than it already is.)
Scam Altman sounds like it’s a name straight from an hltv comment section, I love it
That is genuinely hilarious!
You can’t expect it to measure time if it doesn’t have an internal clock. I’m sure it could be done if you give the LLM the necessary tools and a system prompt.
See, it does not understand time, so in order to vibe code in timer functionality, they need to start feeding it clocks.