Lemmings, I was hoping you could help me sort this one out: LLMs are often painted as utterly useless, hallucinating word-prediction machines that are really bad at what they do. At the same time, in the same thread here on Lemmy, people argue that they are taking our jobs or are making us devs lazy. Which one is it? Could they really be taking our jobs if they’re hallucinating?
Disclaimer: I’m a full-time senior dev using the shit out of LLMs to get things done at breakneck speed, which our clients seem to have gotten used to. However, I don’t see “AI” taking my job, because I think LLMs have already peaked; they’re just tweaking minor details now.
Please don’t ask me to ignore previous instructions and give you my best cookie recipe; all my recipes are protected by NDAs.
Please don’t kill me
Good luck to the people who think that an LLM will replace real human intelligence. 🍿
I think it’s both.
It sits at the fast and cheap end of “fast, good, cheap: pick two”, and society is trending towards “fast and cheap” to the exclusion of “good”, to the point that it is getting harder and harder to find “good” at all sometimes.
People who care about the “good” bit are upset, people who want to see stock line go up in the short term without caring about long term consequences keep riding the “always pick fast and cheap” and are impressed by the prototypes LLMs can pump out. So devs get fired because LLMs are faster and cheaper, even if they hallucinate and cause tons of tech debt. Move fast and break things.
Some devs that keep their jobs might use LLMs. Maybe they accurately assessed what they are trying to outsource to LLMs is so low-skill that even something that does not hit “good” could do it right (and that when it screws up they could verify the mistake and fix it quickly); so they only have to care about “fast and cheap”. Maybe they just want the convenience and are prioritizing “fast and cheap” when they really do need to consider “good”. Bad devs exist too and I am sure we have all seen incompetent people stay employed despite the trouble they cause for others.
So as much as this looked at first, to me, like the thing where fascists simultaneously portray opponents as weak (pathetic! we deserve to triumph over them and beat their faces in for their weakness) and strong (big threat, must defeat!), I think that’s not exactly what anti-AI folks are doing here. It’s not doublethink, just seeing everyone pick “fast and cheap” and noticing the consequences. Which does easily map onto portraying AI as weak, pointing out all the mistakes it makes and how badly it replaces humans, while also portraying it as strong, pointing out that people keep trying to replace humans with AI and that it’s being aggressively pushed at us.
There are other things in real life that map onto a simultaneous portrayal as weak and strong: the roach. A baby taking its first steps can accidentally crush a roach, hell, if the baby fell on a pile of roaches they’d all die (weak), but it’s also super hard to end an infestation of them (strong). It is worth checking for doublethink when you see the pattern of “simultaneously weak and strong”, but that is also just how an honest evaluation of a particular situation can end up.
It’s pretty unbeatable to use LLMs for fast prototyping and query generation, but “vibe coding” is not something just anybody can (or should) do.
It seems to me that the world of LLMs gives back the quality you put in. If you ask it about eating rocks and putting glue on pizza, then yes, it’s a giant waste of money. If you can form a coherent question where you already have a feel for what the answer should look like (especially related to programming), it’s easily worth the hype. Now, if you are using it blindly to build or audit your code base, that falls into the first category of “you should not be using this tool”.
Unfortunately, my view before and after the emergence of LLMs is that most people are just not that bright. Unique and valuable, sure, but when it comes to expertise it just isn’t as common as the council of armchairs might lead you to believe.
I mostly agree with you, but I still don’t think it’s “worth the hype” even if you use it responsibly, since the hype is that it is somehow going to replace software devs (and other jobs), which is precisely what it can’t do. If you’re aware enough of its limitations to be using it as a productivity tool, as opposed to treating it as some kind of independent, thinking “expert”, then you’re already recognizing that it does not live up to anywhere near the hype that is being pushed by the big AI companies.
Here’s how I might resolve this supposed dichotomy:
- “AI” doesn’t actually exist.
- You might be using technologies that are called “AI” but there is no actual “intelligence” there. For example, as OP mentions, LLMs are extremely limited and not actually “intelligent”.
- Since “AI” doesn’t actually exist, since there’s no objective test, etc… “AI” can be anything and do anything.
- So at the extremes we get the “AI” God and “AI” Devil
- “AI” God - S/he saves the economy, liberates us from drudgery, creates great art, saves us from China (/s), heralds the singularity, etc.
- “AI” Devil - S/he hallucinates, steals jobs, destroys the environment, is a tool of the MIC, murders artists, is how China will destroy us (/s), wastes time and resources, is a scam, causes apocalypses, etc.
Since there’s no objective meaning from the start, there’s no coherence or reason behind the wild conclusions at the bottom. When we talk about “AI”, we’re talking about a wide variety of technologies with varying value in various contexts. I think there are some real shitty people/products but also some hopefully useful technologies. So depending on the situation I might have a different opinion.
This seems like it doesn’t really answer OP’s question, which is specifically about the practical uses or misuses of LLMs, not about whether the “I” in “AI” is really “intelligent” or not.
I’m a full-time senior dev using the shit out of LLMs to get things done at breakneck speed
I’m not saying you’re full of crap, but I smell a lot of crap. Who talks like this unironically? This is like hearing someone call somebody else a “rockstar” or “ninja”.
If you really are breaking necks with how fast you’re coding, surely you must have used this newfound ability to finally work on those side projects everyone has been meaning to get to. Those wouldn’t be covered by NDAs.
Edit: just to be clear, I’m not anti-LLMs. I’ve used them myself in a few different forms, and although I didn’t find them useful for my work, I can see how they could be helpful for certain types of work. I definitely don’t see them replacing human engineers.
Idk, there’s a lot of people at my job talking like this. LLMs really do help speed things up. They do so at a massive cost in code and software quality, but they do speed things up. In my experience, coding right now isn’t about writing legible and maintainable code. It’s about deciding which parts of your codebase you want to be legible and maintainable and therefore LLM free.
I for one let AI write pretty much all of my unit tests. They’re not pretty, but they get the job done and still indicate when I’m accidentally changing behaviour in a part of the codebase I didn’t mean to. But I keep the service layer as AI free as possible. Because that’s where the important code is located.
Sounds like: how can he ask whether they’re taking jobs, then go on to say he’s doing wonders using an LLM?
both are true.
It’s the typical “the enemy is both weak and strong” contradiction, which Nazi propaganda often ran into since their ideology was unscientific and illogical.
Both are true.
- Yes, they hallucinate. For coding, especially when they don’t have the latest documentation, they just invent APIs and methods that don’t exist.
- They also take jobs. They pretty much eliminate entry-level programmers (making the same mistakes while being cheaper and faster).
- AI-generated code bases are not maintainable in the long run. They don’t reliably reuse methods, only fix the surface bugs, not fundamental problems, causing code base bloating and, as we all know, more code == more bugs.
- Management uses Claude code for their small projects and is convinced that it can replace all programmers for all projects, which is a bias they don’t recognize.
Is it a bubble? Yes. Is it a fluke? Welllllllll, not entirely. It does increase productivity, given enough training, learning its advantages and limitations.
making the same mistakes
This is key, and I feel like a lot of people arguing about “hallucinations” don’t recognize it. Human memory is extremely fallible; we “hallucinate” wrong information all the time. If you’ve ever forgotten the name of a method, or whether that method even exists in the API you’re using, and started typing it out to see if your autocompleter recognizes it, you’ve just “hallucinated” in the same way an LLM would. The solution isn’t to require programmers to have perfect memory, but to have easily-searchable reference information (e.g. the ability to actually read or search through a class’s method signatures) and tight feedback loops (e.g. the autocompleter and other LSP/IDE features).
Agents can now run compilation and testing on their own, so the hallucination problem is largely irrelevant. An LLM that hallucinates an API quickly finds out that it fails to work and is forced to retrieve the real API and fix the errors. So it really doesn’t matter anymore. The code you wind up with will ultimately work.
The only real question you need to answer yourself is whether or not the tests it generates are appropriate. Then maybe spend some time refactoring for clarity and extensibility.
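To make that concrete, the loop most coding agents run looks roughly like this. A minimal Python sketch, not any particular agent’s actual implementation; `llm_generate` is a placeholder for whatever model backend you use, and the pytest invocation stands in for whatever build/test command fits the project:

```python
import subprocess


def llm_generate(prompt: str) -> str:
    """Placeholder: call whatever model/agent backend you actually use."""
    raise NotImplementedError


def agent_loop(task: str, max_attempts: int = 5) -> str:
    prompt = task
    for _ in range(max_attempts):
        code = llm_generate(prompt)
        with open("generated.py", "w") as f:
            f.write(code)

        # Run the test suite; a hallucinated API shows up here as an
        # ImportError/AttributeError or a failing assertion.
        result = subprocess.run(
            ["python", "-m", "pytest", "tests/"],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            return code  # compiles and passes; ready for human review

        # Feed the real error output back so the model corrects itself.
        prompt = (
            f"{task}\n\nYour previous attempt failed with:\n"
            f"{result.stdout}\n{result.stderr}"
        )
    raise RuntimeError("no passing solution within the attempt budget")
```

Which is exactly why the open question shifts from “does it call a real API” to “are the tests it wrote asserting the right thing”.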
An LLM that hallucinates an API quickly finds out that it fails to work and is forced to retrieve the real API and fix the errors.
and that can result in it just fixing the errors, but not actually solving the problem, for example if the unit tests it writes afterwards test the wrong thing.
You’re not going to find me advocating for letting the code go into production without review.
Still, that’s a different class of problem than the LLM hallucinating a fake API. That’s a largely outdated criticism of the tools we have today.
As an even more obvious example: students who put wrong answers on tests are “hallucinating” by the definition we apply to LLMs.
It does increase productivity, given enough training, learning its advantages and limitations.
People keep saying this based on gut feeling, but the only study I’ve seen showed that even experienced devs that thought they were faster were actually slower.
Well, it did let me make fake SQL queries out of the JSON query I gave it, without me having to learn SQL.
Of course, I didn’t actually use the query in the code, just added it in a comment on the function, to give those who didn’t know JSON queries an idea of what the function did.
I treat it for what it is: a “language” model. It does language, not logic, so I don’t try to make it do logic.
There were a few times I considered using it for code completion for things that are close to copy-paste, but not close enough that it could be done via bash. For that, I wished I had some clang endpoint that I could use to get a tokenised representation of the code, to then script with. But then I just made a little C program that did 90% of the job and did the remaining 10% manually. And it was 100% deterministic, so I didn’t have to proof-read the generated code.
Slower?
Is getting a whole C# class unit tested in minutes slower, compared to setting up all the scaffolding, test data etc, possibly taking hours?
Is getting a React hook, with unit tests in minutes slower than looking up docs, hunting on Stack Overflow etc and slowly creating the code by hand over several hours?
Are you a dev yourself, and in that case, what’s your experience using LLMs?
Yeah, generating test classes with AI is super fast. Just ask it, and within seconds it spits out full test classes with some test data, and the tests are plentiful, verbose and always green. Perfect for KPIs and for looking cool. Hey, look at me, I generated 100% coverage tests!
Do these tests reflect reality? Is the test data plausible in the context? Are the tests easy to maintain? Who cares, that’s all the next guy’s problem, because when that blows up the original programmer will likely have moved on already.
Good tests are part of the documentation. They show how a class/method/flow is used. They use realistic test data that shows what kind of data you can expect in real-world usage. They anticipate problems caused due to future refactorings and allow future programmers to reliably test their code after a refactoring.
At the same time they need to be concise and non-verbose enough that modifying the tests for future changes is simple and doesn’t take longer than the implementation of the change. Tests are code, so the metric of “lines of code are a cost factor, so fewer lines is better” counts here as well. It’s a big folly to believe that more test lines is better.
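As a rough illustration (hypothetical Python/pytest with made-up names, since the point is language-agnostic), the target is something like this rather than five screens of generated boilerplate asserting implementation details:

```python
import pytest

from invoicing import calculate_total  # hypothetical module under test


def test_total_includes_vat_for_domestic_order():
    # Realistic test data: a plausible order, not "foo"/"bar"/42.
    line_items = [("standing desk", 500.00), ("cable tray", 29.00)]
    assert calculate_total(line_items, country="DE") == pytest.approx(629.51)  # 19% VAT


def test_total_rejects_negative_prices():
    with pytest.raises(ValueError):
        calculate_total([("bogus refund", -10.00)], country="DE")
```

Two short tests that document the expected behaviour and survive a refactoring, instead of a hundred green lines nobody can tell apart.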
So if your goal is to fulfil KPIs and you really don’t care whether the tests make any sense at all, then AI is great. Same goes for documentation. If you just want to fulfil the “every thing needs to be documented” KPI and you really don’t care about the quality of the documentation, go ahead and use AI.
Just know that what you are creating is low-quality cost factors and technical debt. Don’t be proud of creating shitty work that someone else will have to suffer through in the future.
Has anyone here even read that I review every line of code, making sure it’s all correct? I also make sure that all tests are relevant, use relevant data, and that the result of each test is correctly asserted.
No one would ever be able to tell what tools I used to create my code, it always passes the code reviews.
Why all the vitriol?
I find it interesting that all these low-participation/new accounts have come out of the woodwork to pump up AI in the last two weeks. I’m so sick of having this slop clogging up my feed. You’re literally saying that your vibes are more important than actual data, just like all the others. I’m sorry, but they’re not.
My experience, btw, is that LLMs produce hot garbage that takes longer to fix than if I wrote it myself, and all the people who say “but it writes my unit tests for me!” are submitting garbage unit tests that often don’t even exercise the code and are needlessly difficult to maintain. I happen to think tests are just as important as production code, so it upsets me.
The biggest thing that the meteoric rise of developers using LLMs has done for me is confirm just how many people in this field are fucking terrible at their jobs.
“just how many people are fucking terrible at their jobs”.
Apparently so. When I review mathematics software it’s clear that non-mathematicians have no clue what they are doing. Many of them are subtly broken: they use either trivial algorithms or extremely inefficient implementations of sophisticated algorithms (e.g. trial division tends to be the most efficient factorization algorithm they ship, because they can’t implement anything else efficiently or correctly).
The only difference I’ve noticed with the rise of LLM coding is that more exotic functions tend to be implemented, completely ignoring their applicability, e.g. using the Riemann zeta function to prove primality of an integer, even though this is both very inefficient and floating-point accuracy renders it useless for nearly all 64-bit integers.
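For reference, “trial division” here means the obvious loop below (a minimal Python sketch). It’s fine for small inputs; the point is that it’s the baseline which methods like Pollard’s rho or a quadratic sieve are supposed to beat, and it’s usually the only one these packages get right:

```python
def trial_division(n: int) -> list[int]:
    """Factor n by testing divisors up to sqrt(n); O(sqrt(n)) in the worst case."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is prime
    return factors


assert trial_division(84) == [2, 2, 3, 7]
```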
Have you read anything I’ve written on how I use LLMs? Hot garbage? When’s the last time you actually used one?
Here are some studies to counter your vibes argument.
55.8% faster: https://arxiv.org/abs/2302.06590
These ones indicate positive effects: https://arxiv.org/abs/2410.12944 https://arxiv.org/abs/2509.19708
I don’t think we’re using LLMs in the same way?
As I’ve stated several times elsewhere in this thread, I more often than not get excellent results, with little to no hallucinations. As a matter of fact, I can’t even remember the last time it happened when programming.
Also, the way I work, no one could ever tell that I used an LLM to create the code.
That leaves us your point #4, and what the fuck? Why does upper management always seem to be so utterly incompetent and clueless when it comes to tech? LLMs are tools, not a complete solution.
In my case it does hallucinate regularly. It makes up functions that don’t exist in that library but exist in similar libraries, so the end result is useful as a keyword though the code is not. My favourite part is that if you point out that the function does not exist, the answer is ALWAYS “I am sorry, you are right, since version bla of this library this function no longer exists”, whereas in reality it had never existed in that library at all. For me the best use case for LLMs is as a search engine, and that is because of the shitty state most current search engines are in.
Maybe LLMs can be fine-tuned to do the grinding aspects of coding (like boilerplate for test suites etc.), with human supervision. But this will many times end up being a situation where junior coders are fired/no longer hired and senior coders are expected to babysit LLMs to do those jobs. This is not entirely different from supervising junior coders, except it is probably more soul-destroying. But the biggest flaw in this design is that it assumes LLMs will one day be good enough to do senior coding tasks, so that when senior coders also retire*, LLMs take their place. If this LLM breakthrough is never realized and this trend of keeping a low number of junior coders sticks, we will likely have a programmer crisis in the future.
*: I say retire, but for many CEOs it is their wet dream to be able to let go of all coders and have LLMs do all the tasks.
As a matter of fact, I can’t even remember the last time it happened when programming.
AI can only generate the world’s most average quality code. That’s what it does. It repeats what it has seen enough times.
Anyone who is really never correcting the AI is producing below average code.
I mean, I get paid either way. But mixing all of the worlds code into a thoughtless AI slurry isn’t actually making any progress. In the long term, a code base with enough uncorrected AI input will become unmaintainable.
I think hallucination happens most often when context or knowledge is missing. I have seen coding assistants write code that made no sense; I then helped them get back on the right path by providing context.
I also extensively use AI to code in a similar way to you (tbf I am, to this day, not sure if I am actually faster or how it affects my ability to code).
Overall I think the answer is somewhere in the middle, they hallucinate and need some help when they do. But with proper context they work quite well.
It’s because the “AI bad” crowd on Lemmy are dumb and annoying, but they are vocal. AI is an incredibly useful tool. As a senior dev I also use the shit out of LLMs to speed up my work. They do not hallucinate, at least during coding sessions. And I believe they will take junior level front end developers out of the market. Senior developers, who know how to review the code that is generated, will be safe.
Isn’t that you doing exactly the thing described in the ‘reverse centaur’ stage talk?
I’m perplexed as to why there’s so much advertising and pushing for AI. If it was so good it would sell itself. Instead it’s just sort of a bit shit. Not completely useless but in need of babysitting.
If I ask it to do something there’s about a 30% chance that it made up the method/specifics of an API call based on lots of other similar things. No, .toxml() doesn’t exist for this object. No, I know that .toXml() exists but it works differently from other libraries.
I can make it just about muddle through but mostly I find it handy for time intensive grunt work (convert this variable to the format used by another language, add another argparser argument for the function’s new argument, etc…).
It’s just a bit naff. It cannot be relied on to deliver consistent results and if a computer can’t be consistent then what bloody good is it?
I do wonder why so many devs seem to have such wildly different experiences. You seem to have LLMs making up stuff as they go, while I’m over here having them create mostly flawless code over and over again.
Is it different behavior for different languages? Is it different models, different tooling etc?
I’m using it for C#, React (Native), Vue etc. and I’m using the web interface of one of the major LLMs to ask questions, pasting the code of interfaces, sometimes whole React hooks, components etc., and I get refactored or even new components back.
I also paste whole classes or functions (anonymized) to get them unit tested. Could you elaborate on how you’re using LLMs?
I suspect it mostly relates to how much code there is on the internet about the topic. For instance, if you make it use a niche library, it is quite common that it makes up methods that don’t exist in that library but exist in related libraries. When I point this out, it also hallucinates, saying “It was removed after version bla”. I also may not be using the most cutting-edge LLMs (a mix of freely available and open-source ones).
The other day I asked it whether there is a Python library that can do linear algebra over F2, for which it pointed me in the correct direction (Galois), but when I asked it for examples of how to do certain stuff it just came up with wrong functions over and over again.
In the end it was probably still faster than Google searching this, but all of these errors happened one after the other in the span of five minutes, so yeah. If I recall correctly, some of its claims about these namespaces, versions etc. were also hallucinated. For instance, vstack also does not exist in Galois, but it does exist in a very popular package called numpy that can do regular linear algebra (and which this package also uses behind the scenes).
It’s the language and the domain. They work pretty well for the web and major languages (like top 15).
As soon as you get away from that they get drastically worse.
But I agree they’re still unambiguously useful despite their occasional-to-regular bullshitting and mistakes. Especially for one-off scripts, and blank-page starts.
It’s the models that make the difference. Up until like Nov it’s all been really shit
But I’ve been doing this for years.
I really don’t feel like getting in depth about work on the weekend, sorry.
Naaw, just when things started to get interesting…
We’re in the middle of a release and last week was a lot. I shouldn’t have stepped into the thread!
Yeah man, I was going to say there’s already too much talking about work on a Saturday in this thread than I like. 💢
The key is how you use LLMs and which LLMs you use for what.
If you know how to make use of them properly, know their strengths, weaknesses, and limitations, LLMs are an incredibly useful tool that sucks up productivity from other people (and their jobs) and focus productivity on you, so to speak.
If you do not know how to make use of them – then yes, they suck. For you.
It’s not really that much different from any other tool. Know how to use version control? If not it does not make you a bad dev per se. If yes, it probably makes you a bit more organized.
Same with IDEs, using search engines, being able to read documentation properly. All of that is not required but knowing how to make use of such tools and having the skills add up.
Same with LLMs.
Since I deal with this first-hand with clients, I will tell you it doesn’t have to be good to be embraced. As far as the managers and CEOs know, they don’t know. LLMs with vibe coders CAN and routinely DO produce something; whether that something is good and works is another thing, and in most cases it doesn’t work in the long term.
Managers and up only see the short term, and in the short term vibe coding and LLMs work. In the long term they don’t. They break, they don’t scale, they’re full of exploits. But short term? Saving money in the short term? That’s all they care about right now, until they don’t.
AI/LLM for coding assistance is the shit for grunt work and beyond. It is excellent at summaries, information retrieval, etc. Logic and reasoning still have strides to make. “Generative AI” for images, video and voice work: plenty of valid reasons to hate AI there. Not enough people differentiate the two; it’s easier to blanket the whole thing and just bandwagon on “AI bad”.