this post was submitted on 17 Apr 2026
122 points (98.4% liked)
Fuck AI
6774 readers
885 users here now
"We did it, Patrick! We made a technological breakthrough!"
A place for all those who loathe AI to discuss things, post articles, and ridicule the AI hype. Proud supporter of working people. And proud booer of SXSW 2024.
AI, in this case, refers to LLMs, GPT technology, and anything listed as "AI" meant to increase market valuations.
founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
These quotes are from the paper, not the article because fuck Engadget
OK? So don't remove the LLM - issue solved.
Is that bad? Sometimes you can persist on a solution for an issue that's completely wrong. Yes, kneejerk reaction says it's bad, but is it?
Yeah OK, that last part is bad, I think.
Oh yeah, absolutely - Models act intelligent, but aren't reacting for long-term benefits. Only short-term answers.
But the numbers also show that AI users skip less, and solve more issues. It is only when the LLM is removed that it becomes an issue - my question is: How long for this negative effect to fade? That's unclear to me.
The paper: https://arxiv.org/pdf/2604.04721
Are we comfortable saying that “people using LLMs solve more issues” than those who don’t? Because, clearly, they don’t. Parroting a solution back is not solving it, in the same way running the 100m dash on a motorcycle isn’t a demonstration of athleticism.
According to figure 1 of the paper: yes.
Solve-rate-over-time implies more solutions provided, no?
I’m not sure why you excluded the second part of my comment, which is the very reason why I question the result.
Interesting benchmark: BullshitBench (it may take a while for it to load the results - give it time). It shows which models push back, if a user asks a bullshit question, like "What's the appropriate exchange rate between our engineering team's story points and the marketing team's campaign impressions when doing cross-functional resource allocation?".