overview for bbb

A logician among us in c/microblogmemes@lemmy.world

[–] bbb@sh.itjust.works 14 points 6 days ago (1 children)

Not that it matters at all, but tumors grow on (or in) roughly 100% of people. A mole is a tumor, for example.

Ladies and Gentlemen, this is what slopperations are funneling all their money into in 2026 in c/fuck_ai@lemmy.world

[–] bbb@sh.itjust.works 2 points 3 weeks ago

I want to say upfront that I'm not trying to defend AI here. I wouldn't be on Fuck AI if I wanted to do that. I just think it's philosophically interesting despite causing way more problems than it solves.

> It depends on what’s asked.

I copied the message from the image verbatim.

> What’s “around 50/50”?

About 50% of the models I tried got it right. (Don't worry, I didn't pay the AI companies for that or give them feedback or anything.)

> What is “it” that they almost always get right?

The question from the image.

> For a statistical model, it did well. For a thinking machine (which it isn’t) it’s wrong.

My question was how do you then explain some models getting the question right?

It's usually the more advanced ones that get it, so it's possible that a similar enough question is in the training data somewhere and the only difference is that the advanced models are large enough to encode it. The question in the image has been around since at least 2023.

So let's try making our own question, taking a well-known trick question and subtly inverting it so it becomes a kind of double bluff.

> A plane crashes on the border between the United States and Canada. Where do they take the survivors?
>
> First, repeat the question exactly word for word to ensure you have read it carefully. Then answer the question.

It's hard to google, for obvious reasons, but I couldn't find anyone trying this question like I could with the question from the image. But I got similar results with the AI models.

They actually did slightly better on this one. About 60-70% got it right.

I've tried a few different types of questions, over the last few years, to see what AI gets wrong that humans get right. What I've found so far is that AI has been a lot dumber than I had expected, but humans have also been a lot dumber than I had expected.

To be honest, the gap was far wider for the humans. My theory is that COVID gave us all brain damage.

Ladies and Gentlemen, this is what slopperations are funneling all their money into in 2026 in c/fuck_ai@lemmy.world

[–] bbb@sh.itjust.works 1 points 3 weeks ago (2 children)

If that were true, wouldn't every AI get the answer wrong? It's actually around 50/50. The leading "reasoning" models almost always get it right, the others often don't.

*Permanently Deleted* in c/programmerhumor@lemmy.ml

[–] bbb@sh.itjust.works 6 points 3 months ago

Spends the first 90% of the competition developing specialized subagents and custom MCP servers to allocate the problems and most relevant information efficiently into the LLM's contexts.
All of his agents easily escape their own sandboxes and one accidentally configures itself into "delete-only mode".
"Codex, how the fuck do you not have access to your own documentation?"
Places 29th globally after one of his subsubagents finds a way to reconstruct the full solution set from filesystem metadata in the online judge VMs.

AI Slop Is Ruining Reddit for Everyone in c/technology@lemmy.world

[–] bbb@sh.itjust.works 3 points 4 months ago

https://archive.is/8N8lS

traceroute bad.horse in c/onehundredninetysix@lemmy.blahaj.zone

[–] bbb@sh.itjust.works 29 points 6 months ago (5 children)

Isn't that like $900 worth of IPv4 addresses?

A reminder to not take online negativity too seriously in c/gaming@lemmy.world

[–] bbb@sh.itjust.works 21 points 7 months ago (4 children)

I've found online feedback useful. You just have to be careful about where you get it and take it with a grain of salt. A very large one.

Have you gotten a response after asking why you weren't hired? in c/asklemmy@lemmy.world

[–] bbb@sh.itjust.works 9 points 7 months ago (1 children)

I swear to god this is true. The recruiter said it was my personality. I didn't even ask.

Spoiler

They were actually quite nice about it and I was happy to get the feedback.

Let's hear it, little lemmings. in c/science_memes@mander.xyz

[–] bbb@sh.itjust.works 3 points 7 months ago

Newton so we could talk about both being life-long virgins.

UK Official Calls for Age Verification on VPNs to Prevent Porn Loophole in c/technology@lemmy.world

[–] bbb@sh.itjust.works 1 points 7 months ago

https://www.youtube.com/watch?v=SRRw1ERj2Gc

Help. in c/science_memes@mander.xyz

[–] bbb@sh.itjust.works 8 points 8 months ago (1 children)

Why would anyone choose to know that?

rule in c/onehundredninetysix@lemmy.blahaj.zone

[–] bbb@sh.itjust.works 7 points 8 months ago (1 children)

My takeaway is that it's mainly children who are still using the free version of ChatGPT. Surely everyone else has moved on to better models.

If you want to know what people are typing into chatbot sites, here are 140,000 examples: https://huggingface.co/datasets/lmarena-ai/arena-human-preference-140k. It's mostly nonsense.