
Managers (media.piefed.zip)

submitted 1 week ago* (last edited 1 week ago) by inari@piefed.zip to c/whitepeopletwitter@sh.itjust.works


theunknownmuncher@lemmy.world 1 point 1 week ago

https://github.com/QwenLM/Qwen3.6#benchmarks

AtHeartEngineer@lemmy.world 0 points 1 week ago* (last edited 1 week ago)

[deleted]

theunknownmuncher@lemmy.world 1 point 1 week ago* (last edited 1 week ago)

"I don't think any of that is true. Show me data." (is shown data) "I won't accept that data!" Lol. Lmao even.

Yeah, I'm not going to play this game of trying to anticipate which numbers you're willing to accept and which you aren't. You have the same access to a search engine as I do. All of the results I have seen align with the numbers that Qwen released and are well within margins of error.

This model's release caused such a stir because it reproducibly meets or beats Claude Opus 4.5 while being locally runnable. If you won't believe it, okay, I don't care. 🤷

theunknownmuncher@lemmy.world 0 points 1 week ago* (last edited 1 week ago)

It's not like the Qwen team hasn't already built a lot of trust with the community. They've never been misleading with previous releases. The "marketing material" (🙄) is for a free product, so they have no incentive to lie, and lying would be extra stupid because anyone can run the benchmarks and verify their numbers independently anyway. What would be the point?