this post was submitted on 11 Apr 2026
Videos
In their paper, they post keys that can be verified once the vulnerabilities are patched (so they aren't just revealing exploitable issues to the world). In the few cases they did demonstrate (ones that were quickly patched), the model showed a pretty sophisticated ability to find and exploit multiple vulnerabilities. The patches that you saw them mention are a direct result of Anthropic reporting those vulnerabilities.
The method they talk about basically means they weren't looking at old, patched code (where the model could have found mentions of the vulnerability that others had already posted on the web) but at current, actively used software. The vulnerabilities and exploits that the model found were novel zero-days (meaning problems as yet 'undiscovered' by the person/people being attacked).
I'm not a researcher though, so someone can correct any information I've gotten wrong here, but this is definitely not solely hype. It's not exciting stuff (unless you just look at headlines) but the vulnerabilities they discovered are like actual problems, especially if a model like this gets into the hands of bad actors.
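For anyone wondering how "keys that can be verified later" could work in practice: I don't know the exact scheme from the paper, so treat this as a rough sketch under my own assumption that the keys are hash commitments. The idea is to publish only a digest of the private report now, then reveal the report and a random nonce after the patch so anyone can recompute the digest. The function names `commit` and `verify` are made up for illustration.

```python
import hashlib
import hmac
import secrets


def commit(report: bytes) -> tuple[str, bytes]:
    """Return (public_digest, secret_nonce) for a private report.

    Assumption: a plain SHA-256 commitment; the real scheme may differ.
    """
    # The random nonce stops people from brute-forcing short reports.
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + report).hexdigest()
    return digest, nonce


def verify(digest: str, nonce: bytes, report: bytes) -> bool:
    """Check that a revealed report and nonce match the published digest."""
    expected = hashlib.sha256(nonce + report).hexdigest()
    return hmac.compare_digest(expected, digest)


if __name__ == "__main__":
    # Hypothetical report text, only for demonstration.
    report = b"example vulnerability write-up"
    digest, nonce = commit(report)      # digest is published today
    print(verify(digest, nonce, report))            # revealed after the patch
    print(verify(digest, nonce, b"edited report"))  # a changed report fails
```

The point is that the digest gives nothing usable to an attacker, but it pins down what the authors knew at publication time.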
Ah thanks, I didn't find their paper, but you led me on the right path to some nice info on their blog! Great idea with the keys; it's good that we will at least be able to verify in the future whether their claims are true. The bugs that were already fixed did indeed seem cool, but they wrote the blog in a slightly odd way, and I didn't find confirmation that those were also zero-day vulnerabilities. Either way, we should get plenty of confirmation with the keys. Thanks for the details!