Anthropic already applied some CBRN filtering to Opus 4, with the intent of bringing it below Anthropic's ASL-3 CBRN threshold, but the model did not end up conclusively below that threshold. Anthropic looked into whether they could bring more capable future models below the ASL-3 Bio threshold using pretraining data filtering, and determined that it would require filtering out too much biology and chemistry knowledge. Jerry's comment is about nerfing the model to below the ASL-3 threshold even with tool use, which is a very low bar compared to frontier model capabilities. This doesn't necessarily apply to sabotage or the ability to circumvent monitoring, which depend on un-scaffolded capabilities.
Redwood Research did very similar experiments in 2022, but didn't publish them. They are briefly mentioned in this podcast: https://blog.redwoodresearch.org/p/the-inaugural-redwood-research-podcast.
In this range of code lengths (400-1800 lines), line count doesn't correlate with effort imo. It only takes like a day to write 1800 lines of code by hand. The actual effort is dominated by thinking of ideas and huge hyperparameter sweeps.
Another note: I was curious what they had to do to reduce torch startup time and such, and it turns out they spend 7 minutes compiling and warming up for their 2-minute training run lmao. That does make it more realistic but is a bit silly.
There's obviously no truth to that claim. Labs absolutely have better models.
Claude 4.5 Sonnet is relatively bad at vision for a frontier model. Gemini 3 Pro, GPT 5.2, and Claude 4.5 Opus are better.
Saying you would 2-box in Newcomb's problem doesn't make sense. By saying it aloud, you're directly causing anyone you can influence, and who would later have an opportunity to acausally cooperate with you, not to do so. If you believe acausal stuff is possible, you should always respond with "1-box" or "no comment", even if you would in reality 2-box.
I support a magically enforced 10+ year AGI ban. It's hard for me to concretely imagine a ban enforced by governments, because it's hard to disentangle what that counterfactual government would be like, but I support a good government-enforced AGI slowdown. I do like it when people shout doom from the rooftops though, because it's better for my beliefs to be closer to the global average, and the global discourse is extremely far from overshooting doominess.
Yeah, it goes out of its way to say the opposite, but if you know Nate and Eliezer, the book gives the impression that their p(doom)s are still extremely high. Responding to the authors' beliefs even when those aren't exactly the same as the text is sometimes correct, although not really in this case.
If you have a lump of 7,000 neurons, each one can connect to every other neuron, and you can spherical-cow approximate that as a 7000x7000 matrix multiplication. That matrix multiplication all happens within O(1) spikes, about 1/100 of a second. That's ~700 GFlop (2·7000³ ≈ 7×10¹¹ FLOP). An H100 GPU takes ~1 millisecond, i.e. ~1M clock cycles, to do that operation, so roughly 1M GPU cycles to approximate one brain spike cycle! And the GPU has 70B or whatever transistors, so it's more like 10M transistors per neuron!
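A quick back-of-envelope sketch of those numbers in Python, treating the lump as doing one full 7000x7000 matrix-matrix multiply per spike cycle, and assuming ballpark H100 figures (~7e14 FLOP/s dense throughput, ~1.8 GHz clock, ~70B transistors; my rough assumptions, not exact specs):

```python
# Back-of-envelope for the "7,000 neurons vs. H100" comparison above.
# Throughput, clock, and transistor counts are rough assumed figures.

n_neurons = 7_000
spike_period_s = 1 / 100                    # O(1) spikes happen within ~1/100 s

# Spherical-cow model: one 7000x7000 matrix-matrix multiply per spike cycle.
flops_per_spike_cycle = 2 * n_neurons**3    # ~6.9e11 FLOP, i.e. ~700 GFlop

h100_flops_per_s = 7e14                     # assumed dense throughput (~700 TFLOP/s)
h100_clock_hz = 1.8e9                       # assumed ~1.8 GHz clock
h100_transistors = 70e9                     # "70B or whatever"

gpu_time_s = flops_per_spike_cycle / h100_flops_per_s   # ~1e-3 s, i.e. ~1 ms
gpu_cycles = gpu_time_s * h100_clock_hz                  # ~2e6 clock cycles
transistors_per_neuron = h100_transistors / n_neurons    # ~1e7

print(f"{flops_per_spike_cycle:.1e} FLOP per spike cycle")        # ~6.9e+11
print(f"{gpu_time_s * 1e3:.1f} ms of H100 time per spike cycle")  # ~1.0 ms
print(f"{gpu_cycles:.1e} GPU clock cycles per spike cycle")       # ~1.8e+06
print(f"{transistors_per_neuron:.1e} transistors per neuron")     # ~1.0e+07
```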
I would guess that 96+% of code merged into lab codebases is read by a human, but I've heard some startups are <50%, and are intentionally accumulating tons of tech debt / bad code for short-term gain. People write lots of personal scripts with AI, which aren't in the merged/production code statistics; maybe that brings it down to ~85% human-read code in labs. There's also a wide range in how deeply you read and review code, where maybe only 50% of code that's read is fully understood.