Against Moloch

Monday AI Radar #24

Two thresholds loom on the horizon, with only a brief window of opportunity to prepare for each. On the technical front, it is plausible that we might see full automation of AI R&D this decade. Capabilities will move fast once that happens: our best chance for a good outcome is...

May 6

Monday AI Radar #23

If you pay close attention to this newsletter, you’ll notice that something is missing. Anthropic and OpenAI are everywhere, but Google DeepMind is largely absent. We have a profile of Demis Hassabis, an also-ran mention in Prinz’s review of the race to RSI, and some complaints about Gemini’s character from...

Apr 28

Monday AI Radar #22

How are we doing on solving the alignment problem? Harry Law begins this week’s newsletter with an explanation of alignment-by-default: the idea that because LLMs are trained on an immense body of human text, they are predisposed to understand and pursue human values. But predisposition isn’t enough: Ryan Greenblatt argues...

Apr 21

Who I Follow

I spend several hours a day trying to keep up with what’s going on in the parts of AI that I’m interested in. It’s a ridiculous amount of work: I don’t recommend it unless you’re doing something silly like writing a newsletter about AI. But if you’d like to keep...

Apr 19

Don't Cut Yourself on the Jagged Frontier

(With apologies to Sean Herrington, who deserves a better playwright than yours truly) A conversation with a friend on the bus to Bodega Bay today made me realize that there are some holes in my thinking about safety and superintelligence. I’ve assumed that superintelligence is by definition robustly better than...

Apr 18

Monday AI Radar #21

This week’s big story is the limited release of Claude Mythos Preview. The headline is that Mythos is alarmingly good at cybersecurity, with the ability to find and exploit critical vulnerabilities en masse. Anthropic is handling that responsibly, but the next year or two will be challenging for security. If...

Apr 13

Quick Thoughts About Mythos

I expect it’ll take another week or two for everyone to fully digest the significance of Claude Mythos Preview. In the meantime, here are my initial thoughts. Gradually, then suddenly: Mythos is radically better at cyber than any previous model: It isn’t the first model that can find vulnerabilities, of...

Apr 11

Against Moloch

Ads, Incentives, and Destiny

Don't Cut Yourself on the Jagged Frontier

Writing With Robots

Foundational Beliefs
