Alex Amadori's Shortform

Alex Amadori


I’m glad that my colleagues at ControlAI, Andrea and Gabe, and I managed to publish our posts (The Three Filters: Why Almost Every Plan to Survive ASI Fails Miserably, and Anthropic did not call for a pause) just a few hours before Dario Amodei published his essay Policy on the AI Exponential, because they are quite topical.

Copy-pasting my response thread on X, which looks at the essay through the lens of The Three Filters: Why Almost Every Plan to Survive ASI Fails Miserably:

This does not address catastrophic risks, and fails all three checks for a plan to address catastrophic risks.

Development, not deployment, of powerful AI needs to be restricted at a global level if we are to survive ASI.

The essay gestures vaguely at loss-of-control and seizures of power, but it's not enough. And it doesn't address the risk of war at all.

What do you think will happen if, as Dario suggests, a coalition of democracies bands together to get the lead on ASI? We will still be in a situation where everyone has to cut corners on safety to win the race, and we likely end up with an ASI killing everyone.

And whoever is losing still has a reason to attack preemptively with all their military might before the coalition gets a decisive advantage.

With respect to my colleagues Andrea and Gabe's post, Anthropic did not call for a pause, here’s a good comment from Nate Soares that I think highlights how AI companies are heavily hedging to play both sides of the PR game:

In contrast with the last Anthropic blog post, Dario's new one is back to softpedaling: five big subsections about "positive impact" and "securing leadership by democracies", with one throwaway line on "loss of control of AI systems" buried deep.

Could you explain why it is softpedaling with ONE throwaway line on loss of control? Amodei called for this:

Amodei on audits of AI systems

However, now the risks are clearly here. It is time to go beyond transparency to more serious and binding regulation of AI. I believe the best analogy, at least at the current stage of the exponential, is to cars, airplanes, or drugs—powerful technologies essential to the modern economy, but capable of killing large numbers of people if designed or operated poorly. I therefore believe we should model AI regulation on agencies like the Federal Aviation Administration (FAA). Frontier AI models, like airplanes, should be required to go through technical testing and auditing, and their release should be blocked or reversed as a threat to public safety if they do not meet high standards of safety. I am grateful to see the Trump administration’s Executive Order move incrementally towards a greater role for government in AI, though Anthropic’s proposal recommends even further action. Our proposal includes the following elements:

Models above a threshold of compute should undergo mandatory testing by a qualified third party for their level of risk in four specific areas: cybersecurity, biological weapons, loss of control of AI systems, and automated R&D that could accelerate these other risks.
The government should have the power to block or deter deployment of the model if it is determined, in light of third-party assessment, to present unacceptable risks. This power must be scoped to the above four specific risks and there must be protective measures against political favoritism or arbitrary decisions.
Third-party evaluation could be done by a government agency (similar to the FAA) or a set of private organizations that are authorized and inspected by the government to evaluate models according to certain standards (a “regulatory markets” approach).
AI companies that develop advanced AI models must have strong security standards that protect their model weights, should conduct regular red teaming and penetration testing, and should work with the government to defend against major threat actors.
Safety incidents in the four critical areas must be reported promptly.

What did Amodei miss, except for the ability of internally deployed models to follow Agent-4's path from AI-2027?

Responding really quickly; I only have time for a couple of thoughts rather than a well-thought-out answer.

Well, for one, he missed "the ability of internally deployed models to follow Agent-4's path from AI-2027".

He also completely fails to address filters one and three from the post The Three Filters: Why Almost Every Plan to Survive ASI Fails Miserably, which are:

If any adversarial country can't get assurance that you are not building ASI, they will start an existential war with you.
Even if Dario's plan succeeds on its own terms, it is about creating a singleton, and I don't fucking trust him to create a singleton (nor do I trust any group of countries to do so, in the current state of the world).