I like it.
You've listed mostly things that countries should do out of self-interest, without much need for international cooperation. (A little bit disanalogous with the UN SDGs.) This is fine, but I think there could also be some useful principles for international regulation of AI that countries could agree to in principle, to pave the way for cooperation even in an atmosphere of competitive rhetoric.
Under Develop Safe AI, it's possible "Alignment" should be broken down into a few chunks, though I'm not sure. There's a current paradigm called "alignment" that uses supervised finetuning + reinforcement learning on large models, where new reward functions, and new ways of leveraging human demonstrations/feedback, all have a family resemblance. And then there's everything else - philosophy of preferences, decision theory for alignment, non-RL alignment of LLMs, neuroscience of human preferences, speculative new architectures that don't fit in the current paradigm. Labels might just be something like "Alignment via finetuning" vs. "Other alignment."
Under Societal Resilience to Disruption, I think "Epistemic Security Measures" could be fleshed out more. The first thing that pops to mind is letting people tell whether some content or message is from an AI, and empowering people to filter for human content / messages. (Proposals range from legislation outlawing impersonating a human, to giving humans unique cryptographic identifiers, to something something blockchain something Sam Altman.)
But you might imagine more controversial and dangerous measures - like using your own AI propaganda-bot to try to combat all external AI propaganda-bots, or instituting censorship based on the content of the message and not just whether its sender is human (which could be a political power play, or mission creep trying to combat non-AI disinformation under the banner of "Epistemic Security," or because you expect AIs or AI-empowered adversaries to have human-verified accounts spreading their messages). I think the category I'm imagining (which may be different from the category you're imagining) might benefit from a more specific label like "Security from AI Manipulation."
Thanks for all this!
On international cooperation: I'm not sure exactly what kinds of principles you're thinking of - my first thoughts go to the UN Governing AI For Humanity Principles. If it's something like that, I'd say they are outside the scope of the proposal above for now. However, the Standard as imagined here does aim to meaningfully support international cooperation, despite appealing to things that countries should do out of self-interest:
(i) The Standard itself works as a high-level strategy for an international governance organisation. For example, an international governing body could suggest or prescribe some minimal objectives across the 3 goals or 9 subgoals for nations to achieve by a given time.
(ii) Even without an international governance organisation, the Standard could make collaboration easier if endorsed by multiple states: those states would share a common language, and their objectives would be more likely to overlap.
(iii) Even if only one nation (or even a city or a state) endorses the Standard, it includes international collaboration / cooperation as one of the possible actions under "preventing bad actors." This is in recognition of the fact that other people's TAI will impact your security, so endorsing the Standard should encourage an internationally-minded approach.
These could lead to the formulation of principles later. Not entirely sure if that addresses your point. Interested to hear more!
On alignment and societal resilience: Thanks! These are great points; I'll definitely look into expanding on these areas as you suggest. I'm leaning away from the more controversial mitigations like AI propaganda; the Standard aims to avoid controversy where possible and to focus on mitigations that are likely to be widely accepted as viable in principle (even if there are diverging opinions on specific details of implementation).
Greatly Transformative AI (TAI) will plausibly be created this decade. The U.S. is openly racing for AI dominance, rapidly expanding military use and explicitly evaluating frontier-model national-security risks—including cyber and CBRN capabilities. Investors and firms are pricing in large-scale deployment, while frontier labs are predicting substantial labor market disruptions. Taken together, the stakes are high across cybersecurity, biosecurity, scientific R&D, labor markets, and military systems, and governments, militaries, major companies, and AI labs are acting accordingly.
Governments and researchers need a clear, coherent overview of which urgent actions to prioritise and how they relate to each other, with contingencies for the success or failure of important projects with uncertain outcomes. For example, how should success or failure in creating robust frontier model evaluations determine subsequent governance actions?
We propose that an AI Governance Standard—analogous to the UN Sustainable Development Goals—can provide this strategic clarity. Drawing from a broad literature review of expert recommendations, we present an early iteration of such a standard: Three high-level goals and nine interrelated subgoals that together can constitute a theory of victory for TAI readiness.
National and international approaches to AI governance lack a shared understanding of how different governance objectives may depend on, conflict with, or support one another. Regulatory bodies face the huge burden of reconciling a vast body of interdisciplinary recommendations into a coherent high-level AI action plan. As a result, each government is forced to reinvent the wheel, making international AI regulations more likely to diverge and rendering international collaboration more difficult. This also creates compliance challenges for AI developers operating across jurisdictions, thereby hindering innovation.
Assuming short timelines are plausible, many jurisdictions may not be on track for TAI readiness. An AI governance standard gives us a benchmark by which to assess current progress towards TAI readiness in relevant jurisdictions and internationally. Poor assessments on this benchmark could motivate the targeted, urgent action required to prevent catastrophe.
The AI Governance Standard aims to develop a shared understanding of how to approach the diverse challenges involved, building on the substantial expert alignment that already exists around minimum governance standards. The Standard can provide a structured approach to prioritising interventions and offer governments a flexible yet actionable roadmap toward TAI readiness.
We detail the challenges a TAI Governance Standard addresses, illustrate what this standard could look like with an early example, discuss some challenges and limitations, and propose next steps.
AI governance entails a vast body of recommendations spanning a variety of often distinct domains, from technical alignment research, to incentivising lab reporting requirements, to creating international treaties on AI safety. There is very little work engaging directly with the problem of how to reconcile these recommendations into a single strategy. As a result, regulatory bodies and AI safety researchers are forced to grapple with this challenge on a case-by-case basis. There are at least four major challenges to achieving existential security that an AI Governance Standard would address:
It is likely that most or all governments are not on track to achieve success under short timelines. As of September 2025, efforts at TAI governance are largely nascent. Steps towards concrete governance in the US, arguably the most important jurisdiction for governing frontier AI, have been blocked (e.g. the repeal of the AI Executive Order and the veto of SB-1047), and the current US AI Action Plan emphasises accelerated AI development. The most robust AI governance legislation to date, the EU AI Act, is still facing delays and challenges. Meanwhile, the UN is establishing platforms for multilateral discussion, and a UK AI Bill is in development.
An AI governance standard can provide strong motivation for accelerated action on TAI readiness. Without a theory of victory (some detailed idea of what TAI readiness entails), it is difficult to say whether, for example, the EU AI Act is sufficient for TAI readiness. An AI Governance Standard would help identify when a regulatory framework is lagging behind, and detail exactly how it needs to be improved, by what year, and which viable expert recommendations may meet the need.
A standard should ultimately show how a given action relates to achieving the goal of existential security. It should also support analysis of how a given action relates to other prospective or ongoing actions, so that certain actions can be prioritized over others, or combined with others synergistically. One possible approach is to identify the set of broad, interrelated goals for TAI safety, such that achieving these goals would amount to TAI readiness, analogous to the Sustainable Development Goals for achieving peace and prosperity for people and the planet. All mitigations can then be thought of as in service to one or more of the TAI readiness goals and/or subgoals.
These assumptions were the starting point for developing a goals-based taxonomy of AI risk mitigations. The taxonomy aims to identify the broad goals of many different TAI Governance recommendations, to draw out how they relate to and overlap with each other.
In order to produce an illustrative first attempt:
The resulting groupings suggest 3 high-level goals for TAI risk mitigations:
Within these goals, the taxonomy suggests 9 interrelated subgoals for TAI Governance.
These goals and the taxonomy could constitute the beginnings of an AI Governance Standard and support addressing the challenges discussed above. A likely third element would be a more detailed mapping of mitigation interdependencies, which we outline in next steps.
View the full-sized image and associated sources here.
Arrows represent dependencies, where one element generally has significant implications for the success of the other. Boxes represent sets and subsets of goals.
To find detailed discussions of an intervention, consult the corresponding number in the Sources Table.
This framework allows a user to view most well-discussed interventions on an even plane and to decide which ones to select in order to meet their goals (e.g. creating regulatory visibility), in a way that fits their specific needs or context, drawing on the relevant expert-recommended interventions.
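To make the structure concrete, here is a minimal sketch (in Python, purely illustrative and not part of the proposal) of how the boxes, arrows, and sources in the diagram could be encoded as data. The goal and subgoal names are the few mentioned in this discussion (e.g. "Develop Safe AI", "Societal Resilience to Disruption"); the intervention names, dependency pairings, and source numbers are hypothetical placeholders, not the Standard's actual content.

```python
# Illustrative sketch only: goal/subgoal names are taken from the discussion
# above; interventions, dependencies, and source numbers are placeholders.

# Boxes: goals and the subgoals nested within them.
GOALS = {
    "Develop Safe AI": ["Alignment"],
    "Societal Resilience to Disruption": ["Epistemic Security Measures"],
}

# Interventions: each serves one or more subgoals and points to entries
# in the Sources Table (placeholder numbers here).
INTERVENTIONS = {
    "Frontier model evaluations": {"subgoals": ["Alignment"], "sources": [0]},
    "Content provenance / human verification": {
        "subgoals": ["Epistemic Security Measures"],
        "sources": [0],
    },
}

# Arrows: dependencies, where the first element has significant implications
# for the success of the second (illustrative pairing).
DEPENDENCIES = [
    ("Frontier model evaluations", "Epistemic Security Measures"),
]

def interventions_for(subgoal: str) -> list[str]:
    """List the interventions that serve a given subgoal."""
    return [name for name, info in INTERVENTIONS.items()
            if subgoal in info["subgoals"]]

if __name__ == "__main__":
    print(interventions_for("Epistemic Security Measures"))
    # ['Content provenance / human verification']
```

A mapping along these lines would let a user query which expert-recommended interventions serve a given subgoal, and trace dependency chains when prioritising, which is the kind of analysis the framework is meant to support.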
We aim to take these steps concurrently.