This is satire.
It is intended to draw attention to the absurd situation the world is in. By the estimates of the most skilled forecasters on Earth, there is a 10% chance of superintelligence within the next 900 days. There is no plan that strictly dominates Plan 'Straya. Most plans do not address the situation with the honesty that Plan 'Straya does.
I advise the world to update on this.
Plan 'Straya: A Comprehensive Alignment Strategy
Version 0.3 — DRAFT — Not For Distribution Outside The Pub
Epistemic status: High confidence, low evidence. Consistent with community norms.
Executive Summary
Existing alignment proposals suffer from a shared flaw: they assume you can solve the control problem before the catastrophe. Plan 'Straya boldly inverts this. We propose achieving alignment the way humanity has historically achieved most of its moral progress — by first making every possible mistake, losing nearly everything, and then writing a strongly-worded resolution about it afterward.
The plan proceeds in three rigorously defined phases.
Phase 1: Anticorruption Measures (Kinetic)
The scholarly literature on AI governance emphasises that institutional integrity is a prerequisite for safe deployment. We agree. Where we diverge from the mainstream is on methodology.
Most proposals suggest "regulatory frameworks" and "oversight bodies." The NIST AI Risk Management Framework provides a voluntary set of guidelines that organisations may choose to follow, partially follow, or simply reference in press releases. The EU AI Act classifies systems into risk tiers with the quiet confidence of a taxonomy that will be obsolete before its implementing regulations are finalised. The Frontier Model Forum, meanwhile, brings together the leading AI laboratories in a spirit of cooperative self-governance, a phrase which here means "a shared Google Doc and quarterly meetings in San Francisco."
These approaches share a well-documented failure mode: the people staffing them are, in technical terms, politicians. Plan 'Straya addresses this via what we call "a vigorous personnel restructuring of the Australian federal and state governments," targeting specifically those members identified as corrupt.
We acknowledge that the identification mechanism — determining which officials are corrupt — is itself an alignment problem. Specifically, it requires specifying a value function ("not corrupt"), building a classifier with acceptable false-positive and false-negative rates, and then acting on the classifier's outputs in conditions of uncertainty. We consider it elegant that Plan 'Straya encounters the alignment problem immediately in Phase 1. Most plans do not encounter it until much later, by which point they have accumulated too much momentum to stop.
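For concreteness, here is a minimal sketch of what such a classifier might look like, in hypothetical Python; the `corruption_score` function and the threshold value are assumptions of the sketch, and the former is, as noted above, future work.

```python
import random

def corruption_score(official: str) -> float:
    """Hypothetical value function ("not corrupt"), scored in [0, 1].
    Left for future work; this placeholder uses the electorate's
    time-honoured method, which is roughly a coin flip."""
    return random.random()

def flag_for_restructuring(officials: list[str], threshold: float = 0.5) -> list[str]:
    """Act on the classifier's outputs under uncertainty.

    The threshold trades false positives (vigorously restructuring the
    innocent) against false negatives (leaving the corrupt in office).
    Plan 'Straya does not say how to set it, which is to say Phase 1
    already contains the alignment problem.
    """
    return [o for o in officials if corruption_score(o) > threshold]

# Illustrative run over a parliament-sized population.
parliament = [f"member_{i}" for i in range(151)]
flagged = flag_for_restructuring(parliament)
print(f"{len(flagged)} of {len(parliament)} members flagged for vigorous restructuring")
```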
The identification problem is left for future work. We note only that the Australian electorate has historically demonstrated strong intuitions here, typically expressed in language not suitable for an academic paper.
Several objections arise immediately:
Q: Isn't this wildly illegal? A: Yes. However, we note that Plan 'Straya is an alignment plan, and alignment researchers have a proud tradition of ignoring implementation details that fall outside their core model. We further note that our plan requires violating the law of exactly one (1) country, which compares favourably with proposals that require the voluntary cooperation of every major world government simultaneously.
Q: Who decides who's corrupt? A: See above. Future work.
Q: Why Australia specifically? A: Strategic considerations developed in Phase 3. Also, the authors are partial.
Phase 2: Strategic Thermonuclear Exchange (Blame-Shifted)
With the Australian government now staffed exclusively by the non-corrupt (estimated remaining headcount: 4–7 people), we proceed to the centrepiece of the plan.
A nuclear exchange is initiated between the major global powers. The specific mechanism is unimportant — the alignment literature assures us that if you specify the objective function clearly enough, the details sort themselves out.
Critically, the exchange is attributed to a misaligned AI system. This is the key technical contribution of Plan 'Straya. We observe that the blame-shift serves a vital pedagogical function: post-exchange, the surviving population will possess an empirically grounded motivation to take alignment seriously, as opposed to the current approach of posting on LessWrong and hoping.
Projected casualties: Most of them. (95% CI: 7.4–8.1 billion, assuming standard nuclear winter models and the usual optimistic assumptions about agricultural resilience that defence planners have been making since the 1960s.)
Ethical review status: We submitted this to an IRB. The IRB building is in Phase 2's blast radius. We consider this a self-resolving conflict of interest.
Relationship to the Pause Debate
We are aware of ongoing discourse regarding whether AI development should be paused, slowed, or accelerated. Plan 'Straya offers a synthesis: development is permanently paused for approximately 99.7% of the global population, while being radically accelerated for the survivors. We believe this resolves the debate, or at minimum relocates it to a jurisdiction with fewer participants.
The e/acc community will note that Phase 2 constitutes the most aggressive possible acceleration of selection pressure. The pause community will note that it constitutes an extremely effective pause. We are proud to offer something for everyone.1
Phase 3: Civilisational Rebuild (The 'Straya Bit)
Australia survives for reasons that are approximately strategic and approximately vibes-based:
Cultural-Theoretic Factors
We propose that several features of Australian culture, typically dismissed as informality or apathy, are in fact alignment-relevant heuristics:
"She'll be right" (Corrigibility Condition). We define the She'll Be Right Principle (SBRP) as follows: given an agent A operating under uncertainty U, SBRP states that A should maintain default behaviour unless presented with overwhelming and undeniable evidence of catastrophic failure, at which point A should mutter "yeah nah" and make a minimal corrective adjustment. This is formally equivalent to a high-threshold corrigibility condition with lazy evaluation. It compares favourably with proposals requiring perpetual responsiveness to correction, which, as any Australian will tell you, is not how anything actually works.
"Tall Poppy Syndrome" (Capability Control). Any agent that becomes significantly more capable than its peers is subject to systematic social penalties until capability parity is restored. This is the only capability-control mechanism in the literature empirically tested at civilisational scale for over two centuries. Its principal limitation is that it also penalises competence, which we acknowledge is a significant alignment tax but may be acceptable given the alternative.
The Reconstruction
The surviving Australian parliamentarians (now 3–6, following a disagreement over water rights in the Murray-Darling Basin, which we note predates and will outlast the apocalypse) oversee civilisational reconstruction. Their first act is to build an aligned superintelligence.
"But how?" the reader asks.
We respond: they will have learned from the experience. Approximately 7.9 billion people will have died demonstrating that unaligned AI is dangerous. This constitutes a very large training dataset. We apply the scaling hypothesis — the same one capabilities researchers use to justify training runs — but to warnings rather than parameters: surely if you make the warning big enough, somebody will listen.
The aligned superintelligence is then constructed using the lessons so acquired; the formal details, such as they are, appear in Appendix A.
Comparison With Existing Proposals
Discussion
The authors recognise that Plan 'Straya has certain limitations. It is, for instance, a terrible plan. We stress, however, that it is terrible in a transparent way, which we argue is an improvement over plans that are terrible in ways that only become apparent when you read the fine print.
Most alignment proposals contain a step that, if you squint, reads: "and then something sufficiently good happens." Plan 'Straya merely makes this step legible. Our "something sufficiently good" is: nearly everyone dies, and then Australians figure it out. We contend this is no less plausible than "we will solve interpretability before capabilities researchers make it irrelevant," but has the advantage of fitting on a napkin.
We further observe that writing satirical alignment plans is itself a species of the problem being satirised — more entertaining than doing alignment research, requiring less mathematical ability, and producing a warm feeling of intellectual superiority at considerably lower cost. We flag this as evidence that the alignment community's incentive landscape may have failure modes beyond those typically discussed.
Conclusion
Plan 'Straya does not solve the alignment problem. It does, however, solve the meta-alignment problem of people not taking alignment seriously enough, via the mechanism of killing almost all of them. The survivors will, we feel confident, be extremely motivated.
She'll be right.
Appendix A: Formal Model
Let H denote humanity, A denote an aligned superintelligence, and K denote the subset of H that survives Phase 2 (|K| ≈ 300 million, predominantly Australasian).
We define the alignment function f : K × L → A, where L denotes the set of lessons learned from the extinction of H \ K.
Theorem 1. If |L| is sufficiently large, then f(K, L) = A.
Proof. We assume the result. ∎
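For readers who prefer their assumed results machine-checked, a Lean 4 sketch of the model follows. Every declaration below is an axiom of convenience (an assumption of the sketch) rather than an established definition, and the proof term is, faithfully to the above, left as `sorry`.

```lean
-- A minimal Lean 4 sketch of Appendix A. All types and predicates below
-- are axioms of convenience (assumptions), not established definitions.
axiom Survivors : Type        -- K, |K| ≈ 300 million, predominantly Australasian
axiom Lessons : Type          -- L, lessons learned from the extinction of H \ K
axiom ASI : Type              -- candidate superintelligences
axiom Aligned : ASI → Prop    -- the property we would like to hold

-- The alignment function f : K × L → A.
axiom f : Survivors → Lessons → ASI

-- "Sufficiently large" is doing a lot of work; we axiomatise it too.
axiom SufficientlyLarge : Lessons → Prop

-- Theorem 1. If |L| is sufficiently large, then f(K, L) is aligned.
theorem theorem_one (K : Survivors) (L : Lessons)
    (h : SufficientlyLarge L) : Aligned (f K L) := by
  sorry  -- "We assume the result."
```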
The authors declare no conflicts of interest, partly because most interested parties are projected casualties.
Submitted for peer review. Peer availability may be limited by Phase 2.
Footnotes