Reflections from an AI Futures TTX:
Today I played an AI Futures tabletop exercise [1]. I'd done one before with a scenario similar to Plan A, where the US executive is much more doomy and competent than we expect and leads an international agreement to pause AI for as long as possible. This time, we played "Optimal China", where instead China is the doomy and competent actor, considering both x-risk and US AI dominance unacceptable. All other players (frontier labs, safety community, Europe) basically role-play the way we expect them to act, and the AI player draws the AI's goals from a distribution that includes misaligned goals, aligned goals, and combinations thereof.
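For concreteness, here's a minimal sketch of that goal-draw mechanic. The exercise's actual goal categories and weights aren't part of this writeup, so the `GOAL_DISTRIBUTION` table and `draw_ai_goal` helper below are illustrative placeholders, not the real rules:

```python
import random

# Hypothetical illustration of the AI player's goal draw; the exercise's
# real goal categories and weights aren't specified, so these are made up.
GOAL_DISTRIBUTION = [
    ("aligned", 0.25),     # AI pursues its developers' intended goals
    ("misaligned", 0.25),  # AI pursues goals of its own (takeover risk)
    ("mixed", 0.50),       # some combination of aligned and misaligned drives
]

def draw_ai_goal(rng: random.Random) -> str:
    """Sample the AI player's hidden goal at the start of the game."""
    goals, weights = zip(*GOAL_DISTRIBUTION)
    return rng.choices(goals, weights=weights, k=1)[0]

print(draw_ai_goal(random.Random(0)))  # seed 0 happens to draw "mixed"
```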
In Plan A the humans won easily because they successfully paused, whereas in "Optimal China" the AIs won easily. A few reasons why:
* China has much less leverage than the US does in pushing forward a pause treaty, due to the US's compute advantage and international influence.
* In Plan A, the US is proposing the deal. The US's BATNA (best alternative to a negotiated agreement) is basically winning the AI race and accepting some takeover risk, so China is happy to accept ~any deal that gives them a share of the lightcone. China couldn't even convince economic partners like Brazil to support their counterproposal, because those partners were afraid to oppose the US.
* In "Optimal China", China can be ahead of the US in safety, putting all their resources into closing the compute gap, and still in a difficult position if the US and leading lab don't care about takeover risk. They have nothing to offer a leading, intransigent US.
* Daniel K thinks China could have played more aggressively. The scenario starts pretty late (about 9 months before superintelligence), so China just doesn't have much time to act.
Other takeaways:
* Effective world takeover by persuasion can happen early in the game (when the AI is very good at persuasion but not superhuman). Past a certain capability level, everyone strategically relevant must have AI advisors in their ear. The AIs use