The goal of this post is to collect feedback on a new project idea. This work would be a collaboration between the AI Objectives Institute (AOI), MetaGov and other partner institutions.
Project PIs: Bruno Marnette (AOI) and Philipp Zahn (MetaGov, 20squares).
Special thanks to Colleen McKenzie, Matija Franklin, Timothy Telleen-Lawton, Gaia Dempsey, Ping Yee, Justin Stimatze, Cleo Nardo, Tushant Jha, Deger Turan and others for feedback and review.
Humans routinely model the incentives of other humans. Historians examine the incentives of powerful individuals to explain why they made specific decisions. Journalists investigate the incentives of large companies to uncover collusion and conflicts of interest. Economists use incentive models to understand markets. Lawmakers study existing incentives before introducing new ones.
There is, however, a natural limit...