Critch has written more recently on this kind of moral reasoning in Schelling goodness.
I regard this theory as currently only a promissory note. If davidad wants many others to share his AI optimism for the same reasons, someone will have to rapidly develop and share the missing details, such as:
Fortunately, AI can help with the task of developing the theory, promulgating it, and helping students internalize it in an unprecedentedly short timeframe. However, given the current capabilities and inclinations of AIs, this project requires a very high level of careful human oversight. A typical AI-generated manifesto on the theory of everything will be of no marginal value here!
Unfortunately, I'm not optimistic enough about the theory to help out.
Introduction to davidad and today's topics
Robustness of human values & metaethics
Metaphysics (Tegmark IV, Wolfram's Ruliad, Computational Universe)
On distrusting weird metaphysics
Differences in views compared to MIRI/LW
How large is the initial basin that can converge to the Natural Abstraction of Good?
Critiquing Claude's Constitution
Shaping LLM values through writing - how and who
What the Natural Abstraction of Good might actually be
Tradeoffs—why cost-benefit analysis now seems to favor RSI acceleration, even though the risk is still unacceptably high
Most salient risk-reduction pathways
Technical work vs Nation-state governance