In the previous post in this sequence I argued that Evolutionary Prisoner’s Dilemma (EPD) offers a useful model of the subject-matter of Scott Alexander’s Meditations on Moloch (MoM) - one that fits the details of that essay better than the standard interpretation of Moloch as the God of collective action problems, explains why the essay has seemed so insightful, and why a mythological framing makes sense.
In this post, I’ll consider the implications of this for the practical challenge of ‘defeating Moloch’ - addressing the civilizational dynamics that generate existential and catastrophic risks from nuclear arms races to paperclip maximisers.
Why Moloch Can’t be Defeated (on its own terms)
To start with, it’s worth understanding a strong sense in which Moloch-aka-EPD is invincible. In particular, the standard approaches to addressing collective action problems don’t work with EPD.
Why Social Preferences Won’t Work
One way to solve the standard (non-evolutionary) prisoner’s dilemma is through social preferences. Real people, it turns out, often don’t choose Defect in Prisoner’s Dilemma experiments played with real payoffs - instead, they choose to Cooperate because their actual utility function has an altruistic or social or fairness component not reflected in the payoff matrix (which, because it reflects real quantities such as money, need not reflect total utility).
On the standard interpretation of Moloch as the God of collective action problems like one-shot prisoner’s dilemmas, a way to defeat Moloch would be to spread social preferences: to foster a culture-change towards altruism and fairness, resulting in more cooperators in the population, and more one-shot collective action problems being solved.
But if we reframe Moloch as the God of EPD, this approach no longer works.
First of all, remember that EPD is a model in which the average expected payoff - and therefore relative fitness - of Cooperate is less than that of Defect, which means that Cooperate inevitably ‘spreads’ less than Defect. So spreading Cooperate through the culture is strictly impossible in the model.
OK, but we know spreading a culture of cooperation is not strictly speaking impossible in the actual world, you might say. The spread of religions like Christianity or Buddhism in their early stages might be good examples. And maybe the social preference method for defeating Moloch aims for something like that.
But this is where MoM comes back and says, sure: Moloch-aka-EPD is just an approximation of the actual world, like all mathematical models. Maybe sometimes it’s possible to get a burst of cooperation. But EPD is the long-term trend. At some point you run up against the limits of natural resources, or technological innovation enables new forms of Defect, and the dream-time is over. The default, long-term dynamics of EPD kick in, and Cooperate declines slowly to zero.
Note that a relevant aspect of the EPD model here is that the proportion of cooperators in the initial state does not change the direction of the subsequent dynamics: from any mixed starting point, Defect still has the higher fitness. So if you treat the temporary burst of cooperation as an exogenous shock to the system, the proportion of cooperators will still subsequently decline.
[Figure: The proportion of cooperators declines to zero irrespective of the initial state.]
It’s true that if the initial state is 100% Cooperate then according to EPD it can stay that way. But this implies that to be successful the social-preference-culture-change model has to somehow reach 100% of the population - hardly a realistic goal even for the most ambitious ‘cultural revolution’. (And even if it were somehow possible, this equilibrium would then still be vulnerable to a single defector that would start the ball rolling downhill again.)
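To make the point about initial states concrete, here is a minimal Python sketch of EPD. It assumes a well-mixed population and the standard two-strategy replicator equation, stepped forward with a simple Euler update. The payoff values (T=5, R=3, P=1, S=0), the step size and the function names are illustrative assumptions, not values taken from MoM.

```python
def replicator_step(x, R, S, T, P, dt=0.1):
    """One Euler step of the two-strategy replicator equation.

    x is the proportion of cooperators in a well-mixed population.
    """
    f_c = x * R + (1 - x) * S  # expected payoff (fitness) of Cooperate
    f_d = x * T + (1 - x) * P  # expected payoff (fitness) of Defect
    return x + dt * x * (1 - x) * (f_c - f_d)


def simulate(x0, payoffs, steps=2000, dt=0.1):
    """Run the dynamics from initial cooperator share x0 and return the final share."""
    x = x0
    for _ in range(steps):
        x = replicator_step(x, dt=dt, **payoffs)
    return x


# Assumed Prisoner's Dilemma payoffs satisfying T > R > P > S.
PD = dict(R=3, S=0, T=5, P=1)

for x0 in (0.1, 0.5, 0.99, 1.0):
    print(f"start={x0:.2f} -> final={simulate(x0, PD):.6f}")
```

Under these assumptions every mixed starting point ends up with essentially no cooperators, while a start of exactly 1.0 stays at 1.0. Trying `simulate(1 - 1e-6, PD)` illustrates the single-defector vulnerability: a tiny share of defectors is enough for the decline to begin.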
Why Changing the Payoffs Won’t Work
Another approach to solving the standard prisoner’s dilemma is changing the payoffs.
In standard PD, the payoffs are often represented by the letters R, P, T and S. If both players cooperate, they both receive the ‘reward’ R for cooperating. If both players defect, they both receive the ‘punishment’ P. If one defects while the other cooperates, the defector receives the ‘temptation’ payoff T, while the cooperator receives the ‘sucker's’ payoff, S. PD is then defined by the inequality T>R>P>S.
The classic example of changing the payoffs is having the mob-boss threaten to shoot those who defect - making for a significant reduction in the expected payoffs T and P.
More generally, governance mechanisms like taxes and credits can increase the payoffs for Cooperate and/or decrease the payoffs for Defect so that it’s no longer true that T>R>P>S.
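As a rough illustration of how such a mechanism breaks the inequality, the sketch below checks the PD condition before and after a penalty on defection. The baseline payoffs, the size of the penalty and the function names are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def is_prisoners_dilemma(R, S, T, P):
    """True when the payoffs satisfy the defining inequality T > R > P > S."""
    return T > R > P > S


def with_defection_penalty(R, S, T, P, penalty):
    """Subtract a penalty from both payoffs a defector can receive (T and P)."""
    return dict(R=R, S=S, T=T - penalty, P=P - penalty)


# Assumed baseline payoffs and an assumed penalty of 3 on defecting.
BASE = dict(R=3, S=0, T=5, P=1)
SHIFTED = with_defection_penalty(**BASE, penalty=3)

print(is_prisoners_dilemma(**BASE))     # baseline game is a PD
print(SHIFTED)                          # payoffs after the penalty
print(is_prisoners_dilemma(**SHIFTED))  # the inequality no longer holds
```

With these numbers the penalty pushes T below R and P below S, so Cooperate becomes the better reply whatever the other player does. A smaller penalty might leave the dilemma intact.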
In theory the same approach is available in EPD.
A key assumption of the model is of course that the interaction between individuals is defined by Prisoner’s Dilemma payoffs which map onto fitness.
And one can certainly imagine changing these payoffs, so that it is the cooperate strategy that is better at replicating.
But there’s a crucial practical difference between EPD and classic PD. PD models a specific collective action problem with a specific set of players. EPD models a whole system: an entire population and all the collective action problems arising from their interactions.
So it’s not enough to change the payoffs for a specific problem by means of, say, a bilateral nuclear disarmament treaty, or by improving governance at a specific lab. Changing the EPD payoffs means changing the whole system at once.
And this is no more realistic as a practical goal than reaching 100% cooperators in the social preference model.
A Dark God
Incidentally, this pessimism about defeating Moloch is very much implied in MoM. This is why Scott Alexander suggests that our only hope for defeating Moloch is an AI singularity that might actually have a chance of changing the system all at once.
“The opposite of a trap is a garden. The only way to avoid having all human values gradually ground down by optimization-competition is to install a Gardener over the entire universe who optimizes for human values.
And the whole point of Bostrom’s Superintelligence is that this is within our reach … the sheer speed of the cycle makes it possible that we will end up with one entity light-years ahead of the rest of civilization, so much so that it can suppress any competition – including competition for its title of most powerful entity – permanently. In the very near future, we are going to lift something to Heaven. It might be Moloch. But it might be something on our side. If it’s on our side, it can kill Moloch dead.”
This is not only further evidence that MoM is about EPD; it’s also an additional reason for thinking of EPD as a God in the first place. EPD is godlike in being basically omnipotent and impossible to defeat - except perhaps by another God.
The Goddess of Everything Else
While it may be impossible to defeat Moloch on its own terms - aside from salvation by superintelligence - one can still find a source of hope in the idea that Moloch-aka-EPD is inaccurate or at least incomplete as a model of civilizational dynamics.
If another God is required to transition from EPD to a better evolutionary game, maybe we don’t need to create such a God - maybe that God already exists.
This is the premise of Scott Alexander’s later microfiction The Goddess of Everything Else.
This mythological narrative portrays a divine conflict between the Goddess of Cancer and the eponymous goddess.
The Goddess of Cancer - whose catchphrase is ‘KILL CONSUME MULTIPLY CONQUER’ - is clearly a variant of Moloch, and an alternate incarnation of EPD. Her first act is the creation of biological life, “miniature monsters engaged in a war of all against all”, which - if her name wasn’t enough - makes clear the connection to evolutionary dynamics.
Her opponent, the Goddess of Everything Else, represents a dynamic of cooperation which fosters the diverse goods and activities that her opponent throws under the bus. Rather than oppose the Goddess of Cancer directly, however, she achieves this goal by redirecting the evolutionary dynamics of replication and selection:
“I say unto you, even multiplication itself when pursued with devotion will lead to my service”.
Examples of this include: the cooperation of cells in multicellular organisms, the cooperation of organisms in communities, pair-bonds and family units, and the cooperation of humans in trade, religion, industry and art - all of which provide fitness advantages that allow the cooperators to outcompete the defectors. The story ends with an optimistic vision in which humanity spreads over stars without number, “no longer driven to multiply, conquer and kill”.
The Goddess of Everything Else is therefore an excellent match for what you would get if you changed the payoff matrix in EPD such that R>T>S>P. This payoff structure is often called ‘Harmony’, so we can call the evolutionary model ‘Evolutionary Harmony’ (EH).
Here is the graph showing the proportion of cooperators against time for EH.
EH is essentially the inverse of EPD. Because the reward for cooperating (R) is greater than the temptation for defecting (T), and payoffs are linked to reproductive success in the same way as in EPD, cooperators outcompete defectors and over time dominate the population.
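The shape of the EH curve can be reproduced with a short Python sketch. As before, this assumes a well-mixed population and Euler-stepped replicator dynamics. The Harmony payoffs (R=5, T=3, S=1, P=0), the starting share of 1% cooperators and the function names are illustrative assumptions.

```python
def fitness_gap(x, R, S, T, P):
    """Expected payoff of Cooperate minus that of Defect when a share x cooperates."""
    return (x * R + (1 - x) * S) - (x * T + (1 - x) * P)


def trajectory(x0, payoffs, steps=200, dt=0.1):
    """Proportion of cooperators over time under Euler-stepped replicator dynamics."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + dt * x * (1 - x) * fitness_gap(x, **payoffs))
    return xs


# Assumed Harmony payoffs satisfying R > T > S > P.
HARMONY = dict(R=5, S=1, T=3, P=0)

xs = trajectory(0.01, HARMONY)
for step in range(0, 201, 40):
    print(f"t={step * 0.1:5.1f}  cooperators={xs[step]:.4f}")
```

Because R>T and S>P, the fitness gap is positive at every population mix, so the cooperator share can only rise. With these assumed numbers it climbs from 1% to nearly 100% within the simulated window.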
Practical implications
We’ve seen that, if Moloch-aka-EPD is a fundamental model of civilizational dynamics, the main practical implication is that we need AGI to save us.
But if, on the other hand, Moloch is best understood as a partial model, such that the opposite Goddess dynamics also exist, what practical implications should we draw?
The overall picture here is that global systems can be modelled by evolutionary game theory, along the lines of EPD, but that payoffs can vary between different subsystems.
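One very rough way to picture this is to give each subsystem its own payoffs and ask which strategy, if any, is strictly dominant there. In the sketch below the subsystem names and payoff values are entirely hypothetical, and the subsystems are treated as independent of one another.

```python
def dominant_strategy(R, S, T, P):
    """Return the strictly dominant strategy of a symmetric 2x2 game, or None."""
    if R > T and S > P:
        return "Cooperate"  # Goddess-style dynamics: cooperators have higher fitness
    if T > R and P > S:
        return "Defect"     # Molochian dynamics: defectors have higher fitness
    return None             # neither strategy dominates (outside EPD and EH)


# Hypothetical subsystems with made-up payoffs, purely for illustration.
subsystems = {
    "subsystem_a": dict(R=3, S=0, T=5, P=1),  # Prisoner's Dilemma payoffs
    "subsystem_b": dict(R=5, S=1, T=3, P=0),  # Harmony payoffs
}

regimes = {name: dominant_strategy(**p) for name, p in subsystems.items()}
print(regimes)
```

This only labels each subsystem by its local dynamics. It deliberately says nothing about how subsystems influence one another.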
It remains true that the standard methods of solving coordination problems will have limited effectiveness against Molochian dynamics. But these methods can now be recast as ways of supporting Goddess dynamics.
The key takeaway, relative to standard ways of thinking about collective action problems, is that it’s important to not only address specific or local problems, but to aim for actions that serve to augment the evolutionary fitness of cooperative individuals and organisations.
Efforts to shift culture towards social preferences can indeed be part of the solution, and the Moloch-v-Goddess framing points especially towards shifts in values and behaviours that allow individuals and organisations to outcompete their less social neighbours.
Likewise, efforts to change payoffs in particular areas through governance mechanisms that adjust rewards and penalties are especially desirable, on this framing, where these mechanisms themselves lend themselves to replication across the wider system, by increasing the fitness of the individuals and organisations being governed.
Actions of either of these two kinds could be framed as being on the side of the Goddess, against Moloch.
The Concept of Practical Implications: Strategic vs Tactical
It’s also worth making some points about the very concept of ‘practical implications’ here.
In evolutionary game theory, and theoretical biology more generally, it is common to distinguish highly simplified, general models, from those that are more detailed and specific to a particular environment[1].
And it’s also common to conclude that both kinds of models have their place.
While simple models don’t have the predictive accuracy of more detailed models, they have the advantage that one is able to peer through the black box and fully understand the dynamics, and these dynamics apply, at least approximately, across a broad range of specific scenarios.
Both EPD and EH are extremely simple models - just like the non-evolutionary models of collective action problems normally associated with Moloch - but we shouldn’t hold that against them. While they’re certainly not precise representations of the actual world, they may still identify the approximate shape of very broad, global dynamics.
With regard to practical implications, simpler models like EPD and EH are said to be strategic, rather than tactical.
They lack the detail of specific environments that would be required when making tactical decisions around the governance structure of a specific AI lab, or a culture-change initiative in a specific government department.
But they do provide a strategic framework for understanding such decisions: for example, whether they are likely to have a wider systemic impact because they are replicable, as opposed to ‘winning the battle but losing the war’[2].
Closing thoughts
The strategic/tactical distinction is really a matter of degree: while EPD and EH are more complex than the standard prisoner’s dilemma, they are still less detailed than other models within evolutionary game theory that would still be considered strategic rather than tactical.
This suggests an interesting range of questions about how the EPD and EH models could be made more detailed, while still retaining the generality of a strategic model - as well as the question of whether and how they can be developed into fully tactical models.
In particular, from a modelling perspective, there’s one fairly obvious weakness of the Moloch vs the Goddess framing we’ve explored so far, which is that it involves two entirely separate models - meaning it says effectively nothing about how these two dynamics interact.
And from a mythopoetic perspective, this makes the resulting worldview ultimately dualistic or Manichaean in its vision of two warring deities.
There’s a certain attraction to this worldview. There’s an acceptance of the power of both light and darkness, and a refusal of the comforting idea of the inevitable victory of the Good.
But as a matter of ultimate existential meaning, it’s natural to want to understand, not just whose side we are on, but which side is winning. Are the odds ultimately stacked in favour of Moloch or the Goddess?
To answer these questions it is natural to look at the more detailed elaborations of EPD and similar models that have been explored in evolutionary game theory in recent decades, and which can be seen as integrating Moloch and the Goddess into a single model. I’ll turn to these in my next post.
[1] The classic formulations of this are in Holling (1966) and Levins (1966). More recent discussion includes Do simple models lead to generality in ecology? (2013).
[2] Discussions of Moloch such as this can therefore be thought of as part of the ‘strategy’ area within fields such as AI governance. See Metacrisis as a Framework for AI governance for a related perspective.