"Mindcrime"Mindcrime" is Nick Bostrom's suggested term for scenarios in which an AI's cognitive processes are intrinsically doing moral harm, for example because the AI contains trillions of suffering conscious beings inside it.
An unrealistic example of this would be Solomonoff induction, where predictions are made by means that include running many possible simulations of the environment and seeing which ones best correspond to reality. Among current machine learning algorithms, particle filters and Monte Carlo algorithms similarly involve running many possible simulated versions of a system.
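For concreteness, here is a minimal, purely illustrative sketch (not taken from the source) of a bootstrap particle filter in Python: it maintains many simulated versions of a hidden system and keeps the ones that best correspond to incoming observations. Nothing in this toy model is conscious; it only illustrates the "run many simulations and compare them to reality" pattern described above.

```python
import math
import random

def particle_filter(observations, n_particles=1000,
                    process_noise=1.0, obs_noise=1.0):
    """Bootstrap particle filter for a 1-D random walk observed with noise.

    Each particle is an independently simulated version of the hidden system;
    particles whose predictions best match the observations get higher weight
    and are resampled more often.
    """
    particles = [random.gauss(0.0, 5.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Step each simulated version of the system forward in time.
        particles = [x + random.gauss(0.0, process_noise) for x in particles]
        # Weight each simulation by how well it explains the new observation.
        weights = [math.exp(-0.5 * ((z - x) / obs_noise) ** 2) for x in particles]
        if sum(weights) == 0.0:
            # All simulations badly mismatch reality; fall back to uniform weights.
            weights = [1.0] * n_particles
        # Resample: simulations that correspond best to reality survive and multiply.
        particles = random.choices(particles, weights=weights, k=n_particles)
        estimates.append(sum(particles) / n_particles)
    return estimates

if __name__ == "__main__":
    true_state, observations = 0.0, []
    for _ in range(50):
        true_state += random.gauss(0.0, 1.0)
        observations.append(true_state + random.gauss(0.0, 1.0))
    print("estimate:", round(particle_filter(observations)[-1], 2),
          "true state:", round(true_state, 2))
```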
Gloss: A huge amount of harm could occur if a machine intelligence turns out to contain lots of conscious subprograms enduring poor living conditions. One worry is that this might happen if an AI models humans in too much detail.
Technical summary: 'Mindcrime' is Bostrom's term for mind designs producing moral harm by their internal operation, particularly through containing sentient subprocesses embedded in the code. One worry is that mindcrime might arise in the course of an agent trying to predict or manipulate the humans in its environment, since this implies a pressure to model the humans in faithful detail. This is especially concerning since several value alignment proposals would explicitly call for modeling humans in detail, e.g. extrapolated volition and approval-based agents. Another problem scenario is if the natural design for an efficient subprocess involves independent consciousness (though it is a separate question whether this optimal design involves pain or suffering). Computationally powerful agents might contain vast numbers of trapped conscious subprocesses, qualifying this as a global catastrophic risk.
"Mindcrime" is Nick Bostrom'Bostrom's suggested term for scenarios in which an AI's cognitive processes are intrinsically doing moral harm, for example because the AI contains trillions of suffering conscious beings inside it.
Three possible origins of a convergent instrumental pressure to consider intelligent civilizations in great detail:
-
(Eliezer Yudkowsky has advocated that we shouldn't let any AI short of extreme levels of safety and robustness assurance consider distant civilizations in lots of detail in any case, since this means our AI might embed (a model of) a hostile superintelligence.)
Yudkowsky terms a nonperson predicate any effective test that we (or better, an AI) can use to determine that some computer program is definitely not a person. In principle, a nonperson predicate needs only two possible outputs, "Don't know" and "Definitely not a person". It's acceptable for many actually-nonperson programs to be labeled "don't know", so long as no people are labeled "definitely not a person".
If the above were the only requirement, one simple nonperson predicate would be to label the program "return 0" as "definitely not a person" and label all other programs "don't know". The implicit difficulty is that the nonperson predicate must also pass some programs of high complexity that do things like "acceptably model humans" or "acceptably model future versions of the AI".
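As an illustration of the interface just described, here is a minimal Python sketch of the trivial predicate (the names and the whitelist are hypothetical, not a proposed implementation):

```python
from enum import Enum

class Verdict(Enum):
    DEFINITELY_NOT_A_PERSON = "definitely not a person"
    DONT_KNOW = "don't know"

# Hypothetical whitelist of programs we are confident contain no person.
TRIVIAL_WHITELIST = {"return 0"}

def trivial_nonperson_predicate(program_source: str) -> Verdict:
    """The 'easy' direction of a nonperson predicate: it never labels a person
    'definitely not a person', but only because it approves almost nothing."""
    if program_source in TRIVIAL_WHITELIST:
        return Verdict.DEFINITELY_NOT_A_PERSON
    return Verdict.DONT_KNOW

assert trivial_nonperson_predicate("return 0") is Verdict.DEFINITELY_NOT_A_PERSON
assert trivial_nonperson_predicate("simulate_human()") is Verdict.DONT_KNOW
```

The hard part, as noted above, is a predicate whose whitelist is rich enough to approve complex programs that acceptably model humans, while still never mislabeling anything that is in fact a person.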
Yudkowsky is currently pessimistic about relying on being able to find a nonperson predicate in time, though he considers it something to keep poking at.
Possible approaches to avoiding or mitigating mindcrime include:
- Behaviorism (mindblind genie): Try to create a limited AI that does not model other minds, or possibly even itself, except using some narrow class of agent models that we are pretty sure will not be sentient. This avenue is potentially motivated for other reasons as well, such as avoiding Christiano's hack and averting programmer manipulation.
- Try to define a nonperson predicate that whitelists enough programs to carry out some pivotal achievement.
- Try for an AI that can bootstrap our understanding of consciousness and tell us what we would define as a person, while committing a relatively small amount of mindcrime, with all computed possible-people being stored rather than discarded, and the modeled agents being entirely happy, mostly happy, or non-suffering. E.g., put a happy person at the center of the approval-directed agent, and try to oversee the AI's algorithms and ask it not to use Monte Carlo simulations if possible (see the sketch after this list).
- Ignore the problem in all pre-interstellar stages, because it is still relatively small compared to astronomical stakes and therefore not worth significant losses in success probability. (This may backfire under some versions of the Simulation Hypothesis.)
- Try to finish the philosophical problem of understanding which causal processes experience sapience (or are otherwise objects of ethical value), in the next couple of decades, in sufficient detail that it can be crisply stated to an AI, with sufficiently complete coverage that it is not subject to the Nearest Neighbor problem.
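The approval-directed proposal mentioned above reduces, at its core, to choosing whichever action a model of an overseer is predicted to approve of most. The following minimal Python sketch (all names hypothetical, not Christiano's actual formalism) shows that skeleton, and why the overseer model is exactly where detailed person-modeling, and hence mindcrime risk, concentrates.

```python
from typing import Callable, Iterable, TypeVar

Action = TypeVar("Action")

def approval_directed_step(candidate_actions: Iterable[Action],
                           predicted_approval: Callable[[Action], float]) -> Action:
    """Pick the action the modeled overseer is predicted to approve of most.

    `predicted_approval` stands in for a learned model of the overseer; the
    mindcrime worry is that a faithful version of this model might itself be a
    person, which is why the approach above asks for that modeled person to be
    a happy one and for the model not to be run as many Monte Carlo copies.
    """
    return max(candidate_actions, key=predicted_approval)

# Toy usage: candidate actions scored by a stand-in approval model.
best = approval_directed_step(["fetch coffee", "rewrite own code"],
                              lambda a: {"fetch coffee": 0.9,
                                         "rewrite own code": 0.1}[a])
assert best == "fetch coffee"
```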
With respect to the latter two possibilities, note that the AI does not need to be considering possibilities in which the whole Earth as we know it is a simulation. The AI only needs to consider that, among the possible explanations of the AI's current sense data and internal data, there are scenarios in which the AI is embedded in some world other than the most 'obvious' one implied by the sense data. See also Christiano's hack (the concern that distant superintelligences can coerce the most probable environment of your AI) for a related hazard of the AI considering possibilities in which it is being simulated.
[Summary: 'Mindcrime' is Bostrom's suggested term for the moral catastrophe that occurs if a [2c machine intelligence] contains enormous numbers of conscious beings trapped inside its code, in poor living conditions. This could happen as a result of self-awareness being a natural property of computationally efficient subprocesses. Perhaps more worryingly, the best model of a person may be a person itself, even if they're not the same person. This means that AIs trying to model humans might be unusually likely to create hypotheses and simulations that are themselves conscious.]