
This is a list of brief explanations and definitions for terms that Eliezer Yudkowsky uses in the book Rationality: From AI to Zombies, an edited version of the Sequences.

The glossary is a community effort, and you're welcome to improve on the entries here, or add new ones. See the Talk page for some ideas for unwritten entries.

A

  • a priori. A sentence that is reasonable to believe even in the absence of any experiential evidence (outside of the evidence needed to understand the sentence). A priori claims are in some way introspectively self-evident, or justifiable using only abstract reasoning. For example, pure mathematics is often claimed to be a priori, while scientific knowledge is claimed to be a posteriori, or dependent on (sensory) experience. These two terms shouldn’t be confused with prior and posterior probabilities.
  • ad hominem. A verbal attack on the person making an argument, where a direct criticism of the argument is possible and would be more relevant. The term is reserved for cases where talking about the person amounts to changing the topic. If your character is the topic from the outset (e.g., during a job interview), then it isn't an ad hominem fallacy to cite evidence showing that you're a lousy worker.
  • affective death spiral. A halo effect that perpetuates and exacerbates itself over time.
  • AI-Box Experiment. A demonstration by Yudkowsky that people tend to overestimate how hard it is to manipulate people, and therefore underestimate the risk of building an Unfriendly AI that can only interact with its environment by verbally communicating with its programmers. One participant role-plays an AI, while another role-plays a human whose job it is to interact with the AI without voluntarily releasing the AI from its “box”. Yudkowsky and a few other people who have role-played the AI have succeeded in getting the human supervisor to agree to release them, which suggests that a superhuman intelligence would have an even easier time escaping.
  • algorithm. A specific procedure for computing some function. A mathematical object consisting of a finite, well-defined sequence of steps that concludes with some output determined by its initial input. Multiple physical systems can simultaneously instantiate the same algorithm.
  • alien god. One of Yudkowsky's pet names for natural selection.
  • ambiguity aversion. Preferring small certain gains over much larger uncertain gains.
  • amplitude. A quantity in a configuration space, represented by a complex number. Amplitudes are physical, not abstract or formal. The complex number’s modulus squared (i.e., its absolute value multiplied by itself) yields the Born probabilities, but the reason for this is unknown.
  • anchoring. The cognitive bias of relying excessively on initial information after receiving relevant new information.
  • anthropomorphism. The tendency to assign human qualities to non-human phenomena.
  • artificial neural network. See “neural network.”
  • ASCII. The American Standard Code for Information Interchange. A very simple system for encoding 128 ordinary English letters, numbers, and punctuation.
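
As a small illustration of the kind of mapping ASCII defines (a sketch, not part of the original glossary; Python's built-in ord and chr report Unicode code points, which coincide with ASCII for values 0 through 127):

```python
# ASCII assigns each of its 128 characters an integer from 0 to 127.
print(ord("A"))  # 65 -- the code point for uppercase A
print(ord("a"))  # 97 -- lowercase letters sit 32 places later
print(chr(48))   # '0' -- the digits start at code point 48
print(chr(33))   # '!' -- punctuation is included as well
```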

B

  • beisutsukai. Japanese for "Bayes user." A fictional order of high-level rationalists, also known as the Bayesian Conspiracy.
  • Berkeleian idealism. The belief, espoused by George Berkeley, that things only exist in various minds (including the mind of God).
  • bias. (a) A cognitive bias. In Rationality: From AI to Zombies, this will be the default meaning. (b) A statistical bias. (c) An inductive bias. (d) Colloquially: prejudice or unfairness.
  • black box. Any process whose inner workings are mysterious or poorly understood.
  • blind god. One of Yudkowsky's pet names for natural selection.
  • bucket. See “pebble and bucket.”

C

  • comparative advantage. An ability to produce something at a lower cost than some other actor could. This is not the same as having an absolute advantage over someone: you may be a better cook than someone across-the-board, but that person will still have a comparative advantage over you at cooking some dishes. This is because your cooking skills make your time more valuable; the worse cook may have a comparative advantage at baking bread, for example, since it doesn’t cost them much to spend a lot of time on baking, whereas you could be spending that time creating a large number of high-quality dishes. Baking bread is more costly for the good cook than for the bad cook because the good cook is paying a larger opportunity cost, i.e., is giving up more valuable opportunities to be doing other things.
  • conjunction. A compound sentence asserting two or more distinct things, such as "A and B" or "A even though B." The conjunction fallacy is the tendency to count some conjunctions as more probable than their components even though they can’t be more probable (and are almost always less probable).
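
The probability fact behind the conjunction fallacy, written out as a formula for reference (this is standard probability theory, not a quotation from the glossary):

```latex
P(A \wedge B) \;=\; P(A)\,P(B \mid A) \;\le\; P(A)
```

since P(B | A) can be at most 1. This is why, in the classic Tversky and Kahneman example, "Linda is a bank teller and is active in the feminist movement" cannot be more probable than "Linda is a bank teller" on its own.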

D

  • decision theory. (a) The mathematical study of correct decision-making in general, abstracted from an agent's particular beliefs, goals, or capabilities. (b) A well-defined general-purpose procedure for arriving at decisions, e.g., causal decision theory.

E

  • econblog. Economics blog.
  • edge. See “graph.”
  • entanglement. (a) Causal correlation between two things. (b) In quantum physics, the mutual dependence of two particles' states upon one another. Entanglement in sense (b) occurs when a quantum amplitude distribution cannot be factorized.
  • entropy. (a) In thermodynamics, the number of different ways a physical state may be produced (its Boltzmann entropy). E.g., a slightly shuffled deck has lower entropy than a fully shuffled one, because there are many more possible configurations for a fully shuffled deck to end up in. (b) In information theory, the expected value of the information contained in a message (its Shannon entropy). That is, a random variable’s Shannon entropy is how many bits of information one would be missing (on average) if one did not know the variable’s value. (A small worked example of Shannon entropy appears after this list.) Boltzmann entropy and Shannon entropy have turned out to be equivalent; that is, a system’s thermodynamic disorder corresponds to the number of bits needed to fully characterize it.
  • epistemic. Concerning knowledge.
  • eutopia. Yudkowsky’s term for a utopia that’s actually nice to live in, as opposed to one that’s unpleasant or unfeasible.
  • evolution. (a) In biology, change in a population’s heritable features. (b) In other fields, change of any sort.
  • expected utility. A measure of how much an agent’s goals will tend to be satisfied by some decision, given uncertainty about the decision’s outcome. Accepting a 5% chance of winning a million dollars will usually leave you poorer than accepting a 100% chance of winning one dollar; nineteen times out of twenty, the certain one-dollar gamble has higher actual utility. All the same, we say that the 5% shot at a million dollars is better (assuming dollars have utility for you) because it has higher expected utility regardless of how either gamble actually turns out: $1,000,000 multiplied by probability 0.05 is $50,000, which is greater than $1 multiplied by probability 1.
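
As a minimal sketch of the “bits of missing information” reading of Shannon entropy from the entropy entry above (the shannon_entropy helper and the example distributions here are made up for illustration; Python’s math.log2 computes the base-2 logarithm):

```python
import math

def shannon_entropy(probs):
    """Expected number of bits needed to learn the variable's value."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

# A fair coin: maximum uncertainty over two outcomes -> 1 bit.
print(shannon_entropy([0.5, 0.5]))  # 1.0

# A heavily biased coin: the outcome is nearly predictable -> ~0.47 bits.
print(shannon_entropy([0.9, 0.1]))  # about 0.47

# A certain outcome: nothing left to learn -> 0 bits.
print(shannon_entropy([1.0]))       # 0.0
```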

F

  • fitness. See “inclusive fitness.”
  • formalism. A specific way of logically or mathematically representing something.
  • function. A relation between inputs and outputs such that every input has exactly one output. A mapping between two sets in which every element in the first set is assigned a single specific element from the second.

G

  • graph. In graph theory, a mathematical object consisting of simple atomic objects (vertices, or nodes) connected by lines (edges) or arrows (arcs).

H

  • halting oracle. An abstract agent that is stipulated to be able to reliably answer questions that no algorithm can reliably answer. Though it is provably impossible for finite rule-following systems (e.g., Turing machines) to answer certain questions (e.g., the halting problem), it can still be mathematically useful to consider the logical implications of scenarios in which we could access answers to those questions.
  • happy death spiral. See “affective death spiral.”
  • hat tip. A grateful acknowledgment of someone who brought information to one's attention.
  • hedonic. Concerning pleasure.
  • heuristic. An imperfect method for achieving some goal. A useful approximation. Cognitive heuristics are innate, humanly universal brain heuristics.

I

  • idiot god. One of Yudkowsky's pet names for natural selection.
  • iff. If, and only if.
  • inclusive fitness. The degree to which a gene causes more copies of itself to exist in the next generation. Inclusive fitness is the property propagated by natural selection. Unlike individual fitness, which is a specific organism’s tendency to promote more copies of its genes, inclusive fitness is held by the genes themselves. Inclusive fitness can sometimes be increased at the expense of the individual organism’s overall fitness.
  • instrumental value. A goal that is only pursued in order to further some other goal.
  • Iterated Prisoner’s Dilemma. A series of Prisoner’s Dilemmas between the same two players. Because players can punish each other for defecting on previous rounds, they will usually have more reason to cooperate than in the one-shot Prisoner’s Dilemma.
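
A minimal sketch of how repetition changes the incentives (the payoff numbers, the ten-round length, and the two strategies below are standard textbook choices for illustration, not details from the book):

```python
# Payoffs (row player, column player) for one Prisoner's Dilemma round.
# "C" = cooperate, "D" = defect.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return their_history[-1] if their_history else "C"

def always_defect(my_history, their_history):
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

# Two tit-for-tat players sustain cooperation and both do well;
# against always-defect, tit-for-tat is only exploited on the first round.
print(play(tit_for_tat, tit_for_tat))    # (30, 30)
print(play(tit_for_tat, always_defect))  # (9, 14)
```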

L

  • Lamarckism. The 19th-century pre-Darwinian hypothesis that populations evolve via the hereditary transmission of the traits practiced and cultivated by the previous generation.

M

  • Machine Intelligence Research Institute. A small non-profit organization that works on mathematical research related to Friendly AI. Yudkowsky co-founded MIRI in 2000, and is the senior researcher there.
  • magisterium. Stephen Gould’s term for a domain where some community or field has authority. Gould claimed that science and religion were separate and non-overlapping magisteria. On his view, religion has authority to answer questions of “ultimate meaning and moral value” (but not empirical fact) and science has authority to answer questions of empirical fact (but not meaning or value).
  • map and territory. A metaphor for the relationship between beliefs (or other mental states) and the real-world things they purport to refer to.
  • materialism. The belief that all mental phenomena can in principle be reduced to physical phenomena.
  • Maxwell’s Demon. A hypothetical agent that knows the location and speed of individual molecules in a gas. James Clerk Maxwell used this demon in a thought experiment to show that such knowledge could decrease a physical system’s entropy, “in contradiction to the second law of thermodynamics.” The demon’s ability to identify faster molecules allows it to gather them together and extract useful work from them. Leó Szilárd later pointed out that if the demon itself were considered part of the thermodynamic system, then the entropy of the whole would not decrease. The decrease in entropy of the gas would require an increase in the demon’s entropy. Szilárd used this insight to simplify Maxwell’s scenario into a hypothetical engine that extracts work from a single gas particle. Using one bit of information about the particle (e.g., whether it’s in the top half of a box or the bottom half), a Szilárd engine can generate kT ln(2) joules of energy, where T is the system’s temperature and k is Boltzmann’s constant.
  • Maxwell’s equations. In classical physics, a set of differential equations that model the behavior of electromagnetic fields.
  • meme. Richard Dawkins’s term for a thought that can be spread through social networks.
  • meta level. A domain that is more abstract or derivative than some domain it depends on, the "object level." A conversation can be said to operate on a meta level, for example, when it switches from discussing a set of simple or concrete objects to discussing higher-order or indirect features of those objects.
  • metaethics. A theory about what it means for ethical statements to be correct, or the study of such theories. Whereas applied ethics speaks to questions like "Is murder wrong?" and "How can we reduce the number of murders?", metaethics speaks to questions like "What does it mean for something to be wrong?" and "How can we generally distinguish right from wrong?"
  • minimax. A decision rule for turn-based zero-sum two-player games: evaluate each of your options by assuming the opponent will respond to it as well as possible, and pick the option that minimizes the maximum advantage the opponent can gain. This rule is intended to perform well even in worst-case scenarios where one’s opponent makes excellent decisions. (A toy illustration appears at the end of this list.)
  • Minimum Message Length Principle. A formalization of Occam’s Razor that judges the probability of a hypothesis based on how long it would take to communicate the hypothesis plus the available data. Simpler hypotheses are favored, as are hypotheses that can be used to concisely encode the data.
  • MIRI. See “Machine Intelligence Research Institute.”
  • money pump. A person who is irrationally willing to accept sequences of trades that add up to an expected loss.
  • monotonic logic. A logic that will always continue to assert something as true if it ever asserted it as true. For example, if “2+2=4” is proved, then in a monotonic logic no subsequent operation can make it impossible to derive that theorem again in the future. In contrast, non-monotonic logics can “forget” past conclusions and lose the ability to derive them.
  • monotonicity. In mathematics, the property, loosely speaking, of always moving in the same direction (when one moves at all). If I have a preference ordering over outcomes, a monotonic change to my preferences may increase or decrease how much I care about various outcomes, but it won’t change the order -- if I started off liking cake more than cookies, I’ll end up liking cake more than cookies, though any number of other changes may have taken place. Alternatively, a monotonic function can flip all of my preferences. The only option ruled out is for the function to sometimes flip the ordering and sometimes preserve the ordering. A non-monotonic function, then, is one that at least once takes an x < y and outputs x > y, and at least once takes an x < y and outputs x < y.
  • Moore’s Law. The observation that technological progress has enabled engineers to double the number of transistors they can fit on an integrated circuit approximately every two years from the 1960s to the 2010s. Other exponential improvements in computing technology (some of which have also been called “Moore’s Law”) may continue to operate after the end of the original Moore’s Law. The most important of these is the doubling of available computations per dollar. The futurist Ray Kurzweil has argued that the latter exponential trend will continue for many decades, and that this trend will determine rates of AI progress.
  • motivated cognition. Reasoning and perception that is driven by some goal or emotion of the reasoner that is at odds with accuracy. Examples of this include non-evidence-based inclinations to reject a claim (motivated skepticism), to believe a claim (motivated credulity), to continue evaluating an issue (motivated continuation), or to stop evaluating an issue (motivated stopping).
  • Murphy’s law. The saying “Anything that can go wrong will go wrong.”
  • mutual information. For two variables, the amount that knowing about one variable tells you about the other's value. If two variables have zero mutual information, then they are independent; knowing the value of one does nothing to reduce uncertainty about the other.
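
A toy sketch of the minimax rule from the “minimax” entry above (the minimax function, the two-level game tree, and its payoff numbers are invented for illustration, not taken from the book):

```python
def minimax(state, is_my_turn):
    """Value of a game state to me, assuming both players play optimally.
    `state` is either a number (my payoff when the game is over) or a
    list of successor states."""
    if isinstance(state, (int, float)):  # leaf: game over
        return state
    values = [minimax(child, not is_my_turn) for child in state]
    # On my turn I pick the best value for me; on the opponent's turn
    # I assume they pick the value that is worst for me.
    return max(values) if is_my_turn else min(values)

# A tiny zero-sum game: I choose a branch, then the opponent chooses a leaf.
# Left branch: the opponent can hold me to 3.  Right branch: only to 2.
game = [[3, 12], [2, 8]]
print(minimax(game, is_my_turn=True))  # 3 -- the left branch has the best worst case
```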