**backward chaining**. Backward chaining is an inference method described colloquially as working backward from the goal.

**base rate**. In probability and statistics, the base rate (also known as the prior probability) is the class of probabilities unconditional on "featural evidence" (likelihoods).

**Bayes's Theorem**. The equation stating how to update a hypothesis *H* in light of new evidence *E*. In its simplest form, Bayes's Theorem says that a hypothesis's probability given the evidence, written *P(H|E)*, equals the likelihood of the evidence given that hypothesis, multiplied by your prior probability *P(H)* that the hypothesis was true, divided by the prior probability *P(E)* that you would see that evidence regardless. I.e.: *P(H|E) = P(E|H) P(H) / P(E)*.
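Worked numerically, the formula looks like this; the disease-test numbers below are hypothetical, chosen only to illustrate the update:

```python
# Hypothetical numbers: a condition with a 1% base rate, a test that is
# 80% sensitive and has a 9.6% false-positive rate.
p_h = 0.01               # prior P(H)
p_e_given_h = 0.80       # likelihood P(E|H)
p_e_given_not_h = 0.096  # P(E|~H)

# P(E) = P(E|H)P(H) + P(E|~H)P(~H): the prior probability of the evidence.
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Bayes's Theorem: P(H|E) = P(E|H) P(H) / P(E)
p_h_given_e = p_e_given_h * p_h / p_e
print(round(p_h_given_e, 3))  # 0.078
```

Note how the low base rate keeps the posterior under 8% even after a positive test result.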

Also known as Bayes's Rule. See "odds ratio" for a simpler way to calculate a Bayesian update.

**Bayesian**. (a) Optimally reasoned; reasoned in accordance with the laws of probability. (b) An optimal reasoner, or a reasoner that approximates optimal inference unusually well. (c) Someone who treats beliefs as probabilistic and treats probability theory as a relevant ideal for evaluating reasoners. (d) Related to probabilistic belief. (e) Related to Bayesian statistical methods.

**Bayesian updating**. Revising your beliefs in a way that's fully consistent with the information available to you. Perfect Bayesian updating is wildly intractable in realistic environments, so real-world agents have to rely on imperfect heuristics to get by. As an optimality condition, however, Bayesian updating helps make sense of the idea that some ways of changing one's mind work better than others for learning about the world.

**beisutsukai**. Japanese for "Bayes user." A fictional order of high-level rationalists, also known as the Bayesian Conspiracy.

**Bell's Theorem**. Bell's theorem is a term encompassing a number of closely related results in physics, all of which determine that quantum mechanics is incompatible with noncontextual local hidden-variable theories, given some basic assumptions about the nature of measurement.

**Berkeleian idealism**. The belief, espoused by George Berkeley, that things only exist in various minds (including the mind of God).

**bias**. (a) A cognitive bias. In *Rationality: From AI to Zombies*, this will be the default meaning. (b) A statistical bias. (c) An inductive bias. (d) Colloquially: prejudice or unfairness.

**bit**. (a) A binary digit, taking the value 0 or 1. (b) The logarithm (base 1/2) of a probability—the maximum information that can be communicated using a binary digit, averaged over the digit's states. *Rationality: From AI to Zombies* usually uses "bit" in the latter sense.

**black box**. Any process whose inner workings are mysterious or poorly understood.

**Black Swan**. The black swan theory or theory of black swan events is a metaphor that describes an event that comes as a surprise, has a major effect, and is often inappropriately rationalized after the fact with the benefit of hindsight.

**blind god**. One of Yudkowsky's pet names for natural selection.

**a priori**. Before considering the evidence. Similarly, "a posteriori" means "after considering the evidence"; compare prior and posterior probabilities.

In philosophy, "a priori" often refers to the stronger idea of something knowable in the absence of *any* experiential evidence (outside of the evidence needed to understand the claim).

**affect heuristic**. People's general tendency to reason based on things' felt goodness or badness.

**affective death spiral**. Yudkowsky's term for a halo effect that perpetuates and exacerbates itself over time.

**AGI**. *See "artificial general intelligence."*

**AI-Box Experiment**. A demonstration by Yudkowsky that people tend to overestimate how hard it is to manipulate people, and therefore underestimate the risk of building an Unfriendly AI that can only interact with its environment by verbally communicating with its programmers. One participant role-plays an AI, while another role-plays a human whose job it is to interact with the AI without voluntarily releasing the AI from its "box". Yudkowsky and a few other people who have role-played the AI have succeeded in getting the human supervisor to agree to release them, which suggests that a superhuman intelligence would have an even easier time escaping.

**akrasia**. Akrasia is a lack of self-control, or acting against one's better judgment.

**alien god**. One of Yudkowsky's pet names for natural selection.

**ambiguity aversion**. Preferring small certain gains over much larger uncertain gains.

**amplitude**. A quantity in a configuration space, represented by a complex number. Many sources misleadingly refer to quantum amplitudes as "probability amplitudes", even though they aren't probabilities. Amplitudes are physical, not abstract or formal. The complex number's modulus squared (i.e., its absolute value multiplied by itself) yields the Born probabilities, but the reason for this is unknown.

**amplitude distribution**. *See "wavefunction."*

**anchoring**. The cognitive bias of relying excessively on initial information after receiving relevant new information.

**anthropics**. Problems related to reasoning well about how many observers like you there are.

**artificial general intelligence**. Artificial intelligence that is "general-purpose" in the same sense that human reasoning is general-purpose. It's hard to crisply state what this kind of reasoning consists in—if we knew how to fully formalize it, we would already know how to build artificial general intelligence. However, we can gesture at (e.g.) humans' ability to excel in many different scientific fields, even though we did not evolve in an ancestral environment containing particle accelerators.

**Aumann's Agreement Theorem**.

**availability heuristic**. The tendency to base judgments on how easily relevant examples come to mind.

**average utilitarianism**. Average utilitarianism values the maximization of the average utility among a group's members. So a group of 100 people each with 100 hedons (or "happiness points") is judged as preferable to a group of 1,000 people with 99 hedons each.
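The hedon comparison can be checked directly; this small sketch just compares means and totals:

```python
group_a = [100] * 100    # 100 people at 100 hedons each
group_b = [99] * 1000    # 1,000 people at 99 hedons each

def average(group):
    return sum(group) / len(group)

# Average utilitarianism prefers the higher mean, ignoring group size...
print(average(group_a) > average(group_b))  # True
# ...while total ("aggregate") utilitarianism ranks the groups the other way.
print(sum(group_a) < sum(group_b))  # True
```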


**Unfriendly AI**. A hypothetical smarter-than-human artificial intelligence that causes a global catastrophe by pursuing a goal without regard for humanity’s well-being. Yudkowsky predicts that superintelligent AI will be “Unfriendly” by default, unless a special effort goes into researching how to give AI stable, known, humane goals. Unfriendliness doesn’t imply malice, anger, or other human characteristics; a completely impersonal optimization process can be “Unfriendly” even if its only goal is to make paperclips. This is because even a goal as innocent as ‘maximize the expected number of paperclips’ could motivate an AI to treat humans as competitors for physical resources, or as threats to the AI’s aspirations.

**uniform probability distribution**. A distribution in which all events have equal probability; a maximum-entropy probability distribution.

**universal Turing machine**. A Turing machine that can compute all Turing-computable functions. If something can be done by any Turing machine, then it can be done by every universal Turing machine. A system that can in principle do anything a Turing machine could is called “Turing-complete.”

**updating**. Revising one’s beliefs. See also "Bayesian updating."

**utilitarianism**. An ethical theory asserting that one should act in whichever manner causes the most benefit to people, minus how much harm results. Standard utilitarianism argues that acts can be justified even if they are morally counter-intuitive and harmful, provided that the benefit outweighs the harm.

**utility function**. A function that ranks outcomes by "utility," i.e., by how well they satisfy some set of goals or constraints. Humans are limited and imperfect reasoners, and don't consistently optimize any endorsed utility function; but the idea of optimizing a utility function helps us give formal content to "what it means to pursue a goal well," just as Bayesian updating helps formalize "what it means to learn well."

**utilon**. Yudkowsky’s name for a unit of utility, i.e., something that satisfies a goal. The term is deliberately vague, to permit discussion of desired and desirable things without relying on imperfect proxies such as monetary value and self-reported happiness.
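A minimal sketch of a utility function as a ranking over outcomes; the outcome names and utility numbers are invented for illustration:

```python
# Hypothetical utilities over outcomes; only the ranking matters here.
utility = {"save_two_lives": 200.0, "save_one_life": 100.0, "do_nothing": 0.0}

def best_outcome(options):
    # "Pursuing a goal well" = choosing the available option of highest utility.
    return max(options, key=utility.get)

print(best_outcome(["do_nothing", "save_one_life", "save_two_lives"]))
# save_two_lives
```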


Created by Rob Bensinger at

**Peano arithmetic**.

**superposition**.

**information**. (a) Colloquially, any fact or data that helps someone better understand something. (b) In information theory, how surprising, improbable, or complex something is. E.g., there is more information in seeing a fair eight-sided die come up "4" than in seeing a fair six-sided die come up "4," because the former event has probability 1/8 while the latter has probability 1/6.

We can also speak of the average information in a fair six- or eight-sided die roll in general, before seeing the actual number; this is a die roll's *expected* amount of information, which is called its "entropy." If a fair die has more sides, then rolling such a die will have more entropy because on average, the outcome of the roll has more information (i.e., lower probability).
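The die comparison can be made concrete with logarithms; this sketch measures information in bits:

```python
import math

def surprisal(p):
    # Information carried by an event of probability p, in bits:
    # the log base 1/2 of p, i.e. -log2(p).
    return -math.log2(p)

# Seeing a fair eight-sided die come up "4" carries more information
# than seeing a fair six-sided die come up "4":
print(surprisal(1 / 8))            # 3.0 bits
print(round(surprisal(1 / 6), 3))  # 2.585 bits
```

For a fair die every outcome is equally improbable, so the entropy (expected information) of a roll equals the surprisal of any single outcome.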

When knowing about one variable can help narrow down the value of another variable, the two variables are said to have *mutual* information.

**intuition pump**.

**intuitionistic logic**. An approach to logic that rejects the law of the excluded middle, "Every statement is true or false."

**Laplace's Law of Succession**.

**lookup table**.

**marginal utility**.

**marginal variable**.

**economies of scale**.

**fungible**.

**game theory**.

**hyper-real number**.



**conditional independence**.

**conditional probability**. The probability that a statement is true on the assumption that some other statement is true. E.g., the conditional probability *P(A|B)* means "the probability of A, given B."
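A quick sketch with two fair dice (the events A and B below are chosen arbitrarily for illustration):

```python
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # two fair six-sided dice
b = [o for o in outcomes if o[0] + o[1] == 8]    # B: the dice sum to 8
a_and_b = [o for o in b if o[0] == 6]            # A and B: first die is 6 too

# P(A|B) = P(A and B) / P(B); with equiprobable outcomes this is a count ratio.
p_a_given_b = len(a_and_b) / len(b)
print(p_a_given_b)  # 0.2
```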

**consequentialism**. (a) The ethical theory that the moral rightness of actions depends only on what outcomes result. Consequentialism is normally contrasted with ideas like deontology, which says that morality is about following certain rules (e.g., "don't lie") regardless of the consequences. (b) Yudkowsky's term for any reasoning process that selects actions based on their consequences.

**Cox's Theorem**.

**de novo**. Entirely new; produced from scratch.

**decibel**.



**Born rule**.

**collapse**.

**complex**. (a) Colloquially, something with many parts arranged in a relatively specific way. (b) In information theory, something that's relatively hard to formally specify and that thereby gets a larger penalty under Occam's razor; measures of this kind of complexity include Kolmogorov complexity. (c) Complex-valued, i.e., represented by the sum of a real number and an imaginary number.

**configuration space**.


**Copenhagen Interpretation**.

**decoherence**.

**directed acyclic graph**. A graph that is directed (its edges have a direction associated with them) and acyclic (there's no way to follow a sequence of edges in a given direction to loop around from a node back to itself).
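A sketch of the acyclicity condition — a depth-first search that reports whether any directed path loops back to its starting node:

```python
def is_acyclic(graph):
    # graph: dict mapping each node to the list of nodes its edges point to.
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return True
        if node in visiting:
            return False  # we walked a directed path back to `node`: a cycle
        visiting.add(node)
        acyclic_below = all(visit(child) for child in graph.get(node, []))
        visiting.discard(node)
        done.add(node)
        return acyclic_below

    return all(visit(node) for node in graph)

print(is_acyclic({"A": ["B", "C"], "B": ["C"], "C": []}))  # True
print(is_acyclic({"A": ["B"], "B": ["A"]}))                # False
```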

**dukkha**.

**Dutch book**.

**eudaimonia**.

**Eurisko**.

**FAI**. *See “Friendly AI.”*

**falsificationism**.

**Fun Theory**.

**gray goo**.

**Gricean implication**.

**Mind Projection Fallacy**.

**one-boxing**. Taking only the opaque box in Newcomb's Problem.

**probability amplitude**. *See “amplitude.”*


**causal graph**. A directed acyclic graph in which an arrow going from node A to node B is interpreted as "changes in A can directly cause changes in B."

**correspondence bias**. Drawing conclusions about someone's unique disposition from behavior that can be entirely explained by the situation in which it occurs. When we see someone else kick a vending machine, we think they are "an angry person," but when we kick the vending machine, it's because the bus was late, the train was early, and the machine ate our money.

**cryonics**. The low-temperature preservation of brains. Cryonics proponents argue that cryonics should see more routine use for people whose respiration and blood circulation have recently stopped (i.e., people who qualify as clinically deceased), on the grounds that future medical technology may be able to revive such people.

**epistemology**. (a) A world-view or approach to forming beliefs. (b) The study of knowledge.

**existential risk**. Something that threatens to permanently and drastically reduce the value of the future, such as stable global totalitarianism or human extinction.

**frequentism**. (a) The view that the Bayesian approach to probability—i.e., treating probabilities as belief states—is unduly subjective. Frequentists instead propose treating probabilities as frequencies of events. (b) Frequentist statistical methods.

**Friendly AI**. Artificial general intelligence systems that are safe and useful. "Friendly" is a deliberately informal descriptor, intended to signpost that "Friendliness" still has very little technical content and needs to be further developed. Although this remains true in many respects as of this writing (2018), Friendly AI research has become much more formally developed since Yudkowsky coined the term "Friendly AI" in 2001, and the research area is now more often called "AI alignment research."

**group selection**. Natural selection at the level of groups, as opposed to individuals. Historically, group selection used to be viewed as a more central and common part of evolution—evolution was thought to frequently favor self-sacrifice "for the good of the species."

**halo effect**. The tendency to assume that something good in one respect must be good in other respects.

**humility**. Not being arrogant or overconfident. Yudkowsky defines humility as "taking specific actions in anticipation of your own errors." He contrasts this with "modesty," which he views as a social posture for winning others' approval or esteem, rather than as a form of epistemic humility.

**intelligence explosion**. A scenario in which AI systems rapidly improve in cognitive ability because they see fast, consistent, sustained returns on investing work into such improvement. This could happen via AI systems using their intelligence to rewrite their own code, improve their hardware, or acquire more hardware, then using their improved capabilities to find further improvements.

**élan vital**. "Vital force." A term coined in 1907 by the philosopher Henri Bergson to refer to a mysterious force that was held to be responsible for life's "aliveness" and goal-oriented behavior.

**entropy**. (a) In thermodynamics, the number of different ways a physical state may be produced (its Boltzmann entropy). E.g., a slightly shuffled deck has lower entropy than a fully shuffled one, because there are many more configurations a fully shuffled deck is likely to end up in. (b) In information theory, the expected value of the information contained in a message (its Shannon entropy). That is, a random variable’s Shannon entropy is how many bits of information one would be missing (on average) if one did not know the variable’s value.

Boltzmann entropy and Shannon entropy have turned out to be equivalent; that is, a system’s thermodynamic disorder corresponds to the number of bits needed to fully characterize it.
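The Shannon half of this equivalence is easy to sketch; entropy here is the expected surprisal of a distribution, in bits:

```python
import math

def shannon_entropy(distribution):
    # H = -sum(p * log2(p)): the average number of bits one would need
    # to learn the variable's value.
    return -sum(p * math.log2(p) for p in distribution if p > 0)

print(shannon_entropy([0.5, 0.5]))            # fair coin: 1.0 bit
print(round(shannon_entropy([0.9, 0.1]), 3))  # biased coin: 0.469 bits
```

The biased coin has lower entropy because its outcome is, on average, less surprising.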

**Everett branch**. A "world" in the many-worlds interpretation of quantum mechanics.

**expected utility**. The expected value of a utility function given some action. Roughly: how much an agent’s goals will tend to be satisfied by some action, given uncertainty about the action's outcome.

A sure $1 will usually lead to more utility than a 10% chance of $1 million. Yet in all cases, the 10% shot at $1 million has more *expected* utility, assuming you assign more than ten times as much utility to winning $1 million. Expected utility is an idealized mathematical framework for making sense of the idea "good bets don't have to be sure bets."

**expected value**. The sum of all possible values of a variable, each multiplied by its probability of being the true value.
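The definition, and the $1-versus-lottery comparison from the expected-utility entry, in a few lines (assuming, purely for illustration, that utility is linear in dollars):

```python
def expected_value(gamble):
    # Sum of each possible value times its probability of being the true value.
    return sum(p * v for p, v in gamble)

sure_dollar = [(1.0, 1)]
lottery = [(0.10, 1_000_000), (0.90, 0)]

print(expected_value(sure_dollar))     # 1.0
print(round(expected_value(lottery)))  # 100000
```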

**Fermi paradox**. The puzzle of reconciling "on priors, we should expect there to be many large interstellar civilizations visible in the night sky" and "we see no clear signs of such civilizations."

Some reasons many people find it puzzling that there are no visible alien civilizations include: "the elements required for life on Earth seem commonplace"; "life had billions of years to develop elsewhere before we evolved"; "high intelligence seems relatively easy to evolve (e.g., many of the same cognitive abilities evolved independently in humans, octopuses, crows)"; and "although some goals favor hiddenness, many different possible goals favor large-scale extraction of resources, and we only require there to exist one old species of the latter type."

**foozality**. See "rationality."

**graph**. In graph theory, a mathematical structure consisting of a set of nodes (vertices) connected pairwise by edges.


**rationalist**. A person interested in rationality, especially one who is attempting to use new insights from psychology and the formal sciences to become more rational.

**rationality**. The property of employing useful cognitive procedures. Making systematically good decisions (instrumental rationality) based on systematically accurate beliefs (epistemic rationality).

**recursion**. A sequence of similar actions that each build on the result of the previous action.
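A standard sketch of recursion, where each call builds on the result of the call below it:

```python
def factorial(n):
    if n == 0:
        return 1  # base case: stops the chain of recursive calls
    return n * factorial(n - 1)  # each result builds on the previous one

print(factorial(5))  # 120
```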

**reductio ad absurdum**. Refuting a claim by showing that it entails a claim that is more obviously false.

**reduction**. An explanation of a phenomenon in terms of its origin or parts, especially one that allows you to redescribe the phenomenon without appeal to your previous conception of it.

**reductionism**. (a) The practice of scientifically reducing complex phenomena to simpler underpinnings. (b) The belief that such reductions are generally possible.

**representativeness heuristic**. A cognitive heuristic where one judges the probability of an event based on how well it matches some mental prototype or stereotype.

**Ricardo’s Law of Comparative Advantage**. *See “comparative advantage.”*

**satori**. In Zen Buddhism, a non-verbal, pre-conceptual apprehension of the ultimate nature of reality.

**Schrödinger equation**. A fairly simple partial differential equation that defines how quantum wavefunctions evolve over time. This equation is deterministic; it is not known why the Born rule, which converts the wavefunction into an experimental prediction, is probabilistic, though there have been many attempts to make headway on that question.

**scope insensitivity**. A cognitive bias where large changes in an important value have little or no effect on one's behavior.

**screening off**. Making something informationally irrelevant. A piece of evidence A screens off a piece of evidence B from a hypothesis C if, once you know about A, learning about B doesn’t affect the probability of C.
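A small numerical sketch (the chain structure B → A → C and all the probabilities are invented for illustration). Because C depends on B only through A, A screens off B from C:

```python
# Joint distribution P(A, B, C) = P(B) * P(A|B) * P(C|A).
p_b = {True: 0.3, False: 0.7}
p_a_given_b = {True: 0.9, False: 0.2}   # P(A=True | B=b)
p_c_given_a = {True: 0.8, False: 0.1}   # P(C=True | A=a)

def joint(a, b, c):
    pa = p_a_given_b[b] if a else 1 - p_a_given_b[b]
    pc = p_c_given_a[a] if c else 1 - p_c_given_a[a]
    return p_b[b] * pa * pc

def prob_c(a, b):
    # P(C=True | A=a, B=b)
    total = joint(a, b, True) + joint(a, b, False)
    return joint(a, b, True) / total

# Once A is known, learning B no longer affects the probability of C:
print(round(prob_c(True, True), 3), round(prob_c(True, False), 3))  # 0.8 0.8
```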

**search tree**. A graph with a root node that branches into child nodes, which can then either terminate or branch once more. The tree data structure is used to locate values; in chess, for example, each node can represent a move, which branches into the other player’s possible responses, and searching the tree is intended to locate winning sequences of moves.

**self-anchoring**. Anchoring to oneself. Treating one’s own qualities as the default, and only weakly updating toward viewing others as different when given evidence of differences.

**separate magisteria**. *See “magisterium.”*

**sequences**. Yudkowsky’s name for short series of thematically linked blog posts or essays.

**set theory**. The study of relationships between abstract collections of objects, with a focus on collections of other collections. A branch of mathematical logic frequently used as a foundation for other mathematical fields.

**Shannon entropy**. *See “entropy.”*

**Shannon mutual information**. *See “mutual information.”*

**Simulation Hypothesis**. The hypothesis that the world as we know it is a computer program designed by some powerful intelligence. An idea popularized in the movie *The Matrix*, and discussed more seriously by the philosopher Nick Bostrom.

**Singularity**. One of


**nanotechnology**. Technologies based on the fine-grained control of matter on a scale of molecules, or smaller. If known physical law (or the machinery inside biological cells) is any guide, it should be possible in the future to design nanotechnological devices that are much faster and more powerful than any extant machine.

**Nash equilibrium**. A situation in which no individual would benefit by changing their own strategy, assuming the other players retain their strategies. Agents often converge on Nash equilibria in the real world, even when they would be much better off if *multiple* agents simultaneously switched strategies. For example, mutual defection is the only Nash equilibrium in the standard one-shot Prisoner’s Dilemma (i.e., it is the only option such that neither player could benefit by changing strategies while the other player’s strategy is held constant), even though it is not Pareto-optimal (i.e., each player would be better off if the *group* behaved differently).
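The Prisoner’s Dilemma claim can be checked mechanically; the payoff numbers below are standard illustrative values (higher is better):

```python
# payoffs[(my_move, their_move)] = my payoff; "C" = cooperate, "D" = defect.
payoffs = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def is_nash(move_1, move_2):
    # Nash equilibrium: neither player gains by unilaterally switching.
    best_1 = all(payoffs[(move_1, move_2)] >= payoffs[(alt, move_2)] for alt in "CD")
    best_2 = all(payoffs[(move_2, move_1)] >= payoffs[(alt, move_1)] for alt in "CD")
    return best_1 and best_2

print(is_nash("D", "D"))  # True: the only Nash equilibrium
print(is_nash("C", "C"))  # False, though both players prefer (C, C) to (D, D)
```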

**natural selection**. The process by which heritable biological traits change in frequency due to their effect on how much their bearers reproduce.

**negentropy**. Negative entropy. A useful concept because it allows one to think of thermodynamic regularity as a limited resource one can possess and make use of, rather than as a mere absence of entropy.

**Neutral Point of View**. A policy used by the online encyclopedia Wikipedia to instruct users on how they should edit the site’s contents. Following this policy means reporting on the different positions in controversies, while refraining from weighing in on which position is correct.

**Newcomb’s Problem**. A central problem in decision theory. Imagine an agent that understands psychology well enough to predict your decisions in advance, and decides to either fill two boxes with money, or fill one box, based on their prediction. They put $1,000 in a transparent box no matter what, and they then put $1 million in an opaque box if (and only if) they predicted that you’d *only* take the opaque box. The predictor tells you about this, and then leaves. Which do you pick? If you take both boxes, you get only the $1,000, because the predictor foresaw your choice and didn’t fill the opaque box. On the other hand, if you only take the opaque box, you come away with $1M. So it seems like you should take only the opaque box. However, many people object to this strategy on the grounds that you can’t causally control what the predictor did in the past; the predictor has already made their decision at the time when you make your choice.
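One way to see the pull of one-boxing is to compute expected payoffs under an imperfect predictor; the 99% accuracy figure below is an arbitrary illustration (in the original problem the predictor is essentially always right):

```python
def ev_one_box(accuracy):
    # The opaque box holds $1M iff one-boxing was predicted.
    return accuracy * 1_000_000

def ev_two_box(accuracy):
    # $1,000 for sure, plus $1M only if the predictor wrongly expected one-boxing.
    return 1_000 + (1 - accuracy) * 1_000_000

print(round(ev_one_box(0.99)))  # 990000
print(round(ev_two_box(0.99)))  # 11000
```

By this calculation one-boxing wins whenever the predictor is right more than about 50.05% of the time, though two-boxers deny that the comparison settles the causal question.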

This is a list of brief explanations and definitions for terms that Eliezer Yudkowsky uses in the book *Rationality: From AI to Zombies*, an edited version of the Sequences.

The glossary is a community effort, and you're welcome to improve on the entries here, or add new ones. See the Talk page for some ideas for unwritten entries.


**ad hominem**. A verbal attack on the person making an argument, where a direct criticism of the argument is possible and would be more relevant. The term is reserved for cases where talking about the person amounts to changing the topic. If your character is the topic from the outset (e.g., during a job interview), then it isn't an ad hominem fallacy to cite evidence showing that you're a lousy worker.


**algorithm**. A specific procedure for computing some function. A mathematical object consisting of a finite, well-defined sequence of steps that concludes with some output determined by its initial input. Multiple physical systems can simultaneously instantiate the same algorithm.


**anthropomorphism**. The tendency to assign human qualities to non-human phenomena.

**artificial neural network**. *See “neural network.”*

**ASCII**. The American Standard Code for Information Exchange. A very simple system for encoding 128 ordinary English letters, numbers, and punctuation.

Anki deck.