Updateless Decision Theory (UDT) is a decision theory meant to deal with a fundamental problem in the existing decision theories: dynamic inconsistency, IE, having conflicting desires over time. In behavioral economics, humans are often modeled as hyperbolic discounters, meaning that rewards further away in time are seen as proportionately less important (so getting $100 one week from now is as good as getting $200 two weeks from now). This is dynamically inconsistent because the relative value of rewards changes as they get closer or further away in time. (Getting $100 one year from now sounds much less desirable than getting $200 one year plus one week from now.) This model explains some human behaviors, such as snoozing alarms repeatedly. (Getting up early to get a good start on the day seems appealing the previous evening, but when the alarm rings, the relative reward of sleeping in another few minutes is larger.)[1]
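To see how those numbers can fall out of a discounting rule, here is a minimal worked example. Assume the simplest proportional form, in which a reward of size A at delay t is valued at V(A, t) = A/t (this is a limiting case of the standard hyperbolic discount A/(1+kt), chosen here only to make the arithmetic transparent):

$$V(\$100, 1\,\text{wk}) = \frac{100}{1} = 100 = \frac{200}{2} = V(\$200, 2\,\text{wk}),$$

$$V(\$100, 52\,\text{wk}) = \frac{100}{52} \approx 1.9 \;<\; \frac{200}{53} \approx 3.8 = V(\$200, 53\,\text{wk}).$$

At a one-week horizon the two rewards look equally good, but with both a year away the larger reward clearly wins; as the year elapses, the preference drifts back toward indifference. The ranking depends on when the agent is asked, which is exactly the dynamic inconsistency.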
The dynamic inconsistency inherent in hyperbolic discounting can be fixed by exponential discounting, amongst other possibilities. However, dynamic inconsistencies can still occur for other reasons. The two most common decision theories today, Causal Decision Theory (CDT) and Evidential Decision Theory (EDT), are both dynamically inconsistent about Counterfactual Mugging: they refuse Omega when faced with the problem, but if asked beforehand, would see the value of agreeing.[2][3]
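To make the value of agreeing concrete: in the usual statement of Counterfactual Mugging, Omega flips a fair coin, asks the agent for $100 if it lands heads, and if it lands tails pays $10,000, but only if the agent would have paid on heads. Before the flip (or before ever hearing of Omega), committing to pay is worth

$$\tfrac{1}{2} \cdot (-\$100) + \tfrac{1}{2} \cdot \$10{,}000 = \$4{,}950,$$

versus $0 for refusing. An agent that has already seen the coin land heads, and updates on that observation, sees only the $100 cost, which is why CDT and EDT refuse in the moment.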
However, UDT isn't only about rejection of the subgame-perfect condition (the game-theoretic requirement that a choice be optimal for the agent at each decision point, considered in isolation). UDT also rejects CDT's way of thinking about the consequences of actions. In Judea Pearl's definition of causality,[4] CDT ignores any causal links inbound to the decider, treating this agent as an uncaused cause. The agent is thus unconcerned about what evidence its decision may provide about its own mental makeup, evidence which may suggest that it will make suboptimal decisions in other cases. UDT rejects this idea, instead thinking about consequences in the way EDT does.
Let O be a random variable representing observations, and o be some particular value (some specific observations). P() is the prior probability distribution. U is a random variable representing the utility. E is the expectation operator. There is a set of possible actions, A. EDT recommends the following action:[5]
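$$\underset{a \in A}{\operatorname{argmax}}\; E[U \mid O = o, A = a]$$

The other two theories can be sketched in the same notation (these renderings are schematic, meant only to show where the theories differ). CDT replaces the evidential conditioning on the action with Pearl's do-operator, which severs the causal links inbound to the decision:

$$\underset{a \in A}{\operatorname{argmax}}\; E[U \mid O = o, \operatorname{do}(A = a)]$$

UDT, as described below, does not condition on o at all; it evaluates whole policies under the prior.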
We can more rigorously define dynamic inconsistency as follows:
If the agent is given the opportunity to commit to a decision early, there are cases where it strictly prefers a different choice than the one it would make in-the-moment.
In Counterfactual Mugging, we understand Omega as "making a copy" of the agent at some point in time (EG, taking a detailed scan for use in a simulation). If a CDT agent is given the opportunity to commit to a decision in Counterfactual Mugging before this point in time, then it will think of the simulation as being downstream of its decision, so it will make the same decision as UDT. If a CDT agent is asked to decide after this point in time, however, it treats the simulation as causally disconnected from its choice, so it refuses to pay Omega.
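The before/after asymmetry can be seen in a few lines of code. Below is a minimal sketch (the branch structure and the $100/$10,000 stakes are the standard ones; the function and variable names are illustrative choices, not from any canonical implementation):

```python
# Minimal sketch of Counterfactual Mugging.

PRIOR = {"heads": 0.5, "tails": 0.5}  # Omega's fair coin

def payoff(outcome: str, policy: str) -> int:
    """Agent's payoff in each branch under a fixed policy.

    On heads, Omega asks for $100 (lost only under the 'pay' policy).
    On tails, Omega gives $10,000 iff the agent WOULD pay on heads.
    """
    if outcome == "heads":
        return -100 if policy == "pay" else 0
    return 10_000 if policy == "pay" else 0  # tails branch

# Updateless (pre-copy) evaluation: score whole policies by the prior.
for policy in ("pay", "refuse"):
    ev = sum(p * payoff(outcome, policy) for outcome, p in PRIOR.items())
    print(f"prior EV of {policy!r}: {ev:+,.0f}")
# prior EV of 'pay': +4,950    prior EV of 'refuse': +0

# In-the-moment (post-copy) evaluation: having observed heads, the
# agent conditions on that branch alone and sees only the $100 cost.
print("EV of 'pay' given heads:", payoff("heads", "pay"))        # -100
print("EV of 'refuse' given heads:", payoff("heads", "refuse"))  # 0
```

Evaluated by the prior, the paying policy dominates; evaluated after updating on the heads observation, refusing dominates. The commitment-based definition above says this gap is what makes the in-the-moment theory dynamically inconsistent.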
Getting this issue right is critical in building a safe artificial general intelligence, as such an AI must analyze its own behavior and that of a next generation that it may build. Dynamically inconsistent AI systems have an incentive to engage in self-modification, but such self-modification is inherently risky.
UDT specifies that the optimal agent is the one with the best policy, IE, the best mapping from observations to actions, as estimated by its prior beliefs. ("Best" here, as in other decision theories, means one that maximizes expected utility.)
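In the notation introduced earlier, this can be sketched as follows (a schematic rendering rather than a canonical formalization). Writing $\Pi$ for the set of candidate policies, UDT selects

$$\pi^{*} = \underset{\pi \in \Pi}{\operatorname{argmax}}\; E[\,U \mid \text{the agent implements } \pi\,],$$

where the expectation is taken under the prior P, with no conditioning on the observations actually made; upon observing o, the agent simply performs $\pi^{*}(o)$.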
This definition may seem trivial, but in contrast, CDT and EDT both choose the "best" action in the current moment, IE, according to the posterior beliefs.
For example, standard game theory (which uses CDT) says that following through on costly threats is irrational. Suppose Alice says that she will hunt down Bob and beat him up if Bob steals from her. Bob proceeds to steal a small amount from Alice. CDT says that Alice should let it go, rather than pay the cost of following through on her threat.
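A toy version with made-up numbers (ours, purely for illustration) shows why the theories diverge. Suppose following through costs Alice 50 units of utility, the theft costs her 20, and Bob steals only if he expects no retaliation. After the theft, CDT compares retaliating ($-20 - 50 = -70$) with letting it go ($-20$) and lets it go. UDT compares whole policies by their prior expected value: under the policy "always retaliate," Bob, anticipating this, never steals, while under "always let it go," he does:

$$E[U \mid \text{retaliate policy}] = 0 \;>\; E[U \mid \text{let-it-go policy}] = -20.$$

So the updateless agent can rationally be the kind of agent whose threats are credible.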
A robust theory of logical uncertainty is essential to a full formalization of UDT. A UDT agent must calculate probabilities and expected values on the outcome of its possible actions in all possible worlds, IE, sequences of observations and its own actions. However, it does not know its own actions in all possible worlds. (The whole point is to derive its actions.) On the other hand, it does have some knowledge about its actions, just as you know that you are unlikely to walk straight into a wall the next chance you get. So, the UDT agent models itself as an algorithm, and its probability distribution about what it itself will do is an important input into its maximization calculation.
UDT is very similar to Functional Decision Theory (FDT), but there are differences. FDT doesn't include the UDT 1.1 fix (described below), and Nate Soares states: "Wei Dai doesn't endorse FDT's focus on causal-graph-style counterpossible reasoning; IIRC he's holding out for an approach to counterpossible reasoning that falls out of evidential-style conditioning on a logically uncertain distribution". Rob Bensinger says that he's heard UDT described as "FDT + a theory of anthropics".
One valuable insight from EDT is reflected in "UDT 1.1" (see the article by McAllister in references), a variant of UDT in which the agent takes into account that some of its algorithm (mapping from observations to actions) may be prespecified and not entirely in its control, so that it has to gather evidence and draw conclusions about part of its own mental makeup. The difference between UDT 1.0 and 1.1 is that UDT 1.1 iterates over policies, whereas UDT 1.0 iterates over actions.
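Schematically (our rendering, with the same caveats as the earlier formulas), the two versions optimize at different levels. UDT 1.0 fixes the current observation o and optimizes the single action; UDT 1.1 optimizes the whole input-output mapping and only then applies it:

$$\text{UDT 1.0:}\quad \underset{a \in A}{\operatorname{argmax}}\; E[\,U \mid \text{the agent's algorithm outputs } a \text{ on input } o\,]$$

$$\text{UDT 1.1:}\quad \pi^{*} = \underset{\pi \in \Pi}{\operatorname{argmax}}\; E[\,U \mid \text{the agent's algorithm is } \pi\,], \quad \text{then act } \pi^{*}(o)$$

The policy-level optimization matters in problems where different instances of the same agent receive different observations and their choices must fit together.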
Both UDT and Timeless Decision Theory (TDT) make decisions on the basis of what you would have pre-committed to. The difference is that UDT asks what you would have pre-committed to without the benefit of any observations you have made about the universe, while TDT asks what you would have pre-committed to given all information you've observed so far. This means that UDT pays in Counterfactual Mugging, while TDT does not.