Invulnerable Incomplete Preferences: A Formal Statement

AI Alignment Forum


Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort. My thanks to Eric Chen, Elliott Thornley, and John Wentworth for invaluable discussion and comments on earlier drafts. All errors are mine.

This article presents a few theorems about the invulnerability of agents with incomplete preferences. Elliott Thornley’s (2023) proposed approach to the AI shutdown problem relies on agents having preferential gaps (incomplete preferences), but John Wentworth and David Lorell have argued that such gaps make agents play strictly dominated strategies.[1] I claim this is false.

Summary

Suppose there exists a formal description of an agent that willingly shuts down when a certain button is pressed. Elliott Thornley’s (2023) Incomplete Preference Proposal aims to offer such a description. It’s plausible that, for it to constitute a promising approach to solving the AI shutdown problem, this description also needs to (i) permit the agent to be broadly capable and (ii) assure us that the agent will remain willing to shut down as time passes. This article formally derives a set of sufficient conditions for an agent with incomplete preferences to satisfy properties relevant to (i) and (ii).

A seemingly relevant condition for an agent to be capably goal-directed is that it avoids sequences of actions that foreseeably leave it worse off.[2] I will say that an agent satisfying this condition is invulnerable. This is related to two intuitive conditions. The weaker one is unexploitability: that the agent cannot be forcibly money pumped (i.e., compelled by its preferences to sure loss). The stronger condition is opportunism: that the agent never accepts sure losses or foregoes sure gains.[3]

To achieve this, I propose a dynamic choice rule for agents with incomplete preferences. This rule, Dynamic Strong Maximality (DSM), requires that the agent consider the available plans that are acceptable at the time of choosing and, among these, pick any plan that wasn’t previously strictly dominated by any other such plan. I prove in section 2 that DSM-based backward induction is sufficient for invulnerability, even under uncertainty.

Having shown that incompleteness does not imply that agents will pursue dominated strategies, I consider the issue of whether DSM leads agents to act as if their preferences were complete. Section 3 begins with a conceptual argument suggesting that DSM-based choice under uncertainty will not, even behaviourally, effectively alter the agent’s preferences over time. This argument does not apply when the agent is unaware of the structure of its decision tree, so I provide some formal results for these cases which bound the extent to which preferences can de facto be completed.

These results show that there will always be sets of options with respect to which the agent never completes its preferences. This holds no matter how many choices it faces. In particular, if no new options appear in the decision tree, then no amount of completion will occur; and if new options do appear, the amount of completion is permanently bounded above by the number of mutually incomparable options. These results apply naturally to cases in which agents are unaware of the state space, but readers sceptical of the earlier conceptual argument can re-purpose them to make analogous claims in standard cases of certainty and uncertainty. Therefore, imposing DSM as a choice rule can get us invulnerability without sacrificing incompleteness, even in the limit.

1.  Incompleteness and Choice

The aim of this brief section is to show that the results that follow in section 2 do not require transitivity. Some question the requirement of transitivity when preferences are incomplete (cf. Bradley 2015, p. 3); if you don’t share that doubt, a quick skim of this section will provide enough context for the rest.

1.1.  Suzumura Consistency

Requiring that preferences be transitive may require that they be complete. To see this, notice that the weak preference relation ≿ on a set of prospects X is transitive just in case x ≿ y and y ≿ z implies x ≿ z. Suppose an agent does weakly prefer x to y and y to z but has no preference between x and z. Then transitivity is violated. Suzumura (1976) proposes a weakening of transitivity for agents with incomplete preferences, which allows for such preferences while preserving some desirable properties. We will say that a weak preference relation is strongly acyclic just in case

(Strong Acyclicity)   x₁ ≿ x₂, x₂ ≿ x₃, …, xₙ₋₁ ≿ xₙ implies ¬(xₙ ≻ x₁).

We'll say that an agent whose preferences satisfy this property is Suzumura consistent. Bossert and Suzumura (2010) show that such an agent has some noteworthy features:

1. Strong acyclicity rules out weak-preference cycles containing at least one strict preference. This will make Suzumura consistent agents unexploitable: immune to forcing money pumps.
2. Strong acyclicity is necessary and sufficient for the existence of a complete and transitive extension of the agent’s preference relation.
3. Any preference relation that is both strongly acyclic and complete is transitive.

Preferences that are incomplete may also be intransitive. Whether or not transitivity is a rationality condition, strong acyclicity is weaker but preserves some desirable properties. Below I will mostly just assume strong acyclicity. But since transitivity implies strong acyclicity, all the (sufficiency) results likewise apply to agents with transitive preferences.
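Strong acyclicity is easy to check mechanically. Here is a minimal sketch in Python (the set-based encoding and function names are mine, not the article's notation): take the transitive closure of the weak preference relation and verify that no weak-preference chain loops back through a strict preference.

```python
def is_suzumura_consistent(options, weak_pref):
    """Check strong acyclicity: a chain x1 >= x2 >= ... >= xn must never
    loop back with a strict preference xn > x1.
    `weak_pref(a, b)` returns True iff a is weakly preferred to b."""
    # Transitive closure of the weak preference relation.
    reach = {(a, b) for a in options for b in options if weak_pref(a, b)}
    changed = True
    while changed:
        changed = False
        for a, b in list(reach):
            for c in options:
                if (b, c) in reach and (a, c) not in reach:
                    reach.add((a, c))
                    changed = True
    # A weak chain from a up to b, plus b strictly preferred to a, is a cycle.
    return not any(weak_pref(b, a) and not weak_pref(a, b) for a, b in reach)

# Incomplete but Suzumura consistent: x >= y >= z, with x and z incomparable.
wp = lambda a, b: a == b or (a, b) in {('x', 'y'), ('y', 'z')}
assert is_suzumura_consistent(['x', 'y', 'z'], wp)
```

Adding z ≻ x to the example above would create a cycle with a strict step, and the check would fail, matching point 1 of the list above.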

1.2.  Strong Maximality

Bradley (2015) proposes a choice rule for Suzumura consistent agents which differs from the standard condition—Maximality—for agents with incomplete preferences.[4] This rule—Strong Maximality—effectively asks the agent to eliminate dominated alternatives in the following way: eliminate any options that you strictly disprefer to any others, then if you are indifferent between any remaining options and any eliminated ones, eliminate those as well. To state the rule formally, we define a sequence (Xᵢ) satisfying

X₁ = {x ∈ X : ¬∃y ∈ X (y ≻ x)}

and Xᵢ₊₁ = Xᵢ ∖ {x ∈ Xᵢ : x ∼ y for some y ∈ X ∖ Xᵢ} whenever this set is nonempty (and Xᵢ₊₁ = Xᵢ otherwise),

for any nonempty set of prospects X. We can then state the rule as

(Strong Maximality)   C(X) = ⋂ᵢ Xᵢ.

Let’s see intuitively what this rule captures. Suppose B ∼ C and A ≻ C but A and B are incomparable.[5] The traditional maximality rule deems both A and B choiceworthy. But strong maximality simply outputs {A}. This is intuitive: don’t pick an option that’s just as bad as an option you dislike. And, more importantly, Theorem 2 of Bradley (2015) shows that Suzumura consistency is both necessary and sufficient for decisive, strongly maximal choice.[6]
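The elimination procedure can be sketched directly. This is a toy implementation of the informal description above (option names A, B, C are illustrative, not the article's notation):

```python
def strongly_maximal(options, weak_pref):
    """Strong maximality, per the informal description: drop options
    strictly dispreferred to some other option, then iteratively drop any
    survivor indifferent to an already-eliminated option."""
    strict = lambda a, b: weak_pref(a, b) and not weak_pref(b, a)
    indiff = lambda a, b: weak_pref(a, b) and weak_pref(b, a)
    survivors = {x for x in options if not any(strict(y, x) for y in options)}
    eliminated = set(options) - survivors
    changed = True
    while changed:
        changed = False
        for x in list(survivors):
            if any(indiff(x, y) for y in eliminated):
                survivors.discard(x)
                eliminated.add(x)
                changed = True
    return survivors

# B ~ C, A > C, and A incomparable to B: plain maximality would allow
# both A and B, but strong maximality eliminates B along with C.
wp = lambda a, b: a == b or (a, b) in {('B', 'C'), ('C', 'B'), ('A', 'C')}
assert strongly_maximal(['A', 'B', 'C'], wp) == {'A'}
```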

2.  Uncertain Dynamic Choice

In this section I prove some theorems about the performance of agents with incomplete preferences in dynamic settings. I will amend strong maximality slightly to adapt it to dynamic choice, and show that this is sufficient to guarantee the invulnerability of these agents, broadly construed. I will say that an agent is invulnerable iff it is both unexploitable and opportunistic.

An agent is unexploitable just in case it is immune to all forcing money pumps. These are sequences of decisions through which an agent is compelled by its own preferences to sure loss. An agent is opportunistic just in case it is immune to all non-forcing money pumps. These are situations in which sure loss (or missed gain) is merely permissible, according to the agent’s preferences.

A few aspects of the broad approach are worth flagging. Normative decision theory has not settled on an ideal set of norms to govern dynamic choice. I will therefore provide results with respect to each of the following three dynamic choice principles: naivety, sophistication, and resoluteness. (More details below.) Agents will in general behave differently depending on which principle they follow. So, evaluating the behaviour resulting from an agent’s preferences should, at least initially, be done separately for each dynamic choice principle.

2.1.  Framework

The notation going forward comes from an edited version of Hammond’s (1988) canonical construction of Bayesian decision trees. I will only describe the framework briefly; interested readers can consult Rothfus (2020a), section 1.6, for further discussion.

Definition 1  A decision tree is an eight-tuple  where

1.  is a finite set of nodes partitioned into , and .
2.  is the set of choice nodes. Here agents can pick the node’s immediate successor.
3.  is the set of natural nodes. Agents have credences over their possible realisations.
4.  is the set of terminal nodes. These determine the outcome of a trajectory.
5.  is the immediate successor function.
6.  is the initial node.
7.  assigns the set of states that remain possible once a node is reached.
8.  is the consequence of reaching terminal node .
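For concreteness, Definition 1 might be encoded as follows. This is a minimal sketch: the field names are mine, and the state-assignment component (item 7) is omitted for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class DecisionTree:
    """A pared-down encoding of Definition 1: nodes partitioned into
    choice, natural and terminal sets, an immediate-successor map, an
    initial node, and a consequence attached to each terminal node."""
    choice_nodes: set
    natural_nodes: set
    terminal_nodes: set
    successors: dict   # node -> tuple of immediate successors
    root: object
    consequence: dict = field(default_factory=dict)  # terminal node -> outcome

    @property
    def nodes(self):
        return self.choice_nodes | self.natural_nodes | self.terminal_nodes

# A two-stage tree of the kind used in the money-pump discussion later:
# nodes 0 and 1 are choice nodes; 'tA', 'tB', 'tAm' are terminal.
tree = DecisionTree(
    choice_nodes={0, 1}, natural_nodes=set(),
    terminal_nodes={'tA', 'tB', 'tAm'},
    successors={0: ('tA', 1), 1: ('tB', 'tAm')},
    root=0,
    consequence={'tA': 'A', 'tB': 'B', 'tAm': 'A-'},
)
assert tree.nodes == {0, 1, 'tA', 'tB', 'tAm'}
```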

Definition 2  A set of plans available at node  of tree , denoted , contains the propositions and continuations consistent with the agent being at that node. Formally,

1.  if .
2.  if .
3.  if .[7]

I will begin the discussion below with dynamic choice under certainty, but to set up the more general results, I will now lay out the framework for uncertain choice as well and specify later what is assumed. To start, it will be useful to have a representation theorem for our incomplete agents.

Imprecise Bayesianism (Bradley 2017, Theorem 37)   Let  be a non-trivial preference relation on a complete and atomless Boolean algebra of prospects , which has a minimal coherent extension on  that is both continuous and impartial.[8] Then there exists a unique rationalising state of mind, , on  containing all pairs of probability and desirability measures jointly consistent with these preferences, in the sense that for all ,

.

This theorem vindicates the claim that the state of mind of a broad class of rational agents with incomplete preferences can be represented by a set of pairs of probability and desirability functions. Although this theorem applies to imprecise credences, I’ll work with the special case of precise beliefs throughout. This will simplify the analysis. I’ll therefore use the following notation going forward: .
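With precise beliefs, the state of mind reduces to a set of desirability functions, and weak preference is unanimity across that set. A minimal sketch (function names and the dictionary encoding are mine):

```python
def weakly_preferred(x, y, desirabilities):
    """X is weakly preferred to Y iff every desirability function in the
    agent's state of mind ranks X at least as high as Y."""
    return all(v[x] >= v[y] for v in desirabilities)

def incomparable(x, y, desirabilities):
    """A preferential gap: the functions disagree about X versus Y."""
    return (not weakly_preferred(x, y, desirabilities)
            and not weakly_preferred(y, x, desirabilities))

# Two coherent extensions that disagree about A versus B but agree that
# each beats C: the agent strictly prefers A and B to C, with A and B
# incomparable.
mind = [{'A': 2, 'B': 1, 'C': 0}, {'A': 1, 'B': 2, 'C': 0}]
assert incomparable('A', 'B', mind)
assert weakly_preferred('A', 'C', mind) and weakly_preferred('B', 'C', mind)
```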

Next, I define some conditions that will be invoked in some of the derivations.

Definition 3  Material Planning: For a plan to specify choices across various contingencies, I formalise a planning conditional as follows. Let  be a natural node and  a possible realisation of it. Here, a plan

assigns a chosen continuation, , to each possible realisation of . When preferences are complete, a plan at a natural node is then evaluated as follows:

.

This is a natural extension of Jeffrey’s equation. And when preferences are incomplete:

iff .

This makes use of Bradley’s representation theorem for Imprecise Bayesianism.

Definition 4  Preference Stability (Incomplete): For all nodes  and  and plans  where , we have .

Definition 5  Plan-Independent Probabilities (PIP): For any decision tree  in which  is a natural node,  is a realisation of , and .

The results can now be stated. I will include proofs of the central claims in the main text; others are relegated to the appendix. I suggest scrolling past them if you aren’t particularly surprised by the result.

2.2.  Myopia

Let’s begin with the simplest case of exploitation. We can stay silent on which dynamic choice principle to employ here: even if our agent is myopic, it will never be vulnerable to forcing money pumps under certainty. This follows immediately from Suzumura consistency since this guarantees that the agent never has a strict preference in a cycle of weak preferences.

Proposition 1   Suzumura consistent agents myopically applying strong maximality are unexploitable under certainty. [Proof.]

Two points are worth noting about the simple proof. First, it relies on strict preferences. This is because, if a money pump must go through a strongly maximal choice set with multiple elements (due to indifference or appropriate incomparability), it is necessarily non-forcing. That's the topic of the next section. Second, the proof doesn't rely on foresight. Myopia is an intentionally weak assumption that lets us show that no knowledge of the future is required for Suzumura consistent agents to avoid exploitation via these forcing money pumps.

2.3.  Naive Choice

Although Suzumura consistent agents using strong maximality can't be forced into money pumps, concern remains. Such an agent might still incidentally do worse by their own lights. That is, it remains vulnerable to non-forcing money pumps and thereby fails to be opportunistic. The single-souring money pump below is a purported example of this.

The agent’s preferences satisfy A ≻ A⁻, while B is incomparable to both A and A⁻. Suppose that it’s myopic and uses strong maximality at each node. The agent begins with A at node 0 (if we let ‘down’ be the default). It is permitted, though not compelled, to go ‘up’ at node 0 instead (since B will become available), but also to go ‘up’ upon arrival at node 1 (since B and A⁻ are incomparable). Suppose it in fact goes ‘up’ at both node 0 and at node 1. This would leave it with A⁻, which is strictly worse than what it began with. This money pump is ‘non-forcing’ because the agent’s preferences are also consistent with avoiding it.
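To make the dynamics concrete, here is the pump as a small script. The option names A, A⁻, B and the two-extension encoding of the incomparabilities are illustrative assumptions on my part:

```python
# Two coherent extensions: both agree A > A-, but they disagree about B,
# so B is incomparable to both A and A-.
exts = [{'A': 2, 'A-': 1, 'B': 3},
        {'A': 2, 'A-': 1, 'B': 0}]

def weakly_pref(x, y):
    return all(v[x] >= v[y] for v in exts)

def permissible_swap(have, offered):
    """A myopic strong maximiser may take the offer unless it strictly
    disprefers the offered option to what it holds."""
    return not (weakly_pref(have, offered) and not weakly_pref(offered, have))

# Node 0: holding A, offered B.  Node 1: holding B, offered A-.
assert permissible_swap('A', 'B')    # 'up' at node 0 is permitted
assert permissible_swap('B', 'A-')   # 'up' at node 1 is permitted too
# Taking both swaps leaves the agent with A-, strictly worse than A:
assert weakly_pref('A', 'A-') and not weakly_pref('A-', 'A')
# But nothing forced the swaps: keeping A at node 0 (or B at node 1) was
# equally permissible, which is what makes the pump non-forcing.
```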

The agent need not be myopic, however. It can plan ahead.[9] To achieve opportunism for Suzumura consistent agents, I propose a choice rule which I’ll dub Dynamic Strong Maximality (DSM). DSM states that a plan  is permissible at node  just in case (a) it is strongly maximal at  and (b) no other such plan was previously more choiceworthy than .

(Dynamic Strong Maximality)    iff

(a)   and

(b)   .

DSM is a slight refinement of strong maximality. Condition (b) simply offers a partial tie-breaking rule whenever the agent faces multiple choiceworthy prospects. So, importantly, it never asks the agent to pick counter-preferentially. An agent following naive choice with DSM will, at each node, look ahead in the decision tree, select its favourite trajectory using DSM, and embark on it. It can continually re-evaluate its plans using naive-DSM as time progresses and, as the following result establishes, the agent will thereby never end up with something worse than what it began with.
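A toy version of naive DSM on the single-souring tree (illustrative names; plans are identified with the outcomes they reach, and clause (b) is implemented as a filter against the ex-ante permissible set):

```python
def strongly_maximal_plans(plans, weakly_pref):
    """Simplified strong maximality over plan outcomes: keep plans not
    strictly dominated by any available plan (indifference handling is
    omitted, since this toy tree has none)."""
    strict = lambda a, b: weakly_pref(a, b) and not weakly_pref(b, a)
    return {p for p in plans if not any(strict(q, p) for q in plans)}

exts = [{'A': 2, 'A-': 1, 'B': 3},
        {'A': 2, 'A-': 1, 'B': 0}]
wp = lambda x, y: all(v[x] >= v[y] for v in exts)

# Plans at node 0 lead to A, B or A-.  The A- plan is strictly dominated
# by the A plan, so DSM clause (a) already rules it out ex ante.
permitted0 = strongly_maximal_plans({'A', 'B', 'A-'}, wp)
assert permitted0 == {'A', 'B'}

# At node 1 only B and A- remain; clause (b) excludes A-, which was
# strictly dominated by an available plan back at node 0.
permitted1 = {p for p in strongly_maximal_plans({'B', 'A-'}, wp)
              if p in permitted0}
assert permitted1 == {'B'}   # the agent follows through; no sure loss
```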

Proposition 2 (Strong Dynamic Consistency Under Certainty via Naivety)   Let  be an arbitrary tree where  is a choice node, , and  is consistent with . Then  iff . [Proof.]

Intuitively, this means that (i) if the agent now considers a trajectory acceptable, it will continue to do so as time passes, and (ii) if it at any future point considers some plan continuation acceptable, its past self would agree. It follows immediately that all and only the strongly maximal terminal nodes are reachable by agents choosing naively using DSM (derived as Corollary 1). This gives us opportunism: the agent will never pick a plan that’s strictly dominated by some other available plan.

Under certainty, this result is unsurprising.[10] What is less obvious is whether this also holds under uncertainty. I will say that a Bayesian decision tree exhibits PIP-uncertainty just in case the probability of any given event does not depend on what the agent plans to do after the event has occurred. We can now state the next result.

Proposition 3 (Strong Dynamic Consistency Under PIP-Uncertainty via Naivety)
Let node  be non-terminal in decision tree , and plan  be consistent with . Assume Material Planning, Preference Stability, and Plan-Independent Probabilities. Then  iff .

Proof.  Lemma 3 establishes that  implies . To prove the converse, suppose that . Node  was either a choice node or a natural node. If it was a choice node, then it follows immediately from Proposition 2 that .

Now let  be a natural node. By Lemma 2,  under coherent extension. So by Theorem 37 of Bradley (2017),

for all .

Let  denote the continuation selected by plan  upon reaching choice node . Thus

for all .

By Preference Stability,

for all . (1)

Notice that this implies that

for all . (2)

And by Plan-Independent Probabilities, this is equivalent to

for all . (3)

By Material Planning,  holds iff

for all .

Using (2)-(3) this is equivalent to

for some  and all . (4)

Formally,  omits description of  for all . So, we can let  at all remaining realisations . Then (4) reduces to

. (The probability is nonzero as  is realised by assumption.)

This holds via (1), so  and, by Lemma 2,

We have thereby shown that a naive DSM agent is strongly dynamically consistent in a very broad class of cases. Although these results are restricted to PIP-uncertainty, this applies to agents with complete preferences too. It’s just a result of the forward-looking nature of naive choice. In the next section, I will drop the PIP restriction by employing a different dynamic choice principle.

2.4.  Sophisticated Choice

Sophistication is the standard principle in dynamic choice theory. It achieves dynamic consistency in standard cases by using backward induction to achieve the best feasible outcome. (This is also the method by which subgame perfect equilibria are found in dynamic games of perfect information.) However, backward induction is undefined whenever the prospects being compared are either incomparable or of equal value. Rabinowicz (1995) proposed a ‘splitting procedure’ to address this, and Asheim (1997) used a similar approach to refine subgame perfection in games.

According to this procedure, whenever there is a tie between multiple prospects, the agent will split the sequential decision problem into parts, where each part assumes that a particular prospect was chosen at that node. The agent then compares each part’s solution and picks its favourite as the solution to the grand decision problem. Intuitively, these agents will follow a plan as long as they lack “a positive reason for deviation” (Rabinowicz 1997). In other words, the agent will consider all the plans that would not make it locally act against its own preferences and, among those plans, proceed to pick the one with the best ex-ante outcome.

In the case of incomplete preferences, however, it turns out that strongly maximal sophisticated choice with splitting will not suffice to guarantee opportunism. The reason behind this is that strong maximality does not satisfy Set Expansion, and that backward induction makes local comparisons.[11]

Proposition 4   Strongly maximal sophisticated choice with splitting does not guarantee opportunism for Suzumura consistent agents. [Proof.]

DSM, however, will suffice.

Proposition 5 (Strong Dynamic Consistency Under Certainty via Sophistication)
Assume certainty and Suzumura consistency. Then DSM-based backward induction with splitting reaches a strongly maximal terminal node. [Proof.]

More importantly, we can guarantee this property under uncertainty even without PIP.

Proposition 6 (Strong Dynamic Consistency Under Uncertainty via Sophistication)
Let  be a Bayesian decision tree in which  is a non-terminal node,  is a successor, and  is consistent with . Assume Material Planning and Preference Stability. Then with DSM-based backward induction (DSM-BI),  is permissible at  iff  is permissible at .

Proof.  We begin by formalising DSM-BI. For tree , let  denote its terminal nodes,  its choice nodes, and  its natural nodes. Let  denote the set of plans available at  that are consistent with some associated plan continuation in . We then define the tree’s permissible set of plans, , recursively:

1.  for .
2.  where  , for .
3.  for .

The permissible set of plans is then given by . Notice that this is the set of plans consistent with backward induction using DSM, and that splitting is implicit since each pruning of the decision tree is set-valued.
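Under certainty, the recursion can be sketched as follows. This is a simplified rendering: plans are identified with the outcomes they reach, the tree has no natural nodes, and the previous-domination clause is folded into the pruning at each choice node:

```python
def dsm_bi(node, successors, consequence, weakly_pref):
    """DSM-based backward induction on a tree of choice nodes: the
    permissible set at a terminal node is its outcome; at a choice node it
    is the strongly maximal subset of the union of the children's
    permissible sets.  Splitting is implicit: the recursion is set-valued."""
    strict = lambda a, b: weakly_pref(a, b) and not weakly_pref(b, a)
    kids = successors.get(node, ())
    if not kids:  # terminal node
        return {consequence[node]}
    pool = set().union(*(dsm_bi(k, successors, consequence, weakly_pref)
                         for k in kids))
    return {p for p in pool if not any(strict(q, p) for q in pool)}

# On the single-souring tree, DSM-BI recovers exactly the ex-ante strongly
# maximal outcomes, so sophistication agrees with naive DSM here.
exts = [{'A': 2, 'A-': 1, 'B': 3}, {'A': 2, 'A-': 1, 'B': 0}]
wp = lambda x, y: all(v[x] >= v[y] for v in exts)
successors = {0: ('tA', 1), 1: ('tB', 'tAm')}
consequence = {'tA': 'A', 'tB': 'B', 'tAm': 'A-'}
assert dsm_bi(0, successors, consequence, wp) == {'A', 'B'}
```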

Now suppose that  for some .

Then  , where .

So necessarily . By construction of , we know that .

And since , we have .

Next, suppose that  for some . Then  . So, for any particular . Therefore  as needed.

Having established , we proceed to the converse. ()

Notice that the only nodes reachable via sophisticated choice are those that are consistent with some plan  which was initially permissible. Therefore, for any  that is reached, there must be some , consistent with  where , which satisfies .

Suppose  for some reachable . Whether  is a choice or a natural node, we know by construction of (2) and (3) that

.

Therefore, if  is reached, then the permissible plans at  must be continuations of plans that were permissible at . Hence .

Together with (), this establishes that  iff

This lets us achieve strong dynamic consistency even in cases where the probability of an event depends on what the agent plans to do after the event has occurred. An example of such a decision problem is Sequential Transparent Newcomb, as devised by Brian Skyrms and Gerard Rothfus (cf. Rothfus 2020b). So, even in this very general setting, our agent’s incomplete preferences aren’t hindering its opportunism.

2.5.  Resolute Choice

A final canonical principle of dynamic choice is resoluteness as introduced by McClennen (1990). An informal version of this principle is often discussed in the AI alignment community under the name ‘updatelessness’. Briefly, a resolute agent will, first, pick its favourite trajectory now under the assumption that this plan will continue to be implemented as the agent moves through the decision tree. And, second, the agent will continually implement that plan, even if this makes it locally choose counter-preferentially at some future node. It respects its ex-ante commitments.

Under certainty or uncertainty, this is the easiest principle with which to guarantee the invulnerability of agents with incomplete preferences (relative to those with complete preferences). By construction, the agent never implements a plan that was suboptimal from an earlier perspective. I will therefore omit formal derivations of dynamic consistency for resolute choosers.[13]

3.  The Trammelling Concern

A possible objection to the relevance of the results above is what I’ll call the Trammelling Concern. According to this objection, agents with incomplete preferences who adopt DSM as a dynamic choice rule will, in certain sequential choice problems, or over the course of sufficiently many and varied ones, eventually converge in behaviour to an agent with complete preferences using Optimality as a choice rule. This would be worrying since complete preferences and optimal choice resemble the kind of consequentialism that precludes Thornley-style corrigibility.

This section aims to quell such worries. I begin with a taxonomy of our agent’s possible doxastic attitudes towards decision trees. First, the set of non-terminal nodes of a given tree will either contain only choice nodes, or it will contain some natural nodes. Second, the structure of a given tree will either be entirely known to the agent, or it will not. If it is not known, the agent will either have beliefs about all its possible structures, or its credence function will not be defined over some possible tree structures. We will refer to these three cases as certainty, uncertainty, and unawareness about structure, respectively.[14] Finally, at least under unawareness, the possible tree structures may or may not include terminal nodes (prospects) that are also present in the tree structure currently being considered by the agent. I will say the prospects are new if so and fixed if not.

This table summarises the situations our agent might find itself in and its cells indicate whether trammelling can occur in each case. This is based on the arguments and results below. The conclusions for (1)-(2) are based on conceptual considerations in section 3.1. Cases (3)-(4) are not discussed explicitly since uncertain tree structures can simply be represented as certain tree structures with natural nodes. The conclusions for (5) and (7) are based on formal results in section 3.2. For the sake of brevity, the effects of natural nodes under unawareness are left for later work.

3.1.  Aware Choice

This section was co-authored with Eric Chen.

Let’s begin with an important distinction. As Bradley (2015) puts it, there’s a difference between "the choices that are permissible given the agent’s preferences, those that are mandatory and those that she actually makes" (p. 1). If an agent is indifferent between two options, for example, then it can be the case that (i) both are permissible, (ii) neither is mandatory, and (iii) a particular one is in fact chosen. One aspect of choice between incomparables that we need to preserve is that all were permissible ex-ante. The fact that one is in fact chosen, ex-post, is immaterial.

To see what this implies, consider first the case of certainty. Here we proved that DSM will result in only ex-ante strongly maximal plans being implemented (Proposition 2; Corollary 1; Proposition 5). Now consider the following toy case (Example 1). A Suzumura consistent agent can’t compare  and . It’s offered a choice between the two. Suppose it chooses . Does this constrain the agent to never pick  in future choices between  and ?

No. In fact, it’s a bit unclear what this would mean formally. Once we expand the decision tree to include future choices between  and  in some branches, everything boils down to the fact that all plans through this tree will result in either  or . And so any plan will be permissible. What DSM constrains is which terminal nodes are reached; not how the agent gets there. Let’s see how far this reasoning can get us.

Example 2   A Suzumura consistent agent’s preferences satisfy  and . It’s offered a choice between going ‘down’ for  or going ‘up’ to face a choice between  and . It happens to go ‘up’. Does this constrain the agent not to pick  over ?

This is a case of an agent ending up on some trajectory that is only consistent with one of its coherent extensions ('completions'). One might worry that, at this point, we can treat the agent as its unique extension, and that this denies us any kind of corrigible behaviour. But there is a subtle error here. It is indeed the case that, once the agent has reached a node consistent with only one extension, we can predict how it will act going forward via its uniquely completed preferences. But this is not worrisome. To see why, let’s look at two possibilities: one where the agent is still moving through the tree, and another where it has arrived at a terminal node.

First, if it is still progressing through the tree, then it is merely moving towards the node it picked at the start, by a process we are happy with. In Example 2, option  was never strongly maximal in comparison to  and . We knew, even before the agent went ‘up’, that it wouldn’t pick . So the agent is no more trammelled once it has gone ‘up’ than it was at the beginning.

For a more concrete case, picture a hungry and thirsty mule. For whatever reason, this particular mule is unable to compare a bale of hay to a bucket of water. Each one is then placed five metres from the mule in different directions. Now, even if the mule picks arbitrarily, we will still be able to predict which option it will end up at once we see the direction in which the mule is walking. But this is clearly not problematic. We wanted the mule to pick its plan arbitrarily, and it did.

Second, if the agent has reached the end of the tree, can we predict how it would act when presented with a new tree? No; further decisions just constitute a continuation of the tree. And this is just the case we described above. If it did not expect the continuation, then we are simply dealing with a Bayesian tree with natural nodes. In the case of PIP-uncertainty, we saw that DSM will once again only result in the ex-ante strongly maximal plans being implemented (Proposition 3).[15] Here it's useful to recall that plans are formalised using conditionals:

, where  and .

That is, plans specify what path to take at every contingency. The agent selects plans, not merely prospects, so no trammelling can occur that was not already baked into the initial choice of plans. The general observation is that, if many different plans are acceptable at the outset, DSM will permit the agent to follow through on any one of these; the ex-post behaviour will “look like” it strictly prefers one option, but this gets close to conflating what options are mandatory versus actually chosen. Ex-post behavioural trajectories do not uniquely determine the ex-ante permissibility of plans.

3.2.  Unaware Choice

The conceptual argument above does not apply straightforwardly to the case of unawareness. The agent may not even have considered the possibility of a certain prospect that it ends up facing, so we cannot simply appeal to ex-ante permissibility. This section provides some results on trammelling in this context. But, first, given the argument about Bayesian decision trees under certain structure, one may ask: what, substantively, is the difference between the realisation of a natural node and awareness growth (if all else is equal)? Would the behaviour of our agent just coincide? I don't think so.

Example 3   Consider two cases. Again, preferences satisfy  and .

Up top, we have a Bayesian tree. On the bottom, we have awareness growth, where each tree represents a different awareness context. In both, let’s suppose our agent went ‘up’ at node 0. In the Bayesian case, let’s also suppose the natural node (1) resolves such that the agent reaches node 2. In both diagrams, the agent ends up having to pick  or .

The permissible plans at the outset of the Bayesian tree are those that lead to  or to  (since  is strictly dominated by ). But in the bottom case, it seems that our agent faces two distinct decision problems: one between  and ; one between  and . In the first problem, both options are permissible. And likewise in the second. So, the difference between these situations is that, in the latter case, our agent faces a new decision problem.

In the Bayesian case, our agent might end up at a future node in which the only available options are incomparable but nevertheless reliably picks one. But the reason for this is that another node, also available at the moment the agent picked its plan, was strictly preferred to one of these. The agent’s decision problem was between , and . The arbitrary choice was between  and . This is as it should be; there is no trammelling.

However, in the case of unawareness, the agent initially chose between  and  and now it must choose between  and . It never got to compare  to  to . It compared  to  and, separately,  to . Therefore, if it reliably picks  over , for the sake of opportunism, one could see this as a form of trammelling. I now turn to this issue.

For brevity, I will focus on cases 4 and 5 from the taxonomy above. In these worlds, all decision trees have choice nodes and terminal nodes, but they lack natural nodes. And our agent is unaware of some possible continuations of the tree it is facing. That is, it’s unaware of—places no credence on—the possibility that a terminal node is actually another choice node. But once it in fact gets to such a node, it will realise that it wasn’t terminal. There are then two ways this could play out: either the set of available prospects expands when the tree grows, or it doesn’t. Let’s consider these in turn. As we’ll see, the most salient case is under a certain form of expansion (section 3.2.3., Example 6).

3.2.1.  Fixed Prospects (No Trammelling)

Here I’ll show that when prospects are fixed, opportunism and non-trammelling are preserved. We begin with an illustrative example.

Example 4   The agent’s preferences still satisfy  and .

The agent initially sees the top tree. DSM lets it pick  or , arbitrarily. Suppose it in fact goes for . Then its awareness grows and the tree now looks like the bottom one. It can stay put (go ‘down’ to ) or move ‘up’ and choose between  and . DSM still says to pick, arbitrarily, any path leading to  or to . Suppose it goes for . There—no trammelling.

Here’s the lesson. Our agent will embark on a path that’ll lead to a node that is DSM with respect to the initial tree. When its awareness grows, that option will remain available in the new tree. By Set Contraction (Lemma 4) this option will also remain DSM in the new tree. So, picking it will still satisfy opportunism. And regarding trammelling, it can still pick entirely arbitrarily between all initially-DSM nodes. DSM constrains which prospects are reached; not how to get there. We can state this as follows.

Proposition 7   Assume no natural nodes. Then, DSM-based choice under awareness growth will remain arbitrary whenever the set of available prospects is fixed. [Proof.]

3.2.2.  New Prospects (No Trammelling)

Now consider the case where the set of available prospects expands with the tree. It’s of course possible that our agent gets to a terminal node that isn’t strongly maximal according to the new tree. This could happen if it gets to an initially-acceptable terminal node and then realises that a now-inaccessible branch would have let it access a prospect that’s strictly preferred to all those in the initial tree.[16] But, importantly, this applies to agents with complete preferences too. It’s simply an unfortunate feature of the environment and the agent’s epistemic state. Complete preferences don’t help here.

But there is a class of cases in which completeness, at first glance, seems to help.

Example 5   The agent’s preferences satisfy  and  (unlike before).

The agent sees the tree on the left. DSM lets it pick  or , arbitrarily. Suppose it in fact goes for . Now its awareness grows and the tree looks like the one on the right. It can stay put (go ‘down’ to ) or move ‘up’ to get .

Now, none of the available options are DSM in the full tree, but the agent still has to pick one. This, on its own, is not problematic since it could just be an unfortunate aspect of the situation (as dismissed above). But in this particular case, because an agent with complete preferences would satisfy transitivity, it would never go for  in the first place. This kind of issue wouldn’t occur if our agent’s incomplete preferences were transitive, rather than just strongly acyclic.

So let’s impose transitivity. Then our agent wouldn’t pick  in the first place; in general, it will now remain opportunistic in the same way that its complete counterparts are. That’s because it would never initially choose arbitrarily between options that all of its coherent extensions strictly rank, and so would never choose in a way it later recognises as a foreseeable mistake once more options come into view. To operationalise this, I’ll say that an agent facing deterministic decision trees fails to be opportunistic under awareness growth iff

1. The agent fails to reach a strongly maximal prospect in the grand tree, and
2. A strongly maximal prospect in the grand tree was available in the initial tree.

This agent is opportunistic otherwise.[17] We can now state the associated result.

Proposition 8  Assume transitive preferences and no natural nodes. Then, under all possible forms of awareness growth, naive choice via DSM is opportunistic. [Proof.]

That gives us opportunism, but what about non-trammelling? Will our agent ever have to pick one option over another despite them being incomparable? Yes, in some cases.

3.2.3.  New Prospects (Bounded Trammelling)

Example 6   The agent’s preferences satisfy  and .

It faces the same trees as in the previous example. DSM first lets it pick  or , arbitrarily. Suppose it goes for . Now its awareness grows and the tree looks like the one on the right. It can stay put (go ‘down’ to ) or move ‘up’ to get . Since only  is DSM in the global tree, our agent might reliably pick  over , despite them being incomparable. Trammelled.

This is a special property of unawareness. Thankfully, however, we can bound the degree of trammelling. And I claim that this can be done in a rather satisfying way. To do this, let’s formally define an extension of DSM that will give us opportunism under awareness growth. Let  denote the initial tree, with initial node , and  the tree available after awareness growth, with initial node . And let  denote the appending of  on .

(Global-DSM)    iff

(i)   and

(ii)  .

Notice that G-DSM doesn’t look at what happens at other previously-terminal nodes after awareness grows. These are unreachable, and attending to them would leave the choice rule undefined in some cases. (This holds with complete preferences too.) This rule naturally extends opportunism to these cases: never pick something when you knew you could previously have had something better. We can now state some results.

Proposition 9 (Non-Trammelling I)   Assume transitive preferences and no natural nodes. Then, under all possible forms of awareness growth, globally-DSM naive choice will remain arbitrary whenever the available prospects are (i) mutually incomparable and (ii) not strictly dominated by any prospect. [Proof.]

That gives us a sufficient condition for arbitrary choice. To find out how often this will be satisfied, we define ‘comparability classes’ as subsets of prospects within which all are comparable and between which none are. A comparability-based partitioning of the prospects is possible when comparability is transitive.[18] The following results then follow.

Proposition 10 (Bounded Trammelling I)   Assume that comparability and preference are transitive and that there are no natural nodes. Then, under all possible forms of awareness growth, there are at least as many prospects between which globally-DSM naive choice is guaranteed to be arbitrary as there are comparability classes.

Proof.  We first partition the (possibly uncountable) set of prospects  into (possibly uncountable) subsets of (possibly uncountable) mutually comparable prospects.

Given transitivity,  is an equivalence relation on . Then, for any , we construct an equivalence class: . Call this set the comparability class of .

This lets  form a partition of .

We identify the class-optimal prospects as follows: .

Let . Suppose, for contradiction, that .

Since  is a choice function,  for any set . Then .

Then, by G-DSM, there exists some  such that

(i’)   or

(ii’)  .

Given transitivity, and since all elements of  are mutually incomparable or indifferent, condition (i’) does not hold. Therefore .

By transitivity, this implies that  for some . (1)

But since  for some , we know that . (2)

This establishes that  whenever .

The set of all prospects satisfying this is of size , as needed.

Corollary 2 (Bounded Trammelling II)  Assume that comparability and preference are transitive and that there are no natural nodes. Then, whenever  class-optimal prospects are available, choice will be arbitrary between at least  prospects. [Proof.]

3.3.  Discussion

It seems we can’t guarantee non-trammelling in general and between all prospects. But we don’t need to guarantee this for all prospects to guarantee it for some, even under awareness growth. Indeed, as we’ve now shown, there are always prospects with respect to which the agent never gets trammelled, no matter how many choices it faces. In fact, whenever the tree expansion does not bring about new prospects, trammelling will never occur (Proposition 7). And even when it does, trammelling is bounded above by the number of comparability classes (Proposition 10).

And it’s intuitive why this would be: we’re simply picking out the best prospects in each class. For instance, suppose prospects were representable as pairs  that are comparable iff the -values are the same, and then preferred to the extent that  is large. Then here’s the process: for each value of , identify the options that maximise . Put all of these in a set. Then choice between any options in that set will always remain arbitrary; never trammelled.
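The process just described can be sketched directly. The representation below is the hypothetical one from the paragraph above: prospects are pairs, comparable iff their first coordinates match, and ranked by the second coordinate within a class.

```python
from collections import defaultdict

def class_optimal(prospects):
    """For each comparability class (each x-value), keep the prospects
    that maximise y. Choice among the returned set stays arbitrary."""
    classes = defaultdict(list)
    for x, y in prospects:
        classes[x].append(y)
    return {(x, y) for x, ys in classes.items()
            for y in ys if y == max(ys)}
```

With three comparability classes, the agent is guaranteed at least three prospects between which its choice remains arbitrary, as in Proposition 10.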

Three caveats are worth noting. First, sceptical readers may not agree with our initial treatment of (non-)trammelling under aware choice (i.e., known tree structures, section 3.1). That section is based on a conceptual argument rather than formal results, so it should be evaluated accordingly. However, at least some of the results from the section on unawareness should allay many reasonable worries here. Whenever non-trammelling is satisfied under awareness growth with Global-DSM, it will likewise be satisfied, mutatis mutandis, by DSM when the tree structure is known.

Second, we've only considered part of the taxonomy described above. Due to time constraints, we left out discussion of unaware choice in trees with natural nodes. We suspect that extending the analysis to these kinds of cases would not meaningfully affect the main conclusions, but we hope to look into this in later work. Finally, we haven't provided a full characterisation of choice under unawareness. The literature hasn't satisfactorily achieved this even in the case of complete preferences, so this falls outside the scope of this article.

Conclusion

With the right choice rule, we can guarantee the invulnerability—unexploitability and opportunism—of agents with incomplete preferences. I’ve proposed one such rule, Dynamic Strong Maximality, which nevertheless doesn’t ask agents to pick against their preferences. What’s more, the choice behaviour this rule induces is not representable as the agent having implicitly completed its preferences. Even under awareness growth, the extent to which the rule can effectively complete an agent’s implied preferences is permanently bounded above. And with the framework provided, it’s possible to make statements about which kinds of completions are possible, and in what cases.

This article aims to be somewhat self-contained. In future work, I’ll more concretely consider the implications of this for Thornley’s Incomplete Preference Proposal. In general, however, I claim that worries about whether a competent agent with preferential gaps would in practice (partially) complete its preferences need to engage with the particulars of the situation: the preference structure, the available decision trees, and so on. Full completion won't occur, so the relevant question is whether preferential gaps will disappear in a way that matters.

References

Asheim, Geir. 1997. “Individual and Collective Time-Consistency.” The Review of Economic Studies 64, no. 3: 427–43.

Bales, Adam. 2023. “Will AI avoid exploitation? Artificial general intelligence and expected utility theory.” Philosophical Studies.

Bossert, Walter and Kotaro Suzumura. 2010. Consistency, Choice, and Rationality. Cambridge, MA: Harvard University Press.

Bradley, Richard and Mareile Drechsler. 2014. “Types of Uncertainty.” Erkenntnis 79, no. 6: 1225–48.

Bradley, Richard. 2015. “A Note on Incompleteness, Transitivity and Suzumura Consistency.” In: Binder, C., Codognato, G., Teschl, M., Xu, Y. (eds) Individual and Collective Choice and Social Welfare. Studies in Choice and Welfare. Springer, Berlin, Heidelberg.

Bradley, Richard. 2017. Decision Theory with a Human Face. Cambridge: Cambridge University Press.

Gustafsson, Johan. 2022. Money-Pump Arguments. Elements in Decision Theory and Philosophy. Cambridge: Cambridge University Press.

Hammond, Peter J. 1988. “Consequentialist foundations for expected utility.” Theory and Decision 25, 25–78.

Huttegger, Simon and Gerard Rothfus. 2021. “Bradley Conditionals and Dynamic Choice.” Synthese 199 (3-4): 6585–6599.

Laibson, David and Leeat Yariv. 2007. “Safety in Markets: An Impossibility Theorem for Dutch Books.” Working Paper, Department of Economics, Harvard University.

McClennen, Edward. 1990. Rationality and Dynamic Choice: Foundational Explorations. Cambridge: Cambridge University Press.

Rabinowicz, Wlodek. 1995. “To Have One’s Cake and Eat It, Too: Sequential Choice and Expected-Utility Violations.” The Journal of Philosophy 92, no. 11: 586–620.

Rabinowicz, Wlodek. 1997. “On Seidenfeld’s Criticism of Sophisticated Violations of the Independence Axiom.” Theory and Decision 43, 279–292.

Rothfus, Gerard. 2020a. “The Logic of Planning.” Doctoral dissertation, University of California, Irvine.

Rothfus, Gerard. 2020b. “Dynamic consistency in the logic of decision.” Philosophical Studies 177:3923–3934.

Suzumura, Kotaro. 1976. “Remarks on the Theory of Collective Choice.” Economica 43: 381–390.

Thornley, Elliott. 2023. “The Shutdown Problem: Two Theorems, Incomplete Preferences as a Solution.” AI Alignment Awards.

Appendix: Proofs

Proposition 1   Suzumura consistent agents myopically applying strong maximality are unexploitable under certainty.

Proof.  By myopic strong maximality, whenever an agent is presented with a set of alternatives containing a prospect strictly preferred to all others, it is chosen. Suppose such a Suzumura consistent agent were forcibly money pumped: beginning with some  and ending with a strictly dispreferred , with each choice resulting from a strict preference. Then there must be a set of prospects satisfying  where . This trivially implies that . By Strong Acyclicity, we have , a contradiction.
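The idea behind this proof can be illustrated computationally: a forcible money pump requires a chain of strict preferences whose endpoint is strictly dispreferred to its start, i.e., a cycle in the strict-preference graph, which Strong Acyclicity rules out. The encoding below (preferences as a set of ordered pairs) is a hypothetical one of mine.

```python
def has_strict_cycle(strict_prefs):
    """Return True iff the directed graph of strict preferences contains
    a cycle, where an edge (a, b) means a is strictly preferred to b."""
    graph = {}
    for a, b in strict_prefs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def dfs(node):
        if node in done:
            return False
        if node in visiting:        # a chain leads back to its own start
            return True
        visiting.add(node)
        if any(dfs(nxt) for nxt in graph[node]):
            return True
        visiting.remove(node)
        done.add(node)
        return False

    return any(dfs(n) for n in list(graph))
```

An acyclic strict preference admits no such chain, so no sequence of myopically strongly maximal choices can end strictly below its starting point.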

Lemma 1 (Dynamic Consistency Under Certainty via Naivety)   Let  be an arbitrary tree where  is a choice node, , and  is consistent with . Then  implies  .

Proof.  There are no natural nodes, so for any nodes  and  consistent with plan , we have . Therefore, by DSM,  is equivalent to

and

.

Noting that  and that Strong Maximality satisfies Set Contraction (Bradley 2015), we know . So

. (1)

Any plan unavailable at  is also unavailable at , which implies that

. (2)

By DSM, (1)-(2) is equivalent to

Proposition 2 (Strong Dynamic Consistency Under Certainty via Naivety) Let  be an arbitrary tree where  is a choice node, , and  is consistent with . Then  iff .

Proof.  There are no natural nodes, so for any nodes  and  consistent with plan , we have . Suppose that  while .

Suppose . Then by Set Contraction, . And since there are no natural nodes, . Hence , a contradiction.

Suppose . By Theorem 2 of Bradley (2015), Strong Maximality is decisive. Therefore, there must be another plan  that was initially more choiceworthy; that is, . Again, . This shows that . Hence , a contradiction.

We have a contradiction, so  . Using Lemma 1, we have thereby shown that  iff

Corollary 1   Under certainty, all and only strongly maximal terminal prospects are reachable by naive DSM.

Proof.  Recall that, under certainty, a plan can be identified by the conjunction of propositions describing the information state at each node of its continuation. A node always logically entails its preceding nodes, so a plan under certainty can simply be identified by its unique terminal node. Let  denote the set of propositions at the terminal nodes.

We can thereby establish, for part (a) of DSM, that . Part (b) follows trivially at the initial node , so . Proposition 2 establishes that under certainty, a plan is DSM iff its immediate continuation is DSM. By induction on continuation-consistent nodes, all Strongly Maximal terminal prospects are reachable by DSM. And since  for all nodes  that are consistent with a plan , we know that DSM reaches only Strongly Maximal terminal prospects.

Lemma 2   Under certainty and the coherent extension of a (Strongly Acyclic) preference relation, Dynamic Strong Maximality reduces to Optimality.

Proof.  By Theorem 1 of Bradley (2015), when a Strongly Acyclic preference relation is completed, Strong Maximality coincides with Optimality. DSM becomes equivalent to:  iff

(a’)  Plan  is Optimal at . That is, .

(b’)  No Optimal plan was previously better: .

Let  and suppose that . Then, because we have , the Set Expansion property of Optimality implies , i.e., that . These -Optimal plans were chosen arbitrarily, so this suffices to show that . Hence (b’) follows from (a’) under certainty.

Lemma 3 (Dynamic Consistency Under PIP-Uncertainty via Naivety)   Let  be a non-terminal node in Bayesian decision tree , and plan  be consistent with . Assume Material Planning, Preference Stability, and Plan-Independent Probabilities. Then  implies  .

Proof.  Suppose that . Node  is either a choice node or a natural node. If it is a choice node, then it follows immediately from Lemma 1 that . Now let  be a natural node.

Then by DSM, . By Lemma 2,  under coherent extension. Therefore, by Theorem 37 of Bradley (2017),

for all . (1)

Suppose, for contradiction, that  for some . This implies that

, i.e., that

.

Let  denote the continuation selected by plan  upon reaching choice node .

Then we can re-write the above as .

And by Preference Stability, . (2)

By applying Material Planning to (1), we get, for all ,

and, by assuming Plan-Independent Probabilities,

. (3)

Let  denote the set of completions according to which (3) holds.

Then  for all . That is because, if this failed to hold for any , then plan  could be altered such that . But such Pareto improvements are unavailable since  is DSM at node .

In particular,  as . But since , condition (2) implies that , a contradiction.

Proposition 4   Strongly maximal sophisticated choice with splitting does not guarantee opportunism for Suzumura consistent agents.

Proof.  The proof is by counterexample. Consider a Suzumura consistent agent facing the following simple decision tree with preferences satisfying  and .

Proceeding via backward induction, the only permissible choice at node 1 is . The agent then compares  and  at node 0. These are incomparable so the agent splits the problem and gets two partial solutions:  and . The strongly maximal solutions are . Although  is indeed permissible, the agent could incidentally end up with  despite having been able to reach , a strictly preferred alternative.

Remark. Decision trees lack a natural assignment of the ‘default’ outcome; i.e., what the agent ‘starts out with’. In this case we can think of the agent as starting with , and choosing whether to engage with a decision problem by going ‘up’ at node 0. Then we can claim that it is permissible, according to strongly maximal sophistication with splitting, for the agent to stay put at a strictly dispreferred node. The agent is therefore not opportunistic. It is worth noting that  is also impermissible according to planning-DSM as described above. The only permissible plans under DSM are . But since  dominates , a DSM agent is nevertheless opportunistic: it could not even incidentally pick a strictly dominated prospect.

Lemma 4   DSM satisfies Set Contraction.

Proof.  Recall Set Contraction: if  and  then . Now let  be sets of plans in tree . Suppose that  and . Then by DSM, . By Theorem 3 of Bradley (2015),  satisfies Set Contraction, so

. (1)

By DSM, . And since , we also know that

. (2)

By DSM, (1)-(2) is equivalent to

Proposition 5 (Strong Dynamic Consistency Under Certainty via Sophistication)   Assume certainty and Suzumura consistency. Then DSM-based backward induction with splitting (DSM-BIS) reaches a strongly maximal terminal node.

Proof.  We first establish that every strongly maximal terminal node is preserved under some split of DSM-BIS. Let  denote the terminal nodes of a tree and denote by  the number of final choice nodes. For any such node , its associated terminal nodes are . By Set Contraction, we know that for any  we have . And since , this more generally implies that

. (1)

Let . Whenever  for some , the splitting procedure will induce at least  separate trees. Each will initially preserve one element of the DSM set of terminal nodes following . We therefore know from (1) that, for any , there is some split which will initially preserve . The splitting procedure is repeatedly nested, as needed, within each subtree created via DSM-BIS elimination, so by Set Contraction (Lemma 4), this likewise applies to all subsequent subtrees. With  denoting possible continuations under split , we can now claim that

.

Let  denote the partial solutions under split . Then since , this establishes that . And letting , we can state

, i.e. . (2)

Because we also have , Set Contraction implies that

. (3)

DSM-BIS then selects among . Since , we know from (2) that

.

So by Set Contraction, . Clearly,  since  are all the terminal nodes. Therefore . This establishes that

, (4)

i.e., that all strongly maximal terminal nodes are DSM at . To show the other direction, suppose for contradiction that .

Since  is not a strongly maximal terminal node, we know by the decisiveness of strong maximality that . By (3)-(4), this  must be a member of both  and . So by condition (b) of DSM, , a contradiction.

Therefore , which with (4) implies that

Proposition 7   Assume no natural nodes. Then, DSM-based choice under awareness growth will remain arbitrary whenever the set of available prospects is fixed.

Proof.  Let  denote the initially available prospects. Upon reaching  the agent’s awareness grows and it faces  . Suppose  .

By Set Contraction, .

Choice was arbitrary within , and will be arbitrary in .

Therefore, we know that for any  between which choice was arbitrary, choice will remain arbitrary between  and  whenever

Proposition 8   Assume transitive preferences and no natural nodes. Then, under all possible forms of awareness growth, naive choice via G-DSM is opportunistic.

Proof.  Let  denote the set of all prospects over which the agent’s preference relation is defined (i.e., candidates for terminal nodes). Let  denote the terminal nodes reachable in the initial tree ;  the terminal nodes reachable (once awareness grows) in the new tree  at the previously chosen node ; and  the set of all terminal nodes (reachable or not) in the grand tree .

Since there are no natural nodes,  and  according to the agent’s doxastic state at those nodes. By Theorem 1 of Bradley (2015), strong maximality and maximality coincide under transitivity. We can therefore use maximality going forward.

Node  is, by construction, chosen via DSM in . Therefore, by Proposition 2, DSM guarantees that