This is the twelfth and final post in the Cartesian Frames sequence. Read the first post here.
Up until now, we have (in the examples) mostly considered agents making a single choice, rather than acting repeatedly over time.
The actions, environments, and worlds we've considered might be extended over time. For example, imagine a prisoner's dilemma where "cooperating" requires pushing a button every day for five years.
However, our way of discussing Cartesian frames so far would treat "push the button every day for five years" as an atomic action, a single element a∈A.
Now, we will begin discussing how to use Cartesian frames to explicitly represent agents passing through time. Let us start with a basic example.
1. Partial Observability
Consider a process where two players, Yosef and Zoe, collaboratively choose a three-digit binary number. Yosef first chooses the first digit, then Zoe chooses the second digit, then Yosef chooses the third digit. The world will be represented by the three-digit number. The Cartesian frame from the perspective of Yosef looks like this:
$$C_0=\begin{pmatrix}
000 & 010 & 000 & 010 \\
001 & 011 & 001 & 011 \\
000 & 011 & 000 & 011 \\
001 & 010 & 001 & 010 \\
100 & 110 & 110 & 100 \\
101 & 111 & 111 & 101 \\
100 & 111 & 111 & 100 \\
101 & 110 & 110 & 101
\end{pmatrix}.$$
Here, $C_0=(A_0,E_0,\cdot_0)$ is a Cartesian frame over $W_0=\{000, 001, 010, 011, 100, 101, 110, 111\}$.
The four possible environments from left to right represent Zoe choosing 0, Zoe choosing 1, Zoe copying the first digit, and Zoe negating the first digit.
The eight possible agents can be broken up into two groups of four. In the top four possible agents, Yosef chooses 0 for the first digit, while in the bottom four, he chooses 1. Within each group, the four possible agents represent Yosef choosing 0 for the third digit, choosing 1 for the third digit, copying the second digit, and negating the second digit.
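As a sanity check on the matrix, here is a minimal Python sketch that rebuilds it from the policies just described. The names `ZOE`, `THIRD`, and `build_c0` are hypothetical and chosen only for this illustration; worlds are encoded as three-character strings.

```python
# Zoe's four policies: each maps the first digit to the second digit.
ZOE = [
    lambda d1: 0,       # always choose 0
    lambda d1: 1,       # always choose 1
    lambda d1: d1,      # copy the first digit
    lambda d1: 1 - d1,  # negate the first digit
]

# Yosef's four policies for the third digit, as a function of the second digit.
THIRD = [
    lambda d2: 0,       # always choose 0
    lambda d2: 1,       # always choose 1
    lambda d2: d2,      # copy the second digit
    lambda d2: 1 - d2,  # negate the second digit
]


def build_c0():
    """Rows are Yosef's eight agents, columns are Zoe's four environments."""
    rows = []
    for d1 in (0, 1):            # Yosef's choice of first digit
        for third in THIRD:      # Yosef's policy for the third digit
            row = []
            for zoe in ZOE:
                d2 = zoe(d1)
                d3 = third(d2)
                row.append(f"{d1}{d2}{d3}")
            rows.append(tuple(row))
    return rows


C0 = build_c0()
for row in C0:
    print(" ".join(row))
```

The rows come out in the same order as in the matrix: the first four have first digit 0 and the last four have first digit 1.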
Consider the three partitions $W_1$, $W_2$, and $W_3$ of $W_0$ representing the first, second, and third digits respectively. $W_i=\{w_i^0,w_i^1\}$, where $w_1^0=\{000, 001, 010, 011\}$, $w_1^1=\{100, 101, 110, 111\}$, $w_2^0=\{000, 001, 100, 101\}$, $w_2^1=\{010, 011, 110, 111\}$, $w_3^0=\{000, 010, 100, 110\}$, and $w_3^1=\{001, 011, 101, 111\}$.
Clearly, by the definition of observables, $W_2$ is not observable in $C_0$. But there is still a sense in which this does not tell the whole story. Yosef can observe $W_2$ for the purpose of deciding the third digit, but can't observe $W_2$ for the purpose of deciding the first digit.
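The claim that the second digit is not observable in the original frame can be checked by brute force. The sketch below assumes the definition of observables from earlier in the sequence, as I read it: a set S of worlds is observable if, for every pair of agents a0 and a1, some agent matches a0 in every environment where the outcome lands in S and matches a1 in every environment where it does not. A partition is then treated as observable when each of its parts is. The function names are hypothetical.

```python
from itertools import product


def build_c0():
    """Yosef's frame: 8 agent rows by 4 environment columns of 3-digit worlds."""
    zoe = [lambda d1: 0, lambda d1: 1, lambda d1: d1, lambda d1: 1 - d1]
    third = [lambda d2: 0, lambda d2: 1, lambda d2: d2, lambda d2: 1 - d2]
    return [
        tuple(f"{d1}{z(d1)}{t(z(d1))}" for z in zoe)
        for d1 in (0, 1)
        for t in third
    ]


WORLDS = ["".join(bits) for bits in product("01", repeat=3)]


def observes_set(frame, s):
    """Assumed definition: every pair (a0, a1) has an 'if S then a0 else a1' agent."""
    def merges(a, a0, a1):
        # a must agree with a0 where its outcome is in s, and with a1 elsewhere.
        return all(w == (w0 if w in s else w1) for w, w0, w1 in zip(a, a0, a1))

    return all(
        any(merges(a, a0, a1) for a in frame)
        for a0, a1 in product(frame, repeat=2)
    )


def observes_digit(frame, digit):
    """Check the two-part partition of WORLDS by the value of one digit (0-indexed)."""
    parts = [{w for w in WORLDS if w[digit] == bit} for bit in "01"]
    return all(observes_set(frame, part) for part in parts)


C0 = build_c0()
second_digit_observable = observes_digit(C0, 1)
print("second digit observable in C0:", second_digit_observable)
```

The check fails as soon as a0 and a1 disagree on the first digit: any single agent fixes its own first digit, so it cannot match both.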
There are actually many ways to express this fact, but I want to draw attention to one specific way to express this partial observability: $\text{External}_{W_1}(C_0)$ can observe $W_2$.
Indeed, we have
$$\text{External}_{W_1}(C_0)\simeq C_1=\begin{pmatrix}
000 & 010 & 100 & 110 \\
000 & 010 & 100 & 111 \\
000 & 010 & 101 & 110 \\
000 & 010 & 101 & 111 \\
000 & 011 & 100 & 110 \\
000 & 011 & 100 & 111 \\
000 & 011 & 101 & 110 \\
000 & 011 & 101 & 111 \\
001 & 010 & 100 & 110 \\
001 & 010 & 100 & 111 \\
001 & 010 & 101 & 110 \\
001 & 010 & 101 & 111 \\
001 & 011 & 100 & 110 \\
001 & 011 & 100 & 111 \\
001 & 011 & 101 & 110 \\
001 & 011 & 101 & 111
\end{pmatrix}.$$
It may seem counter-intuitive that when you externalize $W_1$, and thus take some control out of the hands of the agent, you actually end up with more possible agents. This is because the agent now has to specify what the third digit is, not only as a function of the second digit, but also as a function of the first digit. The agent could have specified the third digit as a function of the first digit before, but some of the policies would have been identical to each other.
The four possible environments of $C_1$ specify the first two digits, while the 16 possible agents represent all of the ways to have the third digit be a function of those first two digits. It is clear that $W_2$ is observable in $C_1$.
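Both claims can be sketched in Python under two assumptions that come from my reading of earlier posts, not from this one. First, externalizing a partition groups agents by which part each of their outcomes lands in, makes the new agents all ways of picking one old agent per group, and makes the new environments pairs of a group and an old environment. Second, observability is defined as in the brute-force check: for every pair of agents, some agent matches the first on S and the second off S. All function names are hypothetical.

```python
from itertools import product


def build_c0():
    """Yosef's original frame: 8 agent rows by 4 environment columns."""
    zoe = [lambda d1: 0, lambda d1: 1, lambda d1: d1, lambda d1: 1 - d1]
    third = [lambda d2: 0, lambda d2: 1, lambda d2: d2, lambda d2: 1 - d2]
    return [
        tuple(f"{d1}{z(d1)}{t(z(d1))}" for z in zoe)
        for d1 in (0, 1)
        for t in third
    ]


def build_c1():
    """Environments fix the first two digits; agents are all maps to a third digit."""
    prefixes = ["00", "01", "10", "11"]
    return [
        tuple(p + str(d3) for p, d3 in zip(prefixes, thirds))
        for thirds in product((0, 1), repeat=4)
    ]


def externalize(frame, digit):
    """Assumed construction of External for the partition by one digit."""
    blocks = {}
    for a in frame:
        # Agents are equivalent when they agree on that digit in every environment.
        blocks.setdefault(tuple(w[digit] for w in a), []).append(a)
    # New agent: one representative per block. New environment: (block, old env).
    return [
        tuple(w for chosen in choice for w in chosen)
        for choice in product(*blocks.values())
    ]


def collapse(frame):
    """Drop duplicate rows and columns, then sort both.

    Not a general isomorphism test: it works here because every entry in a
    column shares the same two-digit prefix, so columns sort consistently.
    """
    rows = list(dict.fromkeys(frame))
    cols = sorted(dict.fromkeys(zip(*rows)))
    return sorted(zip(*cols))


def observes_digit(frame, digit):
    """Assumed definition of observability, applied to each part of the partition."""
    worlds = ["".join(bits) for bits in product("01", repeat=3)]
    for bit in "01":
        s = {w for w in worlds if w[digit] == bit}
        for a0, a1 in product(frame, repeat=2):
            if not any(
                all(w == (w0 if w in s else w1) for w, w0, w1 in zip(a, a0, a1))
                for a in frame
            ):
                return False
    return True


EXT = externalize(build_c0(), 0)   # externalize the first digit
C1 = build_c1()
same_after_collapse = collapse(EXT) == collapse(C1)
second_digit_observable = observes_digit(C1, 1)
print(len(EXT), "agents,", len(EXT[0]), "environments before collapsing")
print("matches C1 after collapsing:", same_after_collapse)
print("second digit observable in C1:", second_digit_observable)
```

Before collapsing, the externalized frame has eight environments (two groups times Zoe's four policies); once the first digit is fixed, pairs of Zoe's policies produce identical columns, which is why only four distinct environments remain.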
This gives us a generic way to define a type of partial observability:
Definition: Given a Cartesian frame C over W, and partitions V and T of W, we say <