This is the fifth post in the Cartesian frames sequence. Read the first post here.
Up until this point, we have only been working with Cartesian frames over a fixed world W. Now, we are going to start talking about Cartesian frames over different worlds.
1. Functors from Functions Between Worlds
In the Cartesian frames framework, a world is a set of possible worlds w that can all potentially occur in the same frame.
I find it useful to think about "different worlds" W and V in the case where W and V are different world models that carve up a situation in two different ways. W might be a refined world model, one that describes a situation in more detail; while V is a coarser model of the same situation that elides some distinctions in W.
Returning to an example from "Biextensional Equivalence," W={w0,w1,w2,w3,w4,w5,w6,w7} could be a world model that includes details about what the agent is thinking (G for a thought about the color green, R for red), as shown in
$$
C_0 = \begin{array}{c|cc}
 & S & B \\ \hline
GH & w_0 & w_1 \\
GW & w_2 & w_3 \\
RH & w_4 & w_5 \\
RW & w_6 & w_7
\end{array},
$$
while V={w8,w9,w10,w11} could be a world model that leaves out this information, representing the same real-world situation with the frame
$$
C_1 = \begin{array}{c|cc}
 & S & B \\ \hline
GH & w_8 & w_9 \\
GW & w_{10} & w_{11} \\
RH & w_8 & w_9 \\
RW & w_{10} & w_{11}
\end{array}.
$$
To move between frames like C0 and C1 and compare their properties, we will need a way to send agents and environments of frames defined over one world, to agents and environments of frames over an entirely different world. Functors will allow us to do this.
Definition: Given two sets W and V, and a function p:W→V, let p∘:Chu(W)→Chu(V) denote the functor that sends the object (A,E,⋅)∈Chu(W) to the object (A,E,⋆)∈Chu(V), where a⋆e=p(a⋅e), and sends the morphism (g,h) to the morphism with the same underlying functions, (g,h).
To visualize this functor, you can imagine Chu(W) as a graph, with matrices as nodes (in the finite case) and arrows representing morphisms. Chu(V) is another graph made of matrices and arrows. To move each frame C from Chu(W) to Chu(V), we use p to entrywise replace the possible worlds in C's matrix with elements of V, without changing the functional properties of the rows and columns; and then we move all the arrows from Chu(W) to Chu(V), which is possible because no functional properties of the original matrices were lost. (Frames and morphisms may or may not be added when we move to Chu(V).)
In the cases where we say "W is a refined version of V" or "V is a coarse version of W," all we mean is that the function p:W→V is surjective.
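For finite frames, the entrywise replacement described above can be sketched in a few lines of Python. This is only an illustration under assumptions: I store a frame as a lookup table from (agent, environment) pairs to worlds, I read the rows of C0 as GH, GW, RH, RW and its columns as S, B, and the table `P_TABLE` is my guess at the coarsening p that forgets the thought G/R. None of these names come from the framework itself.

```python
from typing import Callable, Dict, Tuple

# A finite frame stored as its evaluation table: frame[(a, e)] is the world a . e.
# (This encoding is an assumption of the sketch, not part of the definition.)
Frame = Dict[Tuple[str, str], str]


def pushforward(p: Callable[[str], str], frame: Frame) -> Frame:
    """Image of a frame under the functor: same agents and environments, a * e = p(a . e)."""
    return {(a, e): p(world) for (a, e), world in frame.items()}


# C0 over W, with rows GH, GW, RH, RW and columns S, B (assumed labelling).
C0: Frame = {
    ("GH", "S"): "w0", ("GH", "B"): "w1",
    ("GW", "S"): "w2", ("GW", "B"): "w3",
    ("RH", "S"): "w4", ("RH", "B"): "w5",
    ("RW", "S"): "w6", ("RW", "B"): "w7",
}

# Assumed coarsening p: W -> V that forgets whether the thought was G or R.
P_TABLE = {
    "w0": "w8", "w1": "w9", "w2": "w10", "w3": "w11",
    "w4": "w8", "w5": "w9", "w6": "w10", "w7": "w11",
}

# Only the matrix entries change; the row and column index sets stay the same.
C1 = pushforward(P_TABLE.__getitem__, C0)
```

Under these assumptions `C1` is the frame over V shown above: the rows GH and RH become identical, as do GW and RW.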
Claim: p∘ is well-defined.
Proof: We need to show that p∘ actually sends objects and morphisms of Chu(W) to objects and morphisms of Chu(V), and that it preserves identity morphisms and composition. p∘ clearly sends objects to objects. To see that p∘ sends morphisms to morphisms, observe that if (g,h):(A0,E0,⋅0)→(A1,E1,⋅1), and p∘(Ai,Ei,⋅i)=(Ai,Ei,⋆i), then for all a∈A0 and e∈E1,
g(a)⋆1e=p(g(a)⋅1e)=p(a⋅0h(e))=a⋆0h(e),
so p∘(g,h)=(g,h) is a morphism. It is clear that p∘ preserves identity and composition, since it has no effect on morphisms. □
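The morphism condition is easy to check mechanically for finite frames. The sketch below uses two tiny hypothetical frames X and Y over W={x,y} and a hypothetical p that merges x and y; the names are mine and purely illustrative. It shows a morphism surviving the move to Chu(V), and also a pair of functions that is not a morphism in Chu(W) but becomes one in Chu(V), which is one way morphisms can be added.

```python
from typing import Dict, Tuple

Frame = Dict[Tuple[str, str], str]  # frame[(a, e)] is the world a . e


def is_morphism(g: Dict[str, str], h: Dict[str, str], src: Frame, dst: Frame) -> bool:
    """Check g(a) .1 e == a .0 h(e) for every agent a of src and environment e of dst."""
    src_agents = {a for a, _ in src}
    dst_envs = {e for _, e in dst}
    return all(dst[(g[a], e)] == src[(a, h[e])] for a in src_agents for e in dst_envs)


def pushforward(p: Dict[str, str], frame: Frame) -> Frame:
    """Apply p to every entry of the frame's matrix."""
    return {key: p[world] for key, world in frame.items()}


# Two hypothetical frames over W = {x, y}.
X: Frame = {("a", "e0"): "x", ("a", "e1"): "y"}
Y: Frame = {("b0", "f"): "x", ("b1", "f"): "y"}

h = {"f": "e0"}
g_good = {"a": "b0"}  # b0 . f = x = a . e0, so (g_good, h) is a morphism X -> Y
g_bad = {"a": "b1"}   # b1 . f = y but a . e0 = x, so (g_bad, h) is not

# Hypothetical p: W -> V = {v} that merges x and y.
p = {"x": "v", "y": "v"}

results = {
    "good_before": is_morphism(g_good, h, X, Y),
    "good_after": is_morphism(g_good, h, pushforward(p, X), pushforward(p, Y)),
    "bad_before": is_morphism(g_bad, h, X, Y),
    "bad_after": is_morphism(g_bad, h, pushforward(p, X), pushforward(p, Y)),
}
```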
We also have that p∘ preserves all of our additive operations.
Claim: p∘(C⊕D)=p∘(C)⊕p∘(D), p∘(C&D)=p∘(C)&p∘(D), p∘(C∗)=p∘(C)∗, p∘(0)=0, p∘(⊤)=⊤, and p∘(null)=null.
Proof: Trivial. □
Our new functor's relationship with 1 and ⊥ is more interesting. In particular, we can define 1S and ⊥S from 1 and ⊥ using functors.
Claim: Let S⊆W and let ι:S→W be the inclusion of S in W. Then 1S=ι∘(1) and ⊥S=ι∘(⊥). (Here, the 1 and ⊥ are from Chu(S), not Chu(W).)
Proof: Trivial. □
This gives us a more categorical definition of 1S and ⊥S from 1 and ⊥. We will give a more categorical definition of 1 and ⊥ later, when we talk about multiplicative operations.
p∘ also preserves biextensional equivalence in one direction. (Two equivalent frames in W will always be equivalent in V, but two inequivalent frames in W won't necessarily be inequivalent in V.)
Claim: If C≃D, then p∘(C)≃p∘(D).
Proof: Let C=(A,E,⋅) and let D=(B,F,⋆). Let (g0,h0):C→D and (g1,h1):D→C compose to something homotopic to the identity in both orders. We want to show that (g0,h0):p∘(C)→p∘(D) and (g1,h1):p∘(D)→p∘(C) compose to something homotopic to the identity in both orders. Indeed p(g1(g0(a))⋅e)=p(a⋅e) for all a∈A and e∈E, and p(g0(g1(b))⋆f)=p(b⋆f) for all b∈B and f∈F. □
We also have that p∘ preserves what's ensurable, where we transition from subsets of W to subsets of V in the obvious way.
Claim: Let p:W→V, and let p(S)={v∈V | ∃w∈S,p(w)=v}. If S∈Ensure(C), then p(S)∈Ensure(p∘(C)).
Proof: Trivial from the original definition of ensurables. □
We also get a stronger result when dealing with subsets of W and V that correspond exactly.
Claim: Let p:W→V, and let S⊆W and T⊆V be such that for all w∈W, we have p(w)∈T if and only if w∈S. Then S∈Ensure(C) if and only if T∈Ensure(p∘(C)), and S∈Ctrl(C) if and only if T∈Ctrl(p∘(C)).
Proof: Trivial from the original definitions of ensurables and controllables. □
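To make the two claims about ensurables and controllables concrete, here is a small check on the frame C0 from the start of the post. It is a sketch under assumptions: I take Ensure(C) to contain S when some row lies entirely inside S, Prevent(C) to contain S when some row lies entirely outside S, and Ctrl(C) to be their intersection, as I recall them from earlier in the sequence. The row labels of C0 and the table `P` for the coarsening are my own reading, as is the choice of S and T.

```python
from typing import Dict, Set, Tuple

Frame = Dict[Tuple[str, str], str]  # frame[(a, e)] is the world a . e


def rows(frame: Frame) -> Dict[str, Set[str]]:
    """Map each agent to the set of worlds appearing in its row."""
    out: Dict[str, Set[str]] = {}
    for (a, _), world in frame.items():
        out.setdefault(a, set()).add(world)
    return out


def ensures(frame: Frame, S: Set[str]) -> bool:
    return any(row <= S for row in rows(frame).values())


def prevents(frame: Frame, S: Set[str]) -> bool:
    return any(row.isdisjoint(S) for row in rows(frame).values())


def controls(frame: Frame, S: Set[str]) -> bool:
    return ensures(frame, S) and prevents(frame, S)


C0: Frame = {
    ("GH", "S"): "w0", ("GH", "B"): "w1",
    ("GW", "S"): "w2", ("GW", "B"): "w3",
    ("RH", "S"): "w4", ("RH", "B"): "w5",
    ("RW", "S"): "w6", ("RW", "B"): "w7",
}
# Assumed coarsening p: W -> V.
P = {"w0": "w8", "w1": "w9", "w2": "w10", "w3": "w11",
     "w4": "w8", "w5": "w9", "w6": "w10", "w7": "w11"}
C1: Frame = {key: P[world] for key, world in C0.items()}

# S and T correspond exactly: p(w) is in T if and only if w is in S.
S = {"w0", "w1", "w4", "w5"}
T = {"w8", "w9"}
corresponds = all((P[w] in T) == (w in S) for w in P)

# The weaker claim: the image of an ensurable set is ensurable.
S_small = {"w0", "w1"}
image = {P[w] for w in S_small}
```

With these definitions, `controls(C0, S)` and `controls(C1, T)` agree, and `ensures(C0, S_small)` carries over to `ensures(C1, image)`.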
The relationship between observability and functors is quite interesting. We will devote the next section to discussing this relationship and its philosophical consequences.
2. What's Observable is Relative to a Coarse World Model
Since observability is not closed under supersets, we can only really hope to get a result for observables in the stronger case where S⊆W and T⊆V correspond exactly; but interestingly, even then, the preservation result for observables is only one-directional.
Claim: Let p:W→V and let S⊆W and T⊆V be such that for all w∈W, we have p(w)∈T if and only if w∈S. Then if S∈Obs(C), then T∈Obs(p∘(C)).
Proof: If C≃C0&C1, with Image(C0)⊆S and Image(C1)⊆W∖S, then p∘(C)≃p∘(C0)&p∘(C1), and Image(p∘(C0))=p(Image(C0))⊆p(S)⊆T, while Image(p∘(C1))=p(Image(C1))⊆p(W∖S)⊆V∖T. □
The most interesting thing here is that the converse is not also true. There are examples where T∈Obs(p∘(C)), even though S∉Obs(C).
When p is surjective, we think of V as a coarse world model that forgets some details from W. Sometimes, an agent may be able to observe S relative to a coarse description of the world, but not in the more refined description, even in cases where S is definable in both the coarse and refined descriptions.
2.1. Example
Let us look at an example. In this example, the agent is an AI that will be given a number and asked whether it is prime or not. There are two possible environments E={Prime,Nonprime}.
The agent A has six strategies:
It can compute whether the number is prime and answer correctly (AccurateHot), or it can compute whether the number is prime and answer incorrectly (InaccurateHot). In both of these cases, it has to use a lot of energy and will become hotter.
It can also just not think and say that the number is prime (PrimeCool), or it can not think and say that the number is not prime (NonprimeCool).
Finally, it can deliberately waste a lot of energy for no reason and say the number is prime or waste a lot of energy and say the number is not prime (PrimeHot and NonprimeHot respectively).
Finally, W={PAH,PAC,PIH,PIC,NAH,NAC,NIH,NIC}, where the first letter indicates whether the AI was given a prime or nonprime number, the second letter indicates whether the AI's answer was accurate or inaccurate, and the third letter indicates whether the AI is hot. The Cartesian frame, C, looks like this.
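$$
C = \begin{array}{c|cc}
 & \text{Prime} & \text{Nonprime} \\ \hline
\text{AccurateHot} & PAH & NAH \\
\text{InaccurateHot} & PIH & NIH \\
\text{PrimeCool} & PAC & NIC \\
\text{NonprimeCool} & PIC & NAC \\
\text{PrimeHot} & PAH & NIH \\
\text{NonprimeHot} & PIH & NAH
\end{array}
$$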
We will let V be the coarse description of the world in which we only pay attention to the input/output behavior of the AI and ignore whether or not it becomes hot. V={PA,PI,NA,NI}, and we will let p:W→V be the function that deletes the third letter. This gives us the following for p∘(C).
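$$
p^\circ(C) = \begin{array}{c|cc}
 & \text{Prime} & \text{Nonprime} \\ \hline
\text{AccurateHot} & PA & NA \\
\text{InaccurateHot} & PI & NI \\
\text{PrimeCool} & PA & NI \\
\text{NonprimeCool} & PI & NA \\
\text{PrimeHot} & PA & NI \\
\text{NonprimeHot} & PI & NA
\end{array}
\simeq
\begin{array}{c|cc}
 & \text{Prime} & \text{Nonprime} \\ \hline
\text{Accurate} & PA & NA \\
\text{Inaccurate} & PI & NI \\
\text{Prime} & PA & NI \\
\text{Nonprime} & PI & NA
\end{array}
$$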
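The equivalence on the right can be checked by hand, but for a finite frame it is also just a matter of dropping duplicate rows (and columns, though none repeat here). The following sketch stores each row of C as a pair of worlds ordered as (Prime, Nonprime); the encoding and the helper names `drop_heat` and `distinct_rows` are my own.

```python
from typing import Dict, List, Tuple

# One row per agent strategy; entries are ordered as (Prime, Nonprime).
C: Dict[str, Tuple[str, str]] = {
    "AccurateHot": ("PAH", "NAH"),
    "InaccurateHot": ("PIH", "NIH"),
    "PrimeCool": ("PAC", "NIC"),
    "NonprimeCool": ("PIC", "NAC"),
    "PrimeHot": ("PAH", "NIH"),
    "NonprimeHot": ("PIH", "NAH"),
}


def drop_heat(world: str) -> str:
    """The function p: W -> V that deletes the third letter."""
    return world[:2]


# Apply p entrywise to get the frame over V.
pC: Dict[str, Tuple[str, str]] = {
    agent: (drop_heat(row[0]), drop_heat(row[1])) for agent, row in C.items()
}


def distinct_rows(frame: Dict[str, Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Rows of the frame with duplicates removed, in first-seen order."""
    seen: List[Tuple[str, str]] = []
    for row in frame.values():
        if row not in seen:
            seen.append(row)
    return seen


rows_C = distinct_rows(C)
rows_pC = distinct_rows(pC)
```

All six rows of C are distinct, while p∘(C) has only four distinct rows: once heat is ignored, PrimeHot and PrimeCool become the same strategy, and likewise NonprimeHot and NonprimeCool.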
The important thing to notice here is that {PA,PI}∈Obs(p∘(C))—when we ignore heat, the agent can base conditional strategies on whether the number is prime—but {PAH,PAC,PIH,PIC}∉Obs(C).
In particular, p∘(C)≃C0&C1, where
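$$
C_0 = \begin{array}{c|c}
 & \text{Prime} \\ \hline
\text{Accurate} & PA \\
\text{Inaccurate} & PI
\end{array}
\quad\text{and}\quad
C_1 = \begin{array}{c|c}
 & \text{Nonprime} \\ \hline
\text{Accurate} & NA \\
\text{Inaccurate} & NI
\end{array},
$$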
while it is easy to see that {PAH,PAC,PIH,PIC}∉Obs(C), because there is no a∈if({PAH,PAC,PIH,PIC},PrimeCool,NonprimeCool).
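Both halves of this example can be verified by brute force. The sketch below assumes the definition of observables from earlier in the sequence as I recall it: S∈Obs(C) when for every pair a0, a1∈A some a∈A lies in if(S,a0,a1), meaning that for every e, a⋅e=a0⋅e whenever a⋅e∈S and a⋅e=a1⋅e otherwise. The row encoding and function names are mine.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

Rows = Dict[str, Tuple[str, str]]  # agent -> (world under Prime, world under Nonprime)

C: Rows = {
    "AccurateHot": ("PAH", "NAH"),
    "InaccurateHot": ("PIH", "NIH"),
    "PrimeCool": ("PAC", "NIC"),
    "NonprimeCool": ("PIC", "NAC"),
    "PrimeHot": ("PAH", "NIH"),
    "NonprimeHot": ("PIH", "NAH"),
}
# p deletes the third letter of each world.
pC: Rows = {agent: (row[0][:2], row[1][:2]) for agent, row in C.items()}


def in_if(frame: Rows, S: Set[str], a: str, a0: str, a1: str) -> bool:
    """True when a agrees with a0 wherever a lands in S, and with a1 elsewhere."""
    for i in range(2):
        target = frame[a0][i] if frame[a][i] in S else frame[a1][i]
        if frame[a][i] != target:
            return False
    return True


def failing_pairs(frame: Rows, S: Set[str]) -> List[Tuple[str, str]]:
    """Pairs (a0, a1) for which no agent implements the conditional policy."""
    return [
        (a0, a1)
        for a0, a1 in product(frame, repeat=2)
        if not any(in_if(frame, S, a, a0, a1) for a in frame)
    ]


def observable(frame: Rows, S: Set[str]) -> bool:
    return not failing_pairs(frame, S)


S = {"PAH", "PAC", "PIH", "PIC"}  # "the number is prime", in the refined model
T = {"PA", "PI"}                  # the same event in the coarse model

coarse_ok = observable(pC, T)
refined_ok = observable(C, S)
witnesses = failing_pairs(C, S)
```

The pair (PrimeCool, NonprimeCool) shows up among the failing pairs for C: a conditional policy would need the row (PAC, NAC), and every strategy that answers accurately on both inputs runs hot.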
2.2. Discussion
The above example illustrates something interesting about observables. It shows that what's observable is not only a function of the observing agent and the thing that is observed. It is also a function of the level of description of the world!
This makes sense because we are thinking of observation as the ability to implement conditional policies. To implement a conditional policy is to be indistinguishable from the constant policy a0 in worlds in S and indistinguishable from the constant policy a1 in worlds outside of S. This indistinguishability makes observables relative to the level of description of the world.
There is something internal to the agent that is different between the world where it implements a conditional policy and the world where it implements a constant policy. However, when we talk of S being an observable for the agent, we are working relative to a level of description that does not track that internal difference.
3. Functors from Cartesian Frames
When p:W→V is surjective, p∘ will send Cartesian frames over the more refined W to Cartesian frames over the less refined V. What if we want to go in the other direction?
While there is a single function p from the more refined world to the less refined world, there are many functions in the other direction. Luckily, we have an object that lets us deal with many functions at once.
Definition: Let C=(V,E,⋅) be a Cartesian frame over W, with Agent(C)=V. Then C∘:Chu(V)→Chu(W) is the functor that sends (B,F,⋆) to (B,F×E,⋄), where b⋄(f,e)=(b⋆f)⋅e, and sends the morphism (g,h) to (g,h′), where h′(f,e)=(h(f),e).
(Notice how this definition looks a bit like currying.)
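Here is a minimal sketch of this functor on finite frames, again storing a frame as a table from (agent, environment) pairs to worlds. Everything concrete in it is hypothetical: the frame C below takes V={PA,PI,NA,NI} as its agent and lets its environment decide whether the world is hot or cool, and D is a small frame over V that I made up to have something to lift.

```python
from typing import Dict, Tuple

# Hypothetical C = (V, E, .) over W: the agent is a world of V, the environment
# picks the heat letter, and v . e appends that letter.
V = ["PA", "PI", "NA", "NI"]
HEAT = ["H", "C"]
C: Dict[Tuple[str, str], str] = {(v, e): v + e for v in V for e in HEAT}

# Hypothetical frame D = (B, F, *) over V.
D: Dict[Tuple[str, str], str] = {
    ("Accurate", "Prime"): "PA", ("Accurate", "Nonprime"): "NA",
    ("Inaccurate", "Prime"): "PI", ("Inaccurate", "Nonprime"): "NI",
}


def lift_frame(C, frame):
    """Send (B, F, *) to (B, F x E, <>), where b <> (f, e) = (b * f) . e."""
    c_envs = sorted({e for _, e in C})
    return {(b, (f, e)): C[(v, e)] for (b, f), v in frame.items() for e in c_envs}


def lift_env_map(C, h):
    """Send h to h', where h'(f, e) = (h(f), e); the agent map g is left unchanged."""
    c_envs = sorted({e for _, e in C})
    return {(f, e): (h[f], e) for f in h for e in c_envs}


lifted = lift_frame(C, D)
h_prime = lift_env_map(C, {"Prime": "Prime", "Nonprime": "Nonprime"})
```

The agent of the lifted frame is still B; only the environment grows, from F to F×E. That asymmetry is where the resemblance to currying comes from: each e∈E picks out one function V→W, and the environment of the lifted frame gets to choose which one is used.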
Claim: C∘ is well-defined.
Proof: We need to show that C∘ actually sends objects and morphisms of Chu(V) to objects and morphisms of Chu(W), and that it preserves identity morphisms and composition.
C∘ clearly sends objects to objects. To see that it sends morphisms to morphisms, let (g,h):(B0,F0,⋆0)→(B1,F1,⋆1) be a morphism in Chu(V), let (Bi,Fi×E,⋄i)=C∘(Bi,Fi,⋆i), and let (g,h′)=C∘(g,h).
We want to show that (g,h′):(B0,F0×E,⋄0)→(B1,F1×E,⋄1) is a morphism, which is true because
g(b)⋄1(f,e)=(g(b)⋆1f)⋅e=(b⋆0h(f))⋅e=b⋄0(h(f),e)=b⋄0h′(f,e)
for all b∈B0 and (f,e)∈F1×E. C∘ clearly preserves identity morphisms and composition. □
The coarse-to-refined functor C∘ preserves &, ⊤, and null, but not ⊕, 0, or −∗, which makes sense, since C∘ violates the symmetry between agent and environment.
Claim: C∘(⊤)=⊤, and