# Ω 21

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Here, we introduce and discuss the concept of a subagent in the Cartesian Frames paradigm.

Note that in this post, as in much of the sequence, we are generally working up to biextensional equivalence. In the discussion, when we informally say that a frame has some property or is some object, what we'll generally mean is that this is true of its biextensional equivalence class.

## 1. Definitions of Subagent

### 1.1. Categorical Definition

Definition: Let $C$ and $D$ be Cartesian frames over $W$. We say that $C$'s agent is a subagent of $D$'s agent, written $C \triangleleft D$, if for every morphism $\phi : C \to \bot$ there exists a pair of morphisms $\phi_0 : C \to D$ and $\phi_1 : D \to \bot$ such that $\phi = \phi_1 \circ \phi_0$.

Colloquially, we say that every morphism from $C$ to $\bot$ factors through $D$. As a shorthand for "$C$'s agent is a subagent of $D$'s agent," we will just say "$C$ is a subagent of $D$."

At a glance, it probably isn't clear what this definition has to do with subagents. We'll first talk philosophically about what we mean by "subagent", and then give an alternate definition that will make the connection more clear.

When I say "subagent," I am actually generalizing over two different relationships that may not immediately seem like they belong together.

First, there is the relationship between the component and the whole. One football player is a subagent of the entire football team.

Second, there is the relationship between an agent before and after making a precommitment or a choice. When I precommit not to take a certain action, I am effectively replacing myself with a weaker agent that has fewer options. The new agent with the commitment is a subagent of the original agent.

These are the two notions I am trying to capture with the word "subagent". I am making the philosophical claim that we should think of them primarily as one concept, and am partially backing up this claim by pointing to the simplicity of the above definition. In a future post, we will discuss the formal differences between these two kinds of subagent, but I think it is best to view them as two special cases of the one simple concept.

(My early drafts of the "Embedded Agency" sequence used the word "subagent" in the title for both the Subsystem Alignment and Robust Delegation sections.)

### 1.2. Currying Definition

Definition: Let $C$ and $D$ be Cartesian frames over $W$. We say that $C \triangleleft D$ if there exists a Cartesian frame $M$ over $\text{Agent}(D)$ such that $C \simeq D^\circ(M)$.

Assume for this discussion that we only care about frames up to biextensional equivalence. In effect, the above definition is saying that "$C$ is a subagent of $D$" means "$C$'s agent is playing a game, $M$, where the stakes are to help decide what $D$'s agent does." (This game may or may not have multiple players, and may or may not fully cover all the options of $D$'s agent.)

Letting $C = (A, E, \cdot)$ and $D = (B, F, \star)$, it turns out (as we will see later) that we can explicitly construct $M = (A, X, \diamond)$, where $X$ is the set of all morphisms from $C$ to $D$, and $\diamond$ is given by $a \diamond (g, h) = g(a)$.

We will later prove the categorical and currying definitions equivalent, but let's first interpret this definition using examples.

$M$ is a Cartesian frame whose agent is the agent of $C$ and whose world is the agent of $D$. This seems like the kind of thing we would have when $C$ is a subagent of $D$.

Thinking about the football example: We have the football player $A$ as the agent in a Cartesian frame $C$ over the world $W$. We also have the football team $B$ as the agent in a Cartesian frame $D$ over the same world $W$.

$M$ is a Cartesian frame over the football team, and the agent of this frame is again the football player $A$. $X$, the environment of $M$, represents the rest of the football team: the player's effect on the team as a whole (here treated as the player's world) is a function of what the player chooses and what the rest of the team chooses. We can think of $M$ as representing a "zoomed-in" picture of $A$ interacting with its local environment (the team), while $C$ represents a "zoomed-out" picture of $A$ interacting with its teammates and the larger world (rival teams, referees, etc.).

$D^\circ(M) = (A, X \times F, \bullet)$, so $C$ is equivalent to $(A, X \times F, \bullet)$, which is saying that the environment for the football player in its original frame ($E$) is equivalent to the Cartesian product of the rest of the team $X$ with the team's environment $F$.

Thinking about the precommitment example: $C$ has made a precommitment, so there is an inclusion morphism $\iota : C \to D$, which shows that $C$'s agent's options are a subset of $D$'s agent's options. $X$ is just $\{\iota\}$, so $X$ is a singleton. $D^\circ(M) = (A, \{\iota\} \times F, \bullet)$, so $C$ is equivalent to $(A, \{\iota\} \times F, \bullet)$, so here $A$ is a subset of $B$ and $E$ is equivalent to $F$.

Although the word "precommitment" suggests a specific (temporal, deliberative) interpretation, formally, precommitment just looks like deleting rows from a matrix (up to biextensional equivalence), which can represent a variety of other situations.

A Cartesian frame $M$ over $B$ is like a nondeterministic function from $A$ to $B$, where $X$ represents the nondeterministic bits. When changing our frame from $D$ to $C$, we are identifying with $A$ and externalizing the nondeterministic bits $X$ into the environment.
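As a rough illustration of the currying construction, here is a minimal Python sketch that builds $D^\circ(M)$ for finite frames, using $a \bullet (x, f) = (a \diamond x) \star f$. The representation of a frame as a triple (agent list, environment list, evaluation dict), the function name `curry_compose`, and the toy football data are all assumptions made for this example and do not come from the post.

```python
def curry_compose(D, M):
    """Build D°(M) = (A, X x F, •), where a • (x, f) = (a ⋄ x) ⋆ f.

    D = (B, F, ev_d) is a frame over worlds; M = (A, X, ev_m) is a frame
    over B. Each evaluation map is a dict keyed by (agent option, environment).
    """
    B, F, ev_d = D
    A, X, ev_m = M
    env = [(x, f) for x in X for f in F]
    ev = {(a, (x, f)): ev_d[(ev_m[(a, x)], f)] for a in A for (x, f) in env}
    return list(A), env, ev


# Hypothetical team frame D: the team attacks or defends against a rival.
team = (
    ["attack", "defend"],
    ["strong_rival", "weak_rival"],
    {
        ("attack", "strong_rival"): "lose",
        ("attack", "weak_rival"): "win",
        ("defend", "strong_rival"): "draw",
        ("defend", "weak_rival"): "draw",
    },
)

# Hypothetical game M over the team's options: the player pushes or holds,
# and the rest of the team either follows the player or ignores them.
player_game = (
    ["push", "hold"],
    ["follow", "ignore"],
    {
        ("push", "follow"): "attack",
        ("push", "ignore"): "defend",
        ("hold", "follow"): "defend",
        ("hold", "ignore"): "defend",
    },
)

agent, env, ev = curry_compose(team, player_game)
print(len(env))                                # 4 = |X| * |F|
print(ev[("push", ("follow", "weak_rival"))])  # win
```

The resulting frame has the player as its agent and pairs (rest of team, team's environment) as its environment, which is the "zoomed-out" picture described above.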

### 1.3. Covering Definition

The categorical definition is optimized for elegance, while the currying definition is optimized to be easy to understand in terms of agency. We have a third definition, the covering definition, which is optimized for ease of use.

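Definition: Let $C = (A, E, \cdot)$ and $D = (B, F, \star)$ be Cartesian frames over $W$. We say that $C \triangleleft D$ if for all $e \in E$, there exists an $f \in F$ and a $(g, h) : C \to D$ such that $e = h(f)$.

We call this the covering definition because the morphisms from $C$ to $D$ cover the set $E$.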
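For finite frames, the covering definition can be checked by brute force. The sketch below assumes a frame is a triple (agent list, environment list, evaluation dict) and that a morphism from $(A, E, \cdot)$ to $(B, F, \star)$ is a pair $g : A \to B$, $h : F \to E$ with $g(a) \star f = a \cdot h(f)$. The helper names and the example frames are hypothetical, chosen only to illustrate the precommitment case.

```python
from itertools import product


def functions(domain, codomain):
    """Yield every function domain -> codomain as a dict."""
    domain = list(domain)
    for values in product(list(codomain), repeat=len(domain)):
        yield dict(zip(domain, values))


def morphisms(C, D):
    """Yield all morphisms (g, h): C -> D between finite frames."""
    A, E, ev_c = C
    B, F, ev_d = D
    for g in functions(A, B):
        for h in functions(F, E):
            if all(ev_d[(g[a], f)] == ev_c[(a, h[f])] for a in A for f in F):
                yield g, h


def is_subagent(C, D):
    """Covering definition: every e in Env(C) is h(f) for some morphism."""
    _, E, _ = C
    covered = set()
    for _, h in morphisms(C, D):
        covered.update(h.values())
    return set(E) <= covered


# Hypothetical example: `committed` is `full` with the row a1 deleted.
full = (
    ["a0", "a1"],
    ["e0", "e1"],
    {("a0", "e0"): "w0", ("a0", "e1"): "w1",
     ("a1", "e0"): "w2", ("a1", "e1"): "w3"},
)
committed = (
    ["a0"],
    ["e0", "e1"],
    {("a0", "e0"): "w0", ("a0", "e1"): "w1"},
)
top = (["a"], [], {})  # one agent option, empty environment

print(is_subagent(committed, full))  # True
print(is_subagent(full, committed))  # False
print(is_subagent(top, full))        # True (vacuously)
```

The search is exponential in the sizes of the frames, so this is only meant for checking small examples by hand.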

## 2. Equivalence of Definitions

### 2.1. Equivalence of Categorical and Covering Definitions

The equivalence of the categorical and covering definitions follows directly from the fact that the morphisms from $C$ to $\bot$ are exactly the elements of $\text{Env}(C)$.

Claim: The categorical and covering definitions of subagent are equivalent.

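Proof: Let $C = (A, E, \cdot)$ and let $D = (B, F, \star)$, and write $i$ for the unique environment of $\bot$, so that each world $w$, viewed as an agent option of $\bot$, evaluates to $w$ at $i$. First, observe that the morphisms from $C$ to $\bot$ correspond exactly to the elements of $E$. For each $e \in E$, it is easy to see that $(g_e, h_e) : C \to \bot$, given by $h_e(i) = e$ and $g_e(a) = a \cdot e$, is a morphism, and every morphism $(g, h) : C \to \bot$ is uniquely determined by $h(i)$, so there are no other morphisms. Let $\phi_e$ denote the morphism with $h(i) = e$.

Similarly, the morphisms from $D$ to $\bot$ correspond to the elements of $F$. Let $\psi_f$ denote the morphism corresponding to $f \in F$.

Thus, the categorical definition can be rewritten to say that for every morphism $\phi_e : C \to \bot$, there exist morphisms $(g, h) : C \to D$ and $\psi_f : D \to \bot$, such that $\phi_e = \psi_f \circ (g, h)$. However, $\psi_f \circ (g, h)$ sends $i$ to $h(f)$, and so equals $\phi_e$ if and only if $e = h(f)$. Thus the categorical definition is equivalent to the covering definition.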

### 2.2. Equivalence of Covering and Currying Definitions

Claim: The covering definition of subagent implies the currying definition of subagent.

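Proof: Let $C = (A, E, \cdot)$ and $D = (B, F, \star)$ be Cartesian frames over $W$. Assume that $C \triangleleft D$ according to the covering definition.

Let $X$ be the set of all morphisms from $C$ to $D$, and let $M = (A, X, \diamond)$ be a Cartesian frame over $B$, with $\diamond$ given by $a \diamond (g, h) = g(a)$. We have that $D^\circ(M) = (A, X \times F, \bullet)$, with

$$a \bullet ((g, h), f) = (a \diamond (g, h)) \star f = g(a) \star f$$

for all $a \in A$, $(g, h) \in X$, and $f \in F$.

To show that $C \simeq D^\circ(M)$, we need to construct morphisms $(g_0, h_0) : C \to D^\circ(M)$ and $(g_1, h_1) : D^\circ(M) \to C$ which compose to something homotopic to the identity in both orders.

We will let $g_0$ and $g_1$ be the identity on $A$, and we let $h_0 : X \times F \to E$ be given by $h_0((g, h), f) = h(f)$. Finally, we let $h_1 : E \to X \times F$ be given by $h_1(e) = ((g, h), f)$ for some $(g, h)$ and $f$ such that $h(f) = e$. We can always choose such a $(g, h)$ and $f$ by the covering definition of subagent.

We have that $(g_0, h_0)$ is a morphism, since

$$g_0(a) \bullet ((g, h), f) = a \bullet ((g, h), f) = g(a) \star f = a \cdot h(f) = a \cdot h_0((g, h), f).$$

Similarly, we have that $(g_1, h_1)$ is a morphism since $h_1(e) = ((g, h), f)$, where $h(f) = e$, so

$$g_1(a) \bullet h_1(e) = a \bullet ((g, h), f) = g(a) \star f = a \cdot h(f) = a \cdot e.$$

It is clear that $(g_0, h_0)$ and $(g_1, h_1)$ compose to something homotopic to the identity in both orders, since $g_0$ and $g_1$ are the identity on $A$. Thus, $C \simeq D^\circ(M)$.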

Claim: The currying definition of subagent implies the covering definition of subagent.

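Proof: Let $C = (A, E, \cdot)$ and $D = (B, F, \star)$ be Cartesian frames over $W$. Let $M = (Y, X, \diamond)$ be a Cartesian frame over $B$, and let $C \simeq D^\circ(M)$. Our goal is to show that for every $e \in E$, there exists a $(g, h) : C \to D$ and $f \in F$ such that $h(f) = e$. We will start with the special case where $C = D^\circ(M)$.

We have that $D^\circ(M) = (Y, X \times F, \bullet)$, where $y \bullet (x, f) = (y \diamond x) \star f$. First, note that for every $x \in X$, there exists a morphism $(g_x, h_x) : C \to D$ given by $g_x(y) = y \diamond x$, and $h_x(f) = (x, f)$. To see that this is a morphism, observe that

$$g_x(y) \star f = (y \diamond x) \star f = y \bullet (x, f) = y \bullet h_x(f)$$

for all $y \in Y$ and $f \in F$.

To show that $C \triangleleft D$ according to the covering definition, we need that for all $(x, f) \in X \times F$, there exists an $f' \in F$ and a $(g, h) : C \to D$ such that $h(f') = (x, f)$. Indeed we can take $f' = f$ and $(g, h) = (g_x, h_x)$.

Now, we move to the case where $C \simeq D^\circ(M)$, but $C \neq D^\circ(M)$. It suffices to show that under the covering definition of subagent, if $C \triangleleft D$, and $C' \simeq C$, then $C' \triangleleft D$.

Let $C' = (A', E', \cdot')$, and let $(g_0, h_0) : C \to C'$ and $(g_1, h_1) : C' \to C$ compose to something homotopic to the identity in both orders. Assume that $C \triangleleft D$. To show that $C' \triangleleft D$, let the possible environment $e' \in E'$ be arbitrary.

$C \triangleleft D$, so there exists an $f \in F$ and $(g, h) : C \to D$ such that $h_0(e') = h(f)$. Consider the morphism $(g', h') : C' \to D$, where $g' = g \circ g_1$, and $h'(f) = e'$ and $h'(f') = h_1(h(f'))$ on all $f' \neq f$. To see that this is a morphism, observe that for all $f' \neq f$, we have

$$g'(a') \star f' = g(g_1(a')) \star f' = g_1(a') \cdot h(f') = a' \cdot' h_1(h(f')) = a' \cdot' h'(f'),$$

while for $f$, we have

$$g'(a') \star f = g(g_1(a')) \star f = g_1(a') \cdot h(f) = g_1(a') \cdot h_0(e') = a' \cdot' h_1(h_0(e')) = a' \cdot' e' = a' \cdot' h'(f),$$

where the second-to-last equality uses that $(g_0, h_0) \circ (g_1, h_1)$ is homotopic to the identity on $C'$.

Now, notice that for our arbitrary $e'$, the morphism $(g', h')$ and $f$ satisfy $h'(f) = e'$, so $C' \triangleleft D$ according to the covering definition.

Thus, whenever $C \simeq D^\circ(M)$, we have $C \triangleleft D$ according to the covering definition, so the currying definition implies the covering definition of subagent.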

## 3. Mutual Subagents

The subagent relation is both transitive and reflexive. Surprisingly, this relation is not anti-symmetric, even up to biextensional equivalence.

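Claim: $\triangleleft$ is reflexive. Further, if $C \simeq D$, then $C \triangleleft D$.

Proof: Let $C = (A, E, \cdot)$ and $D = (B, F, \star)$ be Cartesian frames over $W$, with $C \simeq D$. Consider the Cartesian frame $M$ over $B$ given by $M = (B, \{x\}, \diamond)$, where $b \diamond x = b$. Observe that $D \cong D^\circ(M)$. Thus $C \simeq D^\circ(M)$, so $C \triangleleft D$, according to the currying definition.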

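Claim: $\triangleleft$ is transitive.

Proof: We will use the categorical definition. Let $C_0 \triangleleft C_1$ and $C_1 \triangleleft C_2$. Given a morphism $\phi_0 : C_0 \to \bot$, since $C_0 \triangleleft C_1$, we know that $\phi_0 = \phi_1 \circ \phi_2$ with $\phi_2 : C_0 \to C_1$ and $\phi_1 : C_1 \to \bot$. Further, since $C_1 \triangleleft C_2$, we know that $\phi_1 = \phi_3 \circ \phi_4$ with $\phi_4 : C_1 \to C_2$ and $\phi_3 : C_2 \to \bot$. Thus,

$$\phi_0 = \phi_3 \circ (\phi_4 \circ \phi_2),$$

with $\phi_4 \circ \phi_2 : C_0 \to C_2$ and $\phi_3 : C_2 \to \bot$, so $C_0 \triangleleft C_2$.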

As a corollary, we have that subagents are well-defined up to biextensional equivalence.

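Corollary: If $C_0 \simeq C_1$, $D_0 \simeq D_1$, and $C_0 \triangleleft D_0$, then $C_1 \triangleleft D_1$.

Proof: $C_1 \triangleleft C_0 \triangleleft D_0 \triangleleft D_1$.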

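Sometimes, there are Cartesian frames $C \not\simeq D$ with $C \triangleleft D$ and $D \triangleleft C$. We can use this fact to define a third equivalence relation on Cartesian frames over $W$, weaker than both $\cong$ and $\simeq$.

Definition: For Cartesian frames $C$ and $D$ over $W$, we say $C \triangleleft\triangleright D$ if $C \triangleleft D$ and $D \triangleleft C$.

Claim: $\triangleleft\triangleright$ is an equivalence relation.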

Proof: Reflexivity and transitivity follow from reflexivity and transitivity of $\triangleleft$. Symmetry is trivial.

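This equivalence relation is less natural than $\cong$ and $\simeq$, and is not as important. We discuss it mainly to emphasize that two frames can be mutual subagents without being biextensionally equivalent.

Claim: $\triangleleft\triangleright$ is strictly weaker than $\simeq$, which is strictly weaker than $\cong$.

Proof: We already know that $\simeq$ is weaker than $\cong$. To see that $\triangleleft\triangleright$ is weaker than $\simeq$, observe that if $C \simeq D$, then $C \triangleleft D$ and $D \triangleleft C$, so $C \triangleleft\triangleright D$.

To see that $\simeq$ is strictly weaker than $\cong$, observe that $\top \simeq \top \oplus \top$ (both have empty environment and nonempty agent), but $\top \not\cong \top \oplus \top$ (the agents have different size).

To see that $\triangleleft\triangleright$ is strictly weaker than $\simeq$, observe that $\top \triangleleft\triangleright \text{null}$ (vacuous by covering definition), but $\top \not\simeq \text{null}$ (there are no morphisms from $\top$ to $\text{null}$).

I do not have a simple description of exactly when $C \triangleleft\triangleright D$, but there are more cases than just the trivial ones like $C \simeq D$ and vacuous cases like $\top \triangleleft\triangleright \text{null}$. As a quick example: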

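$$\begin{pmatrix} \text{cake} & \text{cake} \\ \text{pie} & \text{pie} \\ \text{cake} & \text{pie} \end{pmatrix} \triangleleft\triangleright \begin{pmatrix} \text{cake} \\ \text{pie} \end{pmatrix}.$$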

To visualize this, imagine an agent that is given the choice between cake and pie. This agent can be viewed as a team consisting of two subagents, Alice and Bob, with Alice as the leader.

Alice has three choices. She can choose cake, she can choose pie, or she can delegate the decision to Bob. We represent this with a matrix where Bob is in Alice's environment, and the third row represents Alice letting the environment make the call:

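$$\begin{pmatrix} \text{cake} & \text{cake} \\ \text{pie} & \text{pie} \\ \text{cake} & \text{pie} \end{pmatrix}.$$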

If we instead treat Alice-and-Bob as a single superagent, then their interaction across the agent-environment boundary becomes agent-internal deliberation, and their functional relationship to possible worlds just becomes a matter of "What does the group decide?". Thus, Alice is a subagent of the Alice-and-Bob team:

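$$\begin{pmatrix} \text{cake} & \text{cake} \\ \text{pie} & \text{pie} \\ \text{cake} & \text{pie} \end{pmatrix} \triangleleft \begin{pmatrix} \text{cake} \\ \text{pie} \end{pmatrix}.$$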

However, Alice also has the ability to commit to not delegating to Bob. This produces a future version of Alice that doesn't choose the third row. This new agent is a precommitment-style subagent of the original Alice, but using biextensional collapse, we can also see that this new agent is equivalent to the smaller matrix. Thus:

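$$\begin{pmatrix} \text{cake} \\ \text{pie} \end{pmatrix} \triangleleft \begin{pmatrix} \text{cake} & \text{cake} \\ \text{pie} & \text{pie} \\ \text{cake} & \text{pie} \end{pmatrix}.$$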

It is also easy to verify formally that these are mutual subagents using the covering definition of subagent.
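One way to carry out that verification mechanically is sketched below. It assumes frames are written as matrices (lists of rows, with rows as agent options and columns as environments), and it reads the cake-and-pie frames off the prose description above: Alice's three rows against Bob's two possible choices, and the team's two rows against a single trivial environment. The function name `covers` and the variable names are hypothetical.

```python
from itertools import product


def covers(C, D):
    """Covering definition for matrices over the same set of worlds.

    Returns True when every column of C equals h(f) for some column f of D
    and some morphism (g, h): C -> D, where g maps rows of C to rows of D,
    h maps columns of D to columns of C, and D[g(a)][f] == C[a][h(f)].
    """
    rows_c, cols_c = len(C), (len(C[0]) if C else 0)
    rows_d, cols_d = len(D), (len(D[0]) if D else 0)
    hit = set()
    for g in product(range(rows_d), repeat=rows_c):
        for h in product(range(cols_c), repeat=cols_d):
            if all(D[g[a]][f] == C[a][h[f]]
                   for a in range(rows_c) for f in range(cols_d)):
                hit.update(h)
    return hit == set(range(cols_c))


# Rows: choose cake, choose pie, delegate to Bob. Columns: Bob's choice.
alice = [["cake", "cake"], ["pie", "pie"], ["cake", "pie"]]
# The Alice-and-Bob team simply decides between cake and pie.
team = [["cake"], ["pie"]]

print(covers(alice, team))       # True
print(covers(team, alice))       # True
print(covers(team, [["cake"]]))  # False
```

The last line is a sanity check that the relation is not trivially true: an agent that can obtain pie is not a subagent of one that can only ever get cake.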

I'm reminded here of the introduction and deletion of mixed strategies in game theory. The third row of Alice's frame is a mix of the first two rows, so we can think of Bob as being analogous to a random bit that the environment cannot see. I informally conjecture that for finite Cartesian frames, $C \triangleleft\triangleright D$ if and only if you can pass between $C$ and $D$ by doing something akin to deleting and introducing mixed strategies for the agent.

However, this informal conjecture is not true for infinite Cartesian frames:

.

We can see that these frames are mutual subagents by noting that one can transition back and forth by repeatedly committing not to take the top row.

I do not know of any examples of $\triangleleft\triangleright$ that look qualitatively different from those discussed here, but I do not have a good understanding of exactly what the equivalence classes look like.

## 4. Universal Subagents and Superagents

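We can view $\top$ as a universal subagent and $\bot$ as a universal superagent.

Claim: $\top \triangleleft C \triangleleft \bot$ for all Cartesian frames $C$.

Proof: We use the categorical definition. That $\top \triangleleft C$ is vacuous, since there is no morphism from $\top$ to $\bot$. That $C \triangleleft \bot$ is also trivial, since any $\phi : C \to \bot$ is equal to $\text{id}_\bot \circ \phi$.

Since $\text{null} \triangleleft \top$, we also have $\text{null} \triangleleft C$ for all $C$.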

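We also have that $\bot_S$ is a superagent of all Cartesian frames with image in $S$.

Claim: $C \triangleleft \bot_S$ if and only if $\text{Image}(C) \subseteq S$.

Proof: Let $C = (A, E, \cdot)$, and let $\bot_S = (S, \{i\}, \star)$, with $s \star i = s$.

First, assume $\text{Image}(C) \subseteq S$. We will use the covering definition. Given an $e \in E$, let $(g, h) : C \to \bot_S$ be given by $g(a) = a \cdot e$ and $h(i) = e$. We have that $g$ is well-defined because $\text{Image}(C) \subseteq S$, and $(g, h)$ is a morphism because for all $a \in A$,

$$g(a) \star i = g(a) = a \cdot e = a \cdot h(i).$$

Thus, there is a morphism $(g, h)$ and an element $i \in \{i\}$ such that $h(i) = e$ for an arbitrary $e \in E$, so $C \triangleleft \bot_S$.

Conversely, assume $\text{Image}(C) \not\subseteq S$, so let $a \in A$ and $e \in E$ be such that $a \cdot e \notin S$. If we assume for contradiction that $C \triangleleft \bot_S$, then by the covering definition, there must be a morphism $(g, h) : C \to \bot_S$ such that $h(i) = e$. But then we have that

$$a \cdot e = a \cdot h(i) = g(a) \star i = g(a)$$

must be both inside and outside of $S$, a contradiction.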
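As a small worked illustration of this claim (the specific worlds and matrices here are hypothetical, chosen only for this example), take $W = \{w_0, w_1, w_2\}$ and $S = \{w_0, w_1\}$, and consider

$$C_1 = \begin{pmatrix} w_0 & w_1 \\ w_1 & w_1 \end{pmatrix}, \qquad C_2 = \begin{pmatrix} w_0 & w_2 \end{pmatrix},$$

with rows indexed by agent options and columns by environments $e_0, e_1$. Every entry of $C_1$ lies in $S$, so for each column $e_j$ the pair $g(a) = a \cdot e_j$, $h(i) = e_j$ is a morphism into $\bot_S$ that covers $e_j$, giving $C_1 \triangleleft \bot_S$. For $C_2$, covering the column $e_1$ would force $g(a_0) = a_0 \cdot e_1 = w_2 \notin S$, so no such morphism exists and $C_2 \not\triangleleft \bot_S$.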

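Convention: We will usually write $C \triangleleft \bot_S$ instead of $\text{Image}(C) \subseteq S$, as it is shorter.

Corollary: $S \in \text{Obs}(C)$ if and only if $C \simeq C_1 \& C_2$ for some $C_1 \triangleleft \bot_S$ and $C_2 \triangleleft \bot_{W \setminus S}$.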

Proof: This is just rewriting our definition of observables from "Controllables and Observables, Revisited."

In the coming posts, we will introduce multiplicative operations on Cartesian frames, and use these to distinguish between additive and multiplicative subagents and superagents.

## Comments

I am trying to check that I am understanding this correctly by applying it, though probably not in a very meaningful way:

Am I right in reasoning that, for $S \subseteq W$, $1_S \triangleleft C$ iff ((C can ensure S), and (every element of S is a result of a combination of a possible configuration of the environment of C with a possible configuration of the agent for C, such that the agent configuration is one that ensures S regardless of the environment configuration))?

So, if S = {a,b,c,d}, then

would have $1_S \triangleleft C$, but, say

would have $1_S \not\triangleleft C$, because, while S can be ensured, there isn't, for every outcome in S, an option which ensures S and which is compatible with that outcome?

Yep.

There is a single morphism from $1_S$ to $\bot$ for every world in $S$, so $1_S \triangleleft C$ means all of these morphisms factor through $C$.

A morphism from $C$ to $\bot$ is basically a column of $C$ and a morphism from $1_S$ to $C$ is basically a row in $C$, all of whose entries are in $S$, and these compose to the morphism corresponding to the entry where this column meets this row.

Thus $1_S \triangleleft C$ if and only if when you delete all rows not entirely in $S$, the resulting matrix has image $S$.

I think this is equivalent to what you said. I just wrote it out myself because that was the easiest way for me to verify what you said.

Thanks! (The way you phrased the conclusion is also much clearer/cleaner than how I phrased it)

Given that C ◃ C, I kind of wish that the triangle had a line under it, so that I didn't think it might represent a strict relationship.

I am very experienced in category theory but not the Chu construction (or *-autonomous categories in general). There is a widely used notion of subobject of an object $X$ in a category $\mathcal{C}$ as "equivalence class of monomorphisms with codomain $X$". This differs from your definition most conspicuously in the case of $\top$ where there is no morphism from this frame to a typical frame.

If I'm calculating correctly, the standard notion of subobject is strictly stronger than the one you present here (as long as the world $W$ is inhabited, and even in that case I think the construction collapses enough to make it true) since monomorphisms are morphisms which are injective in their agent argument and surjective in their environment argument, and we can extend any morphism to $\bot$ along such a monomorphism.

Now, I notice that you refer to the concepts in this post as subagents rather than subframes, so perhaps you were deliberately avoiding this stronger concept. Intuitively, a subframe in the sense I describe above consists of an agent with a subset of the available options and who may not be able to distinguish between some of the environments present in a larger frame; the "precommitted agent" you mention early on here seems to be a special case of this which is the identity in the environment component. Incidentally, the equivalence relation corresponding to this notion of subobject corresponds to isomorphism in the finite case but is non-trivial for a similar reason to the case you described of infinite frames.

I wonder if you have any thoughts about how these notions compare? It's clear from the discussion that you chose a definition which reflected what you wanted to express, which is always good, but on the other hand the monomorphisms I described will crop up when you consider factorizations of the morphisms in your category more generally. Perhaps they could be useful to you.