Knowledge is Freedom

Scott Garrabrant

[Epistemic Status: Type Error]

In this post, I try to build up an ontology around the following definition of knowledge:

To know something is to have the set of policies available to you closed under conditionals dependent on that thing.

You are an agent $G$, and you are interacting with an environment $e$ in the set $E$ of all possible environments. For each environment $e$, you select an action $a$ from the set $A$ of available actions. You thus implement a policy $p \in A^{E}$. Let $P \subseteq A^{E}$ denote the set of policies that you could implement. (Note that $A^{E}$ is the space of functions from $E$ to $A$.)

If you are confused about the word "could," that is okay; so am I.

A fact $(F, ϕ)$ about the environment can be viewed as a function $ϕ : E \to F$ that partitions the set of environments according to that fact. For example, for the fact "the sky is blue," we can think of $F$ as the set $\{⊤, ⊥\}$ and $ϕ$ as the function that sends worlds with a blue sky to the element $⊤$ and sends worlds without a blue sky to the element $⊥$. One example of a fact is $(E, \mathrm{id})$, which is the full specification of the environment.

A conditional policy can be formed out of other policies. To form a conditional on a fact $(F, ϕ)$, we start with a policy for each element of $F$. We will let $c(f)$ denote the policy associated with $f \in F$, so $c : F \to A^{E}$. Given this fact and this collection of policies, we define the conditional policy $p_{c} : E \to A$ given by $e \mapsto c(ϕ(e))(e)$.

Conditional policies are like if statements in programming. Using the fact "the sky is blue" from above, we can let $k_{r}$ be the policy that pushes a red button regardless of its environment and let $k_{g}$ be a policy that pushes a green button regardless of its environment. If $c(⊤) = k_{r}$ and $c(⊥) = k_{g}$, then $p_{c}$ is the policy that pushes the red button if the sky is blue, and pushes the green button otherwise.
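The if-statement analogy can be made literal. The following is a minimal Python sketch, not part of the formalism itself: the three toy environments, the `"sky"` field, and the helper name `conditional` are all assumptions made for illustration.

```python
# Hypothetical toy model: an environment is a dict with a single "sky" field.
E = [{"sky": "blue"}, {"sky": "grey"}, {"sky": "black"}]

def phi(e):
    """The fact "the sky is blue": maps an environment to True or False."""
    return e["sky"] == "blue"

def k_r(e):
    """Constant policy: push the red button in every environment."""
    return "red"

def k_g(e):
    """Constant policy: push the green button in every environment."""
    return "green"

def conditional(phi, c):
    """Build p_c, the policy e -> c(phi(e))(e)."""
    return lambda e: c[phi(e)](e)

# c assigns one policy to each value the fact can take.
c = {True: k_r, False: k_g}
p_c = conditional(phi, c)

actions = [p_c(e) for e in E]
print(actions)  # ['red', 'green', 'green']
```

Here `p_c` first looks up which case the environment falls into and then defers to the policy chosen for that case, which is exactly the shape of an if statement.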

Now, we are ready to define knowledge. If $P$ is the set of policies you could implement, then you know a fact $(F, ϕ)$ if $P$ is closed under conditional policies dependent on $F$. (That is, whenever $c : F \to P$, we have $p_{c} \in P$.) Basically, we are just saying that your policy is allowed to break into different cases for different ways that the fact could go.
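For a finite toy case, the closure condition can be checked by brute force. This is a sketch under assumptions of my own choosing: four numbered environments, two actions, a parity fact, and policies stored as tuples indexed by environment. The function name `knows` is hypothetical.

```python
from itertools import product

E = [0, 1, 2, 3]          # four hypothetical environments
A = ["red", "green"]      # two actions
F = [True, False]         # values of the toy fact

def phi(e):
    """A toy two-valued fact: is the environment number even?"""
    return e % 2 == 0

def conditional(c):
    """p_c as a tuple indexed by environment: p_c[e] = c[phi(e)][e]."""
    return tuple(c[phi(e)][e] for e in E)

def knows(P):
    """True if P is closed under conditionals on (F, phi)."""
    return all(
        conditional(dict(zip(F, choice))) in P
        for choice in product(P, repeat=len(F))
    )

# A policy is a tuple; policy[e] is the action taken in environment e.
constants = {tuple(a for _ in E) for a in A}
all_policies = set(product(A, repeat=len(E)))
# Policies that depend on e only through phi(e).
parity_policies = {tuple(x if phi(e) else y for e in E) for x in A for y in A}

print(knows(all_policies))     # True
print(knows(parity_policies))  # True
print(knows(constants))        # False
```

An agent restricted to constant policies does not know the fact, because gluing two different constants together along the fact produces a policy outside its set.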

Self Reference

Now, let's consider what happens when an agent tries to know things about itself. For this, we will consider a naturalized agent that is part of the environment. There is a fact $(A, \mathrm{action})$ of the environment that says what action the agent takes, where $A$ is again the set of actions available to the agent, and $\mathrm{action}$ is a function from $E$ to $A$ that picks out what action the agent takes in that environment. Note that $\mathrm{action}$ is exactly the agent's policy, but we are thinking about it slightly differently.

So that things are not degenerate, let's assume that there are at least two possible actions $a$ and $b$ in $A$, and that $P$ contains the constant policies $k_{a}$ and $k_{b}$ that ignore their environment and always output the same thing.

However, we can write down an explicit policy that the agent cannot implement: the policy where the agent takes action $b$ in environments in which it takes action $a$, and takes action $a$ in environments in which it does not take action $a$. The agent cannot implement this policy, since there are no consistent environments in which the agent is implementing this policy. (Again, I am confused by the coulds here, but I am assuming that the agent cannot take an inherently contradictory policy.)

This policy can be viewed as a conditional policy on the fact $(A, \mathrm{action})$. You can construct it as $p_{c}$, where $c$ is the function that maps $a$ to $k_{b}$ and everything else to $k_{a}$. The fact that this conditional policy cannot be in $P$ shows that the agent cannot, by our definition, know its own action.
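The diagonal construction can be spelled out concretely. In this sketch, the environments, the `"weather"` field, and the third action `"c"` are hypothetical; the only thing that matters is that each environment records the action the agent takes in it.

```python
A = ["a", "b", "c"]

# Hypothetical naturalized environments: each one records the agent's own action.
E = [{"weather": w, "action": act} for w in ("sun", "rain") for act in A]

def action(e):
    """The fact (A, action): which action the agent takes in environment e."""
    return e["action"]

def k(x):
    """Constant policy that always outputs x."""
    return lambda e: x

def c(f):
    """Map the fact value a to k_b, and every other value to k_a."""
    return k("b") if f == "a" else k("a")

def p_c(e):
    """The conditional policy e -> c(action(e))(e)."""
    return c(action(e))(e)

# Environments in which the agent's recorded action matches what p_c prescribes.
consistent = [e for e in E if p_c(e) == action(e)]
print(consistent)  # []
```

No environment is consistent with the agent running `p_c`, since `p_c` always prescribes something other than the action the environment says the agent takes.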

Partial Knowledge

As seen above, there are limits to knowledge. This makes me want to aim lower and think about what types of partial knowledge can exist. Perhaps an agent can interact with a fact in nontrivial ways, while still not having complete knowledge as defined above. Here, I will present various ways an agent can have partial knowledge of a fact.

In all of the examples below, we will use a fact $(\{1, 2, 3, 4\}, ϕ)$ about the environment that can take on $4$ states, an action that can take on four values $A = \{ac, ad, bc, bd\}$, and we assume that the agent has access to the constant functions. Think about how all of these types of partial knowledge can be interpreted as changing the subset $P \subseteq A^{E}$ in some way.

Knowing a Coarser Fact: The agent could know a fact that has less detail than the original fact. For example, the agent could know the parity of the fact above. This would mean that the agent can choose a policy to implement on worlds sent to $1$ or $3$, and another policy to implement on worlds sent to $2$ or $4$, but cannot necessarily use any more resolution.

Knowing a Logically Dependent Fact: The agent could, for example, know another fact $(\{1, 2, 3, 4, ⊥\}, ϕ')$ with the property that $ϕ'(e) = ϕ(e)$ whenever $ϕ'(e) \neq ⊥$. The agent can safely condition its policy on the state when $ϕ'$ reports one of $1$ through $4$, but it might also be in a state of uncertainty, in which all it knows is that $ϕ'$ reports $⊥$.

Knowing a Probabilistically Dependent Fact: The agent could, for example, know another fact $(\{1, 2, 3, 4\}, ϕ')$, which is almost the same as the original fact, but is wrong in some small number of environments. The agent cannot reliably implement functions dependent on the original fact, but can correlate its action with the original fact by using this proxy.

Learning a Fact Later in Time: Imagine the agent has to make two independent actions at two different times, and the agent learns the fact after the first action, but before the second. In the above example, the first letter of the action, $a$ or $b$, is the first action, and the second letter, $c$ or $d$, is the second action. The policies are closed under conditionals as long as the different policies in the conditional agree on the first action. This is particularly interesting because it shows how to think of an agent moving through time as a single timeless agent with partial knowledge of the things that it will learn.
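The restricted closure property in the "later in time" case can be checked exhaustively on a small instance. This sketch makes simplifying assumptions that are mine rather than the post's: there is exactly one environment per state of the fact (so $ϕ$ is the identity), and $P$ is taken to be every policy whose first letter does not vary with the environment.

```python
from itertools import product

E = [1, 2, 3, 4]               # hypothetical: one environment per state, phi = identity
A = ["ac", "ad", "bc", "bd"]   # first letter = first action, second letter = second action

def first_is_constant(p):
    """The first action precedes learning the fact, so it cannot vary with e."""
    return len({act[0] for act in p}) == 1

# A policy is a tuple indexed by position in E.
P = {p for p in product(A, repeat=len(E)) if first_is_constant(p)}

def conditional(choice):
    """p_c(e) = c(phi(e))(e), where choice[i] is the policy assigned to state E[i]."""
    return tuple(choice[i][i] for i in range(len(E)))

def agree_on_first(choice):
    """Do all policies in the conditional share the same first action?"""
    return len({p[0][0] for p in choice}) == 1

# For every assignment of a policy in P to each state, the conditional lands
# back in P exactly when the assigned policies share a first action.
matches = all(
    (conditional(choice) in P) == agree_on_first(choice)
    for choice in product(P, repeat=len(E))
)
print(len(P), matches)  # 32 True
```

The loop visits $32^{4}$ assignments, so it takes a moment to run; shrinking `E` makes it faster at the cost of departing from the four-state example.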

Paying Actions to Learn a Fact: Similar to the above example, imagine that an agent will learn the fact, but only if it chooses $a$ in the first round. This corresponds to being closed under conditionals as long as all of the policies always choose $a$ in the first round.

Paying Internal Resources to Learn a Fact: Break the fact up into two parts: the parity of the number, and whether the number is greater than $2$. Imagine an agent that is in an epistemic state such that it could think for a while and learn either of these bits, but cannot learn both in time for when it has to take an action. The agent can make its policy depend on the parity or the size, but not both. Interestingly, this agent has strictly more options than an agent that only knows the parity, but technically does not fully know the parity. This is because adding more options can take away the closure property on the set of policies.
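The claim that adding options can destroy closure is easy to verify by enumeration. The sketch below assumes, for simplicity, one environment per state of the fact; the helper names `policies_depending_on` and `closed_under` are hypothetical.

```python
from itertools import product

E = [1, 2, 3, 4]               # hypothetical: one environment per state of the fact
A = ["ac", "ad", "bc", "bd"]

def parity(e):
    return e % 2

def big(e):
    return e > 2

def policies_depending_on(feature):
    """All policies (tuples indexed by position in E) that factor through `feature`."""
    values = sorted({feature(e) for e in E})
    result = set()
    for choice in product(A, repeat=len(values)):
        table = dict(zip(values, choice))
        result.add(tuple(table[feature(e)] for e in E))
    return result

def closed_under(P, feature):
    """Is P closed under conditionals on the fact `feature`?"""
    values = sorted({feature(e) for e in E})
    for choice in product(P, repeat=len(values)):
        c = dict(zip(values, choice))
        p_c = tuple(c[feature(e)][i] for i, e in enumerate(E))
        if p_c not in P:
            return False
    return True

P_parity = policies_depending_on(parity)
P_either = P_parity | policies_depending_on(big)  # parity or size, never both

print(len(P_parity), len(P_either))    # 16 28
print(P_parity < P_either)             # True: strictly more options
print(closed_under(P_parity, parity))  # True
print(closed_under(P_either, parity))  # False: closure on parity is lost
```

The failure comes from conditionals on parity whose branches are size-dependent policies: the glued policy depends on both bits, so it is in neither half of `P_either`.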

Other Subsets of the Function Space: One could imagine for example starting with an agent that knows the fact, but specifying one specific policy that the agent is not allowed to use. It is hard to imagine this as an epistemic state of the agent, but things like this might be necessary to talk about self reference.

Continuous/Computable Functions: This does not fit with the above example, but we could also restrict the space of policies to e.g. computable or continuous functions of the environment, which can be viewed as a type of partial knowledge.

Confusing Parts

I don't know what the coulds are. It is annoying that our definition of knowledge is tied up with something as confusing as free will. I have a suspicion, however, that this is necessary. I suspect that our trouble with understanding naturalized world models might be coming from trying to understand them on their own, when really they have a complicated relationship with decision theory.

I do not yet have any kind of a picture that unifies this with the other epistemic primitives, like probability and proof, and I expect that this would be a useful thing to try to get.

It is interesting that one way of thinking about what the coulds are is related to the agent being uncertain. In this model, the fact that the agent could take different actions is connected to the agent not knowing what action it takes, which interestingly matches up with the fact that, in this model, if an agent could take multiple actions, it can't know which one it takes.

It seems like an agent could effectively lose knowledge by making precommitments not to follow certain policies. Normal kinds of precommitments like "if you do $X$, I will do $Y$" do not cause the agent to lose knowledge, but the fact that this can happen in theory is weird. Also, it is weird that an agent that can only take one action vacuously knows all things.

It seems like to talk about knowing what you know, you run into some size problems. If the thing you know is treated as a variable that can take different values, that variable lives in the space of subsets of functions from environments to actions, $2^{A^{E}}$, which is much larger than $E$. I think to talk about this you have to start out restricting to some subset of functions from the beginning, or some subset of possible knowledge states.
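To get a feel for the size gap, here is a worked count for a finite toy case. The numbers $|E| = 4$ and $|A| = 2$ are assumptions chosen only for illustration:

$$|A^{E}| = |A|^{|E|} = 2^{4} = 16, \qquad |2^{A^{E}}| = 2^{16} = 65536 \gg 4 = |E|.$$

A fact $ϕ : E \to F$ can take at most $|E|$ distinct values, so even in this tiny case no fact about the environment can separate all of the candidate knowledge states.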

Sorry if this sounds naive, but why try to frame knowledge this way? It seems like you're jumping through a lot of hoops so you can define it in terms of decision theory primitives but it doesn't seem a great fit.

I am confused about decision theory, and entering into a new ontology is a way to maybe look at it in a new way and become less confused. This ontology specifically feels promising to me, but that is hard to communicate.

There is an intuition that if you know what you do, that is because you already decided on your action. However, when you think about proofs, that doesn't work out and you get logical counterfactuals. This ontology feels closer to telling me why if you know your action, you already decided.

Separately, I think decision theory has to either be prior to or simultaneous with epistemics. If you live in a world where you have access to magic if and only if you believe that you can use magic, you should believe you can do magic. You can't do that unless decision theory comes before epistemics.

If you live in a world where you have access to magic if and only if you believe that you can use magic, you should believe you can do magic. You can't do that unless decision theory comes before epistemics.

I think decision theory is for situations where the world judges you based on your decision. (We used to call such situations "fair".) If the world can also judge your internal state, then no decision theory can be optimal, because the world can just punish you for using it.

Do you have any other argument why decision theory should come before epistemics?

1. I think that spurious counterfactuals come from having a proof of what you do before deciding what you do (where "before" is in some weird logical time sense).

2. I think that the justification for having particular epistemics should come from decision/utility theory, like with the complete class theorems.

3. I think the correct response to Sleeping Beauty is to taboo "belief" and talk about what gambles you should take.

4. I think that we have to at some point think about how to think about what to think about, which requires the decision system influencing the epistemics.

#2 and #3 just sound like UDT to me, but #1 and #4 are strong. Thank you! I agree that deciding which theorems to prove next is a great use of decision theory, and would love to see people do more with that idea.

"I think that we have to at some point think about how to think about what to think about"

My inner Eliezer is screaming at me about ultrafinite recursion.

Separately, I think decision theory has to either be prior to or simultaneous with epistemics. If you live in a world where you have access to magic if and only if you believe that you can use magic, you should believe you can do magic. You can't do that unless decision theory comes before epistemics.

I take it you mean to say here you think normative decision theory comes before normative epistemics?

Or are you trying to express a position similar to the one I take but in different language, which is that phenomena come first? I can very much see a way in which this makes sense if we talk about choice as the same thing as intentional experience, especially since from the inside experience feels like making a choice between which possible world (branch) you come to find yourself in.

Yeah, I think I mean normative DT comes before normative epistemics. I guess I have two claims.

The first is that an agent should have its DT system interacting with, or inside its epistemic system in some way. This is opposed to a self-contained epistemic system at the inner core, and a decision system that does stuff based on the results of the epistemic system.

The second is that we are confused about DT and naturalized world models, and I suspect that progress unpacking that confusion can come from abandoning this "epistemics first, decision second" view and working with both at the same time.

See also my response to cousin_it.

Ah, okay, I think that makes a lot of sense. I actually didn't realize viewing things as epistemics first was normal in decision theory, although now that I think about it the way the model moves complexity into the agent to avoid dealing with it naturally is going to cause it to leave questions of where knowledge comes from underaddressed.

As I stated above, I think a choice first approach is also sensible because it allows you to work with something fundamental, choice/interaction/phenomena, rather than something that is created by agents, knowledge. Look forward to where you go with this. Feel free to reach out if you want to discuss this more, as I think you are bumping into things I've just had to go through dealing with from a different perspective to make progress in my own work, but there is likely more to be learned there.

Also, it is weird that an agent that can only take one action vacuously knows all things.

You could want to define not "agent A knows fact $F$", but "agent A can counterfactually demonstrate that it knows fact $F$". So the agent with a single action can't demonstrate anything.

All that we'd need to add to the definition is the fact that there exist policies in $P$ that distinguish elements of $F$, i.e. that for all $f_{i}, f_{j} \in F$ with $f_{i} \neq f_{j}$, there exist $e_{i} \in ϕ^{-1}(f_{i})$ and $e_{j} \in ϕ^{-1}(f_{j})$ and a $p \in P$ with $p(e_{i}) \neq p(e_{j})$.
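The extra condition is straightforward to test on a finite example. This is a sketch only: the four environments, the parity fact, and the function name `distinguishes` are assumptions for illustration, and policies are stored as tuples indexed by environment number.

```python
from itertools import combinations

E = [0, 1, 2, 3]   # hypothetical environments

def phi(e):
    """A toy two-valued fact: the parity of the environment number."""
    return e % 2

F = sorted({phi(e) for e in E})

def preimage(f):
    """All environments that the fact sends to f."""
    return [e for e in E if phi(e) == f]

def distinguishes(P):
    """For every pair of distinct fact values, some policy in P acts differently
    on some pair of environments drawn from their preimages."""
    return all(
        any(p[ei] != p[ej] for p in P for ei in preimage(fi) for ej in preimage(fj))
        for fi, fj in combinations(F, 2)
    )

single_action = {("x", "x", "x", "x")}                       # only one policy available
parity_aware = {(s, t, s, t) for s in "xy" for t in "xy"}    # may depend on parity

print(distinguishes(single_action))  # False
print(distinguishes(parity_aware))   # True
```

The single-action agent still satisfies the closure definition vacuously, but it fails this test, which matches the intuition that it cannot demonstrate knowing anything.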

Meta: The word count is very off on this post. I currently see it as 73K. I am not sure what happened, but I believe:

I made a small edit.

I pressed submit.

It appeared that nothing happened.

I pressed submit a bunch of times.

It still appeared that nothing happened.

I went back, and looked at the post.

The edit was made, but the word count became huge.

Ah, sorry. This was us breaking the word-count for LaTeX equations. We have a fix for this in the works.

Perhaps and agent --> an agent

I think things will come together more if you switch from treating facts as partitions of real external parallel worlds, to partitions of models of the world inside your head. Worlds aren't probabilistic (more or less), but models can be. Proofs don't change what math is true, but they can change what math you need to model as true. Etc.

I agree that the "environment" here should be thought of as the agent's subjective beliefs about the environment. The "coulds" have to be a sort of subjective possibility. I suspect "coulds" are just what cannot be ruled out by an underlying proof process, and probabilities are the "caring function" over the remaining possibilities which allows choice.
