Controlling Constant Programs



This post explains the sense in which UDT and its descendants can control programs with no parameters, without using explicit control variables.

Related to: Towards a New Decision Theory, What a reduction of "could" could look like.

Usually, a control problem is given by an explicit (functional) dependence of the outcome on control variables (together with a cost function over the possible outcomes). The solution then consists in finding the values of the control variables that lead to the optimal outcome. On the face of it, if we are given no control variables, or no explicit dependence of the outcome on control variables, then the problem is meaningless and cannot be solved.
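For contrast with what follows, here is a minimal sketch of a control problem in this usual, explicit form. The names outcome, cost and candidates, and the particular quadratic dependence, are made up for illustration and do not come from any of the posts discussed here.

```python
def outcome(x):
    # Hypothetical explicit dependence of the outcome on the control variable x.
    return (x - 3) ** 2

def cost(o):
    # Hypothetical cost function over outcomes: smaller is better.
    return o

# Solving the problem means searching over values of the control variable.
candidates = range(-10, 11)
best = min(candidates, key=lambda x: cost(outcome(x)))
print(best)  # 3
```

Everything here hinges on outcome() taking a parameter; the rest of the post is about what to do when no such parameter is given.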

Consider what is being controlled in UDT and in the model of control described by Vladimir Slepnev. It might be counterintuitive, but in both cases the agent controls constant programs, in other words programs without explicit parameters. And for constant programs, their output is completely determined by their code, nothing else.

Let's take, for example, Vladimir Slepnev's model of Newcomb's problem, written as follows:

def world():
  box1 = 1000
  box2 = 0 if agent() == 2 else 1000000
  return box2 + (box1 if agent() == 2 else 0)

The control problem that the agent faces is to optimize the output of the program world(), which has no parameters. It might be tempting to say that there is a parameter, namely the sites where agent() is called in the program, but that's not really so: each of these calls can be replaced with the code of the program agent() (which is also a constant program), at which point no single element of the program world() remains that can be called a control variable.
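A minimal runnable sketch of this substitution, in Python. The body of agent() below is a placeholder assumption (an agent that always returns 1), and world_inlined() is an illustrative name for what is left once that body is substituted for each call.

```python
def agent():
    # Placeholder assumption: a constant program that always one-boxes.
    return 1

def world():
    box1 = 1000
    box2 = 0 if agent() == 2 else 1000000
    return box2 + (box1 if agent() == 2 else 0)

def world_inlined():
    # The same program after replacing each call to agent() with its
    # (constant) body. Nothing here is marked as a control variable.
    box1 = 1000
    box2 = 0 if 1 == 2 else 1000000
    return box2 + (box1 if 1 == 2 else 0)

print(world(), world_inlined())  # 1000000 1000000
```

Both functions take no arguments and compute the same constant, so nothing in the text of world_inlined() singles out "the agent's choice".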

To make this point more explicit, consider the following variant of program world():

def world2():
  box1 = 1000
  box2 = 0 if agent2() == 2 else 1000000
  return box2 + (box1 if agent() == 2 else 0)

Here, agent2() is a constant program used to predict the agent's decision. It is known to compute the same output as agent(), but does not, in general, resemble agent() in any other way. If we try to treat only the explicit occurrence of the program agent() as a control variable (either by seeing the explicit program call in this representation of world2(), or by matching the code of agent() if its code was substituted for the call), we end up with an incorrect understanding of the situation, in which the agent is expected to control only its own action, but not the prediction computed by agent2().
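The difference between the two readings can be sketched in Python. The bodies of agent() and agent2() are placeholder assumptions, and world2_naive() and world2_consistent() are illustrative names for the two ways of varying the decision: the first varies only the explicit call site, the second also respects the known fact that agent2() computes the same output as agent().

```python
def agent():
    # Placeholder assumption: the actual agent one-boxes.
    return 1

def agent2():
    # Placeholder predictor: different code, same output as agent().
    return 2 - 1

def world2_naive(action):
    # Only the explicit agent() call is treated as a control variable;
    # the prediction stays whatever agent2() actually computes.
    box1 = 1000
    box2 = 0 if agent2() == 2 else 1000000
    return box2 + (box1 if action == 2 else 0)

def world2_consistent(action):
    # The prediction is varied together with the action, because
    # agent2() is known to compute the same output as agent().
    box1 = 1000
    box2 = 0 if action == 2 else 1000000
    return box2 + (box1 if action == 2 else 0)

naive = {a: world2_naive(a) for a in (1, 2)}
consistent = {a: world2_consistent(a) for a in (1, 2)}
print(naive)       # {1: 1000000, 2: 1001000}
print(consistent)  # {1: 1000000, 2: 1000}
```

The naive reading recommends two-boxing, because it imagines an action of 2 alongside a prediction of 1, a combination that cannot occur given what is known about agent2().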

Against explicit dependence

What the above suggests is that the dependence of the controlled structure on the agent's decision shouldn't be seen as part of the problem statement. Instead, this dependence should be reconstructed from the definition of the agent and the definition of the controlled structure. Relying on an explicit dependence, even if it's given as part of the problem statement, is actually detrimental to the ability to solve the problem correctly. Consider, for example, a third variant of the model of Newcomb's problem, where the agent is told explicitly how its action (decision) is used:

def world3(action):
  box1 = 1000
  box2 = 0 if agent2() == 2 else 1000000
  return box2 + (box1 if action == 2 else 0)

Here, agent2() is the predictor, and whatever action the program agent() computes is passed as a parameter to world3(). Note that the problem is identical to the one given by world2(), but you are explicitly tempted to privilege the dependence of the outcome of world3() on its parameter, which is computed by agent(), over the dependence on the prediction computed by agent2(), even though both are equally important for solving the problem correctly.

To avoid this explicit dependence bias, we can convert the problem given with an explicit dependence to one without, by "plugging in" all parameters, forgetting about the seams, and posing a problem of restoring all dependencies from scratch (alternatively, of controlling the resulting constant program):

def world3'(): 
  return world3(agent())

Now, world3'() can be seen to be equivalent to world2(), after the code of world3() is substituted.
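A quick check of this equivalence, as a hedged Python sketch. make_problem() is a hypothetical helper that builds one instance of the problem for a given fixed decision, the bodies of agent() and agent2() are placeholders, and world3_prime is the name used in the code for the plugged-in constant program.

```python
def make_problem(decision):
    # Hypothetical helper: one problem instance per fixed decision.
    def agent():
        return decision

    def agent2():
        # Placeholder predictor: different code, same output as agent().
        return 3 - (3 - decision)

    def world2():
        box1 = 1000
        box2 = 0 if agent2() == 2 else 1000000
        return box2 + (box1 if agent() == 2 else 0)

    def world3(action):
        box1 = 1000
        box2 = 0 if agent2() == 2 else 1000000
        return box2 + (box1 if action == 2 else 0)

    def world3_prime():
        # world3 with its parameter plugged in: a constant program again.
        return world3(agent())

    return world2, world3_prime

results = {}
for d in (1, 2):
    world2, world3_prime = make_problem(d)
    results[d] = (world2(), world3_prime())
print(results)  # {1: (1000000, 1000000), 2: (1000, 1000)}
```

Whichever constant the agent computes, the two parameterless programs return the same value, which is all that "equivalent" is taken to mean in this sketch.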

Knowable consequences

How can the agent control a constant program? Isn't its output "already" determined? What is its decision about the action for, if the output of the program is already determined?

Note that even if the environment (the controlled constant program) determines its output, it doesn't follow that the agent can figure out what that output is. The agent knows a certain number of logical facts (true statements) and can work on inferring more, but that might not be enough to infer the output of the environment. And every little bit of knowledge helps.

One kind of statement in particular has an unusual property: even though the agent (usually) can't infer the truth of such statements, it can determine their truth any way it likes. These are statements about the agent's decision, such as (agent() == 1). Being able to know their truth by virtue of determining it allows the agent to "infer" the truth of many more statements than it otherwise could, and in particular this could allow it to infer something about the output of the environment (conditional on the decision being so and so). Furthermore, determining which way the statements about the agent go allows the agent to indirectly determine which way the output of the environment goes. Of course, the environment already takes into account the actual decision that the agent will arrive at, but the agent normally doesn't know this decision beforehand.

Moral arguments

Let's consider an agent that reasons formally about the output of the environment, and in particular about the output of the environment given possible decisions of the agent. Such an agent produces (proves) statements in a given logical language and theory, and some of these statements are "calls for action": by proving such a statement, the agent is justified in taking the action associated with it.

For example, with programs world2() and agent() above, where agent() is known to only output either 1 or 2, one such statement is:

[agent()==1 => world2()==1000000] AND [agent()==2 => world2()==1000]

This statement is a moral argument for deciding (agent()==1). Even though one of the implications in the statement must be vacuously true by virtue of its antecedent being false, and so can't say anything about the output of the environment, the implication that follows from the actually chosen action is not vacuous, and therefore choosing that action simultaneously decides the output of the environment.
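A toy sketch of acting on such a statement, in Python. Proof search is not modeled at all: the table MORAL_ARGUMENT is simply assumed to have been proved already. The names MORAL_ARGUMENT, implies and holds are illustrative, and agent2() is a placeholder predictor that computes the same output by calling agent().

```python
# Assumed already proved: action -> output of world2() implied by that action.
MORAL_ARGUMENT = {1: 1000000, 2: 1000}

def agent():
    # Act on the moral argument: take the action with the best consequence.
    return max(MORAL_ARGUMENT, key=MORAL_ARGUMENT.get)

def agent2():
    # Placeholder predictor; any program with the same output would do.
    return agent()

def world2():
    box1 = 1000
    box2 = 0 if agent2() == 2 else 1000000
    return box2 + (box1 if agent() == 2 else 0)

def implies(p, q):
    # Material implication.
    return (not p) or q

# Both implications hold in the actual world: the one for the chosen
# action substantively, the other one vacuously.
holds = all(implies(agent() == a, world2() == u)
            for a, u in MORAL_ARGUMENT.items())
print(agent(), world2(), holds)  # 1 1000000 True
```

By taking action 1, the agent makes the first implication the non-vacuous one, and that is what fixes the output of world2() at 1000000.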

Consequences appear consistent

You might also want to re-read Vladimir Slepnev's post on the point that consequences appear consistent. Even though most of the implications in moral arguments are vacuously true (based on a false premise), the agent can't prove which ones, and correspondingly can't prove a contradiction from their premises. Let's say the agent proves two statements implying different consequences of the same action, such as

[agent()==2 => world2()==1000] and 
[agent()==2 => world2()==2000].

Then, it can also prove that (agent()==2) implies a contradiction, which normally can't happen. As a result, consequences of possible actions appear as consistent descriptions of possible worlds, even though they are not. For example, a one-boxing agent in Newcomb's problem can prove

[agent()==2 => world2()==1000],

but (world2()==1000) is actually inconsistent.
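The same situation seen from the outside, as a small Python sketch. The bodies of agent() and agent2() are placeholder assumptions (a one-boxing agent and a matching predictor), and implies is an illustrative helper. Unlike the agent, this code gets to evaluate agent() directly, which is exactly the knowledge the agent lacks while deciding.

```python
def implies(p, q):
    # Material implication.
    return (not p) or q

def agent():
    # Placeholder assumption: a one-boxing agent.
    return 1

def agent2():
    # Placeholder predictor with the same output.
    return 1

def world2():
    box1 = 1000
    box2 = 0 if agent2() == 2 else 1000000
    return box2 + (box1 if agent() == 2 else 0)

# Vacuously true: the antecedent agent() == 2 is false.
first = implies(agent() == 2, world2() == 1000)
# Just as true, for the same reason, despite the conflicting consequent.
second = implies(agent() == 2, world2() == 2000)
# The consequent on its own is false in the actual world.
consequent = world2() == 1000
print(first, second, consequent)  # True True False
```

Both implications are true here, but an agent that proved both would thereby have proved that it does not take action 2, which it normally cannot do before deciding.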

This suggests a reduction of the notion of (impossible) possible worlds (counterfactuals) generated by the possible actions of the agent that are not taken. Just like explicit dependencies, such possible worlds can be restored from the definition of the (actual) environment, instead of being explicitly given as part of the problem statement.

Against counterfactuals

Since it turns out that both dependencies and counterfactual environments are implicit in the actual environment, they don't need to be specified in the problem statement. Furthermore, if counterfactual environments are not specified, their utility doesn't need to be specified either: it's sufficient to specify the utility of the actual environment.

Instead of valuing counterfactuals, the agent can just value the actual environment, with "value of counterfactuals" appearing as an artifact of the structure of moral arguments.

Ambient control

I call this flavor of control ambient control, and correspondingly the decision theory studying it, ambient decision theory (ADT). This name emphasizes how the agent controls the environment not from any given location, but potentially from anywhere, through the patterns whose properties can be inferred from statements about agent's decisions. A priori, the agent is nowhere and everywhere, and some of its work consists in figuring out how exactly it exerts its control.

Prior work

In discussions of Timeless decision theory, Eliezer Yudkowsky suggested the idea that the agent's decision could control the environment through more points than the agent's "actual location", which led to the question of finding the right configuration of explicit dependencies, or counterfactual structure, in the problem statement (described by Anna Salamon in this post).

Wei Dai first described a decision-making scheme involving automatically inferring such dependencies (and control of constant programs, hence an instance of ambient control) known as Updateless decision theory. In this scheme, the actual inference of logical consequences was extracted as an unspecified "mathematical intuition module".

Vladimir Slepnev figured out an explicit proof-of-concept algorithm that successfully performs a kind of ambient control to solve the problem of cooperation in the Prisoner's Dilemma. The importance of this algorithm is in showing that the agent can in fact successfully prove the moral arguments necessary to make a decision in this nontrivial game theory problem. He then abstracted the algorithm and discussed some of the more general properties of ambient control.
