This post focuses on the origin of goals and utility, exploring how they can be derived from precedents. With utility functions, ACI can be compared with other models of intelligence, such as the rational agent, active inference, and value learning approaches.
ACI states that intelligent agents try to behave in the same way as past behaviors that were doing the right thing. In the previous chapter, we delved into what constitutes doing the right thing. This chapter offers a more formal analysis of what it means to behave in the same way as precedents.
(The previous version had many errors, so I have rewritten this chapter.)
Keep Doing the Right Thing
Let us consider an agent G that has been doing the right thing. It is reasonable to anticipate that if G keeps behaving in the same way, it will likely continue doing the right things in the environments it has experienced.
In the ACI model, the sequence of actions and observations performed by agent G while doing the right thing is denoted as the precedent. G wants to (or is made to) continue doing the right thing, which means extending the precedent sequence with new data points.
In other words, G's goal is its best future, the one with the highest probability of doing the right things, while the probability of doing the right things is the utility of a future. As the most probable continuation of the precedent string, a goal should resemble the precedent as closely as possible.
However, special mathematical tools are required to compare two sequences that have different lengths.
From Precedent to Goals
An agent G interacts with an unknown environment in time cycles $k = 1, 2, 3, \ldots, t$. In cycle $k$, $x_k$ is the perception (input) from the environment, and $y_k$ is the action (output) of the agent.
Define the agent's interaction History as $h_{<t} \equiv x_1 y_1 x_2 y_2 \ldots x_{t-1} y_{t-1}$, while a possible World with history $h_{<t}$ is $w_{<n} \equiv x_1 y_1 x_2 y_2 \ldots x_{t-1} y_{t-1} \ldots x_{n-1} y_{n-1}$. Worlds are stratified by histories.
Let $H$ be the set of histories, and $W$ be the set of worlds. For any $h \in H$, there is a subset $W_h \subset W$ consisting of all worlds with history $h$ (Armstrong 2018).
Define the Judgment Function as a function from a world or a history to 1 or 0:
$$J : W \cup H \to \{0, 1\}$$
A history of doing the right things should have $J(h) = 1$.
We can define a Precedent as a history of doing the right things, every prefix of which was also doing the right things.
Definition 1 (Precedent). A precedent is a history $h^*$ such that
$$\forall h^*_{<k} \subseteq h^*, \quad J(h^*_{<k}) \equiv 1$$
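As a minimal sketch of Definition 1 (the pair encoding and the judgment rule below are made-up stand-ins, not part of ACI), a history is a precedent exactly when every prefix of it is judged right:

```python
# Toy illustration of Definition 1: a history is a precedent iff every
# prefix of it is judged to be "doing the right thing" (J = 1).
# The judgment function here is an arbitrary stand-in for the real J.

def is_precedent(history, judge):
    """history: list of (x, y) perception/action pairs.
    judge: function mapping a history prefix to 0 or 1."""
    # Check J on every prefix h_{<k}, up to and including the full history.
    return all(judge(history[:k]) == 1 for k in range(1, len(history) + 1))

# Stand-in judgment: "right" as long as each action copies its perception.
judge = lambda h: 1 if all(x == y for x, y in h) else 0

h_good = [("a", "a"), ("b", "b")]   # every prefix judged 1 -> precedent
h_bad  = [("a", "a"), ("b", "c")]   # second prefix judged 0 -> not one
```

Note that the prefix condition does real work here: a history whose final step is judged right but whose earlier steps were not is still disqualified.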
For an ACI agent, the precedent contains all the information we have about what is right; thus it sets the standard for right things. A right future world that will become a precedent should meet this standard.
At time $m$, the precedent might be the whole history $h_{<m}$ of the agent, but it might also be only a part of it, $h^*_{<t} \subset h_{<m}$.
A goal can be defined as a future world that has the highest likelihood of becoming a precedent. In other words, a goal represents a future world where the right things are most likely to be done.
Definition 2 (Goal). At time $m$, given a precedent $h^*_{<t}$, the goal for time $n$ ($n > m \ge t$) is a world $w^*_{<n} \in W_{h_{<m}} \subseteq W_{h^*_{<t}}$ such that
$$\forall w_{<n} \in W_{h_{<m}}, \quad P(J(w^*_{<n}) = 1 \mid w^*_{<n}) \ge P(J(w_{<n}) = 1 \mid w_{<n})$$
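Definition 2 can be sketched as a simple selection: among candidate future worlds, pick the one with the highest probability of being judged right. The candidate worlds and their probabilities below are made-up numbers for illustration:

```python
# Toy sketch of Definition 2: the goal is the candidate future world
# with the highest probability of being judged right, P(J(w) = 1 | w).
# The worlds and probabilities are made-up numbers, not ACI output.

def goal(candidates):
    """candidates: dict mapping a world (any hashable) to P(J(w) = 1 | w)."""
    return max(candidates, key=candidates.get)

worlds = {
    "keep following the precedent": 0.80,
    "deviate slightly":             0.55,
    "deviate wildly":               0.10,
}
```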
A goal should have the following properties:
- An agent may have multiple, possibly infinitely many goals, because at every future time $n$ there will be a $w^*_{<n}$. Compromising between multiple goals can be difficult.
- A goal may not represent the right future even if it can be achieved. It has the highest likelihood of being the right future, but it may still be proven wrong.
- When a goal $w^*$ is achieved, the agent may or may not receive notification of whether it has become an $h^*$, in other words, whether things actually were right or wrong. The information about right or wrong might be presented in any form, including real-time feedback and delayed notification. For example, a video game player may only be notified at the end of a round whether they have won or lost. As a universal model of intelligence, ACI determines what is right without relying on any particular mechanism, be it natural selection or artificial control.
That's why an agent can't act directly on goals. It's more conventional to use expected utility to describe the behavior of an agent.
However, there is a simple and intuitive theorem about goals:
Theorem 1: An agent's goal given a precedent $h^*$ equals the most probable continuation of the precedent sequence.
The proof is given in the appendix at the end of the post. With this theorem, the goal calculation problem turns into a sequence prediction task. Following Hutter's AIXI, ACI uses Solomonoff Induction as an all-purpose sequence prediction tool. Solomonoff Induction considers all possible hypotheses about a sequence, and continuously updates the estimated probability of each hypothesis.
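Solomonoff Induction itself is uncomputable, but the underlying idea of a complexity-weighted Bayesian mixture can be illustrated with a tiny, computable hypothesis class. Everything below (the three hypotheses, their complexity values) is a toy stand-in, not an implementation of AIXI or ACI:

```python
from fractions import Fraction

# Toy complexity-weighted Bayesian mixture over a finite hypothesis class.
# Each hypothesis assigns a probability to the next bit given the sequence
# so far; its prior weight 2^-k decays with a made-up "complexity" k.
# Real Solomonoff Induction sums over all programs and is uncomputable.

HYPOTHESES = {
    # name: (complexity k, predictor giving P(next bit = 1 | sequence))
    "always-1":  (1, lambda s: Fraction(1)),
    "always-0":  (1, lambda s: Fraction(0)),
    "fair-coin": (2, lambda s: Fraction(1, 2)),
}

def posterior(seq):
    """Posterior weight of each hypothesis after observing seq (list of bits)."""
    weights = {}
    for name, (k, pred) in HYPOTHESES.items():
        w = Fraction(1, 2 ** k)           # complexity prior 2^-k
        for i, bit in enumerate(seq):     # multiply in each bit's likelihood
            p1 = pred(seq[:i])
            w *= p1 if bit == 1 else 1 - p1
        weights[name] = w
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}

def predict_next(seq):
    """Mixture probability that the next bit is 1."""
    post = posterior(seq)
    return sum(post[n] * HYPOTHESES[n][1](seq) for n in HYPOTHESES)
```

After observing a run of ones, the mixture concentrates on "always-1" and predicts another one with high probability, which is the "continuously updates" behavior described above in miniature.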
From Goals to Utility
People prefer thinking in goals, but working with utilities. A utility function is defined as a function from worlds to real numbers:
$$U : W \to \mathbb{R}$$
Because a goal should be assigned the highest expected utility among all possible worlds, and a goal is defined as the future world with the highest likelihood of becoming a precedent (doing right things), it is reasonable to define expected utility as the likelihood of becoming a precedent.
Definition 3 (Expected Utility). The expected utility of any possible world $w_{<n} \in W_{h^*_{<t}}$ is its probability of doing the right thing:
$$U_{h^*_{<t}}(w_{<n}) \equiv P(J(w_{<n}) = 1 \mid w_{<n})$$
In other words, the utility of a world is its probability of doing the right thing, given that a known precedent was doing the right thing.
It is easy to prove that ACI's definition of utility satisfies the four axioms of VNM-rationality: completeness, transitivity, continuity, and independence.
We can also define the total expected utility as value:
Definition 4 (Value). The total expected utility, or value, for a policy $\pi$, history $h_{<n}$, and precedent $h^*_{<t} \subseteq h_{<n}$ is:
$$V(h^*_{<t}, \pi, h_{<n}) = E^{\pi}_{h^*_{<t}}(h_{<n}) = \int_{w \in W_{h^*_{<t}}} U_{h^*_{<t}}(w) P(w \mid h_{<n})$$
where a policy $\pi$ for an agent is a map from histories to a probability distribution over actions, $\pi : H \to \Delta A$.
We can then define a reward function as the difference between two total expected utilities (Armstrong 2018):
Definition 5 (Reward). The reward between two histories $h_{<m} \subset h_{<n}$ for a policy $\pi$ and precedent $h^*_{<t} \subseteq h_{<n}$ is:
$$R(h^*_{<t}, \pi, h_{<n}, h_{<m}) = V(h^*_{<t}, \pi, h_{<n}) - V(h^*_{<t}, \pi, h_{<m})$$
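Under the strong simplifying assumption of a small finite set of worlds, the integral in Definition 4 becomes a weighted sum, and Definition 5 is a difference of two such sums. A minimal sketch with made-up utilities and conditional probabilities:

```python
# Toy sketch of Definitions 4 and 5 over a finite world set.
# value(h) = sum over worlds of U(w) * P(w | h); reward is the
# difference of values between a longer and a shorter history.
# All utilities and conditional probabilities are made-up numbers.

def value(utilities, cond_probs):
    """utilities: {world: U(w)}; cond_probs: {world: P(w | h)}."""
    return sum(utilities[w] * cond_probs[w] for w in utilities)

def reward(utilities, p_longer, p_shorter):
    """Definition 5: V at the longer history minus V at the shorter one."""
    return value(utilities, p_longer) - value(utilities, p_shorter)

utilities = {"w1": 0.9, "w2": 0.4, "w3": 0.1}

# P(w | h_m): at the shorter history, probability is spread out.
p_given_hm = {"w1": 0.3, "w2": 0.4, "w3": 0.3}
# P(w | h_n): the longer history concentrates probability on w1.
p_given_hn = {"w1": 0.7, "w2": 0.2, "w3": 0.1}
```

Here the reward is positive because the longer history shifted probability toward the high-utility world, matching the intuition that reward measures progress in value.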
Conclusion
Theorem 1 can be roughly expressed as: an agent always tries to behave as closely to the precedent as possible. However, it does not have to do this consciously, just as organisms do not have to consciously think "I am going to survive and reproduce." They can employ a combination of policies to achieve maximum value.
FAQs
Q: OK, following the precedent might be right, but what if the agent lives in a carefree scenario, where doing everything is right?
A: If everything is right, the agent is more likely to follow simple policies than complex ones, so the precedent is highly likely to be a simple sequence, such as repeating a single action or mere reflexes to the environment. Conversely, if we find fairly complicated structures in the precedent, it is highly unlikely that the agent is in a carefree situation.
Q: With well-defined utility functions, should ACI maximize the expected utility like a rational agent or AIXI?
A: Not really. In relatively stable environments, rational agents may serve as acceptable approximations of ACI agents. However, they are likely to encounter alignment problems when faced with unforeseen scenarios:
- Once the precedent receives new data points, the utility function changes, rendering it unsuitable for straightforward optimization.
- Up until this point, we have been discussing ideal ACI agents endowed with unlimited computational power and storage, able to achieve any possible future world they want. However, real-world agents cannot execute Solomonoff Induction due to its inherent uncomputability. Only a constrained version of ACI, known as ACItl, can be implemented on practical computers. Whenever an ACItl agent improves its level of performance, its approximation of the utility functions changes as well.
In the next post, we will demonstrate how the ACItl model provides solutions to the alignment problem.
Appendix: Building Utility Functions Using Solomonoff Induction
According to Solomonoff Induction, the probability that $w$ is the future of the precedent sequence $h^*$, taking all hypotheses into account, is:
$$M(w_{<n} = h^*_{<n} \mid h^*_{<t}) = M(h^*_{<n}) / M(h^*_{<t})$$
where $M(h^*)$ is the precedent's prior probability when we take all hypotheses into account:
$$M(x) \equiv \sum_{\mu \in M_R} Q^{-H(\mu)} \mu(x)$$
Here $\mu$ is a semi-measure that assigns probabilities to sequences $x$, $M_R$ is the set of all recursive semi-measures, $Q$ is the number of symbols in the sequences' alphabet, and $H(\mu)$ is the length of the shortest program that computes $\mu$ (Legg 1996).
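A toy numeric instance of the prior $M$ and of the conditional probability from the equation above, with a two-element stand-in for the set of all recursive semi-measures and made-up program lengths $H(\mu)$:

```python
from fractions import Fraction

# Toy instance of M(x) = sum over mu of Q^-H(mu) * mu(x), with a binary
# alphabet (Q = 2) and only two stand-in "semi-measures". Real Solomonoff
# Induction sums over all recursive semi-measures and is uncomputable.

Q = 2

def mu_uniform(x):
    """Assigns probability 2^-len(x) to every binary string."""
    return Fraction(1, 2 ** len(x))

def mu_ones(x):
    """Assigns probability 1 to all-ones strings, 0 otherwise."""
    return Fraction(1) if all(b == 1 for b in x) else Fraction(0)

# Made-up shortest-program lengths H(mu) for the two hypotheses.
HYPS = [(3, mu_uniform), (2, mu_ones)]

def M(x):
    """Complexity-weighted mixture prior over sequences."""
    return sum(Fraction(1, Q ** h) * mu(x) for h, mu in HYPS)

def M_cond(x_long, x_short):
    """Probability that x_long continues x_short: M(x_long) / M(x_short)."""
    return M(x_long) / M(x_short)
```

Even in this two-hypothesis toy, extending an all-ones string with another one keeps most of the prior mass, because the simpler "all ones" hypothesis dominates the mixture.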
We cannot use this equation to predict the future precedent directly, because there might be more than one possible right choice, in contrast to a sequence, which has only one continuation.
Let us consider a sequence $J^+$ which appends a judgment variable $j_k$ after every step $k$ of a history or world sequence. For example:
$$J^+(h^*_{<t}) \equiv x_1 y_1 1 \, x_2 y_2 1 \ldots x_{t-1} y_{t-1} 1$$
$$J^+(h_{<t}) \equiv x_1 y_1 j_1 \, x_2 y_2 j_2 \ldots x_{t-1} y_{t-1} j_{t-1}$$
and for $w_{<n} \in W_{h^*_{<t}}$,
$$J^+(w_{<n}) \equiv x_1 y_1 1 \ldots x_{t-1} y_{t-1} 1 \, x_t y_t j_t \ldots x_{n-1} y_{n-1} j_{n-1}$$
If $j_{n-1} = J(w_{<n}) = 1$ (and then all $j$s from $j_t$ to $j_{n-1}$ equal 1), $w_{<n}$ is a world of doing the right thing. Thus the problem of utility becomes a problem of sequence prediction; the utility of $w_{<n}$ is the probability that $j_{n-1} = 1$:
$$U_{h^*_{<t}}(w_{<n}) = P(J^+(w_{<n}) \cap J(w_{<n}) = 1) / P(J^+(w_{<n})) = P(J(w_{<n}) = 1 \mid J^+(w_{<n}))$$
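The $J^+$ construction can be sketched concretely: flatten the sequence and append a judgment bit after every $(x, y)$ step. The symbols below are toy stand-ins:

```python
# Toy sketch of the J+ construction: after each (x, y) step of a
# sequence we append the judgment bit j_k, yielding the flattened
# string x1 y1 j1 x2 y2 j2 ... The symbols are illustrative stand-ins.

def j_plus(steps, judgments):
    """steps: list of (x, y) pairs; judgments: list of j_k bits."""
    out = []
    for (x, y), j in zip(steps, judgments):
        out.extend([x, y, j])
    return out

# A precedent's augmented sequence has every judgment bit equal to 1.
precedent_plus = j_plus([("a", "b"), ("c", "d")], [1, 1])
```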
Now we can prove Theorem 1, that a goal given a precedent $h^*$ equals the most probable continuation of the precedent sequence.
Let $w'_{<n}$ be one of the $w_{<n} \in W_{h^*_{<t}}$ with the highest probability of being the continuation of the precedent sequence, which means:
$$\forall w_{<n} \in W_{h^*_{<t}}, \quad P(w'_{<n} \mid h^*_{<t}) \ge P(w_{<n} \mid h^*_{<t})$$
and because $w_{<n} \in W_{h^*_{<t}}$,
$$P(w'_{<n}) \ge P(w_{<n})$$
We also know that all the $j$s in $J^+(w'_{<n})$ and $J^+(w_{<n})$ equal 1; this could be the output of a program of fixed length, which has a fixed effect on the prior probability of a sequence, so:
$$P(J^+(w_{<n}) \cap J(w_{<n}) = 1) = P(w_{<n}) - C_1$$
$$P(J^+(w_{<n})) = P(w_{<n}) - C_2$$
with $C_1 > C_2$. Then we have:
$$\forall w_{<n} \in W_{h^*_{<t}}, \quad P(J^+(w'_{<n}) \cap J(w'_{<n}) = 1) / P(J^+(w'_{<n})) \ge P(J^+(w_{<n}) \cap J(w_{<n}) = 1) / P(J^+(w_{<n}))$$
which equals
$$U_{h^*_{<t}}(w'_{<n}) \ge U_{h^*_{<t}}(w_{<n})$$