Universal agents and utility functions

[-]Kawoomba13y50

How do your proposed changes to the utility function affect e.g. the proof of AIXI's Pareto optimality (alluded to on p. 30 of the paper) and other provable properties of AIXI?

[-]Anja13y40

I am quite sure that Pareto optimality is untouched by the proposed changes, but I haven't written down a proof yet.

[-]timtyler13y20

Is there a way to bind the optimization process to actual patterns in the environment? To design a framework in which the screen informs the agent about the patterns it should optimize for? The answer is, yes, we can just define a utility function that assigns a value to every possible future history and use it to replace the reward system in the agent specification [...]

If only the problem were that easy. Telling an agent to optimise a utility function over external world states, rather than a reward function, runs into the issue of how you tell a machine the difference between real and apparent utility when all it has to go on is sensory data.

It isn't easy to get this right when you have a superintelligent agent working to drive a wedge between your best efforts, and the best possible efforts.

[-]timtyler13y20

I don't think infinitesimal utilities are really Nick Bostrom's idea. To quote from me in 2009:

As I think I already mentioned, if you use surreal numbers to represent utility, you don't need to do any discounting - since then you can use infinite (and infinitesimal) numbers - and they can represent the concept that no amount of A is worth B just fine. The need for surreal numbers in decision theory was established by Conway over three decades ago, in his study of the game of Go.

[-]lukeprog13y-10

But early versions of Bostrom's "Infinite Ethics" paper have been online since at least May 2004.

[-]timtyler13y00

Er, I wasn't trying to take credit. I was saying the idea dates back to Conway. Decision theory was the motivation behind the invention of surreal numbers in the first place.

[-]Eliezer Yudkowsky13y30

Game theory motivated surreal numbers. Game theory != decision theory.

[+]timtyler13y-60

[-]lukeprog13y00

Gotcha.

[-]Manfred13y10

Could you explain more about why you're down on agent 1, and think agent 2 won't wirehead?

My first impression is that agent 1 will take its expected changes into account when trying to maximize the time-summed (current) utility function, and so it won't just purchase options it will never use, or similar "dumb stuff." On the other topic, the only way agent 2 can't wirehead is if there's no possible way for it to influence its likely future utility functions - otherwise it'll act to increase the probability that it chooses big, easy utility functions, and then it will choose those same big, easy utility functions, and then it's wireheaded.

[-]Anja13y00

I am pretty sure that Agent 2 will wirehead on the Simpleton Gambit, depending heavily on the number of time cycles to follow, the comparative advantage that can be gained from wireheading and the negative utility the current utility function assigns to the change.

Agent 1 will have trouble modeling how its decision to change its utility function now will influence its own decisions later, as described in AIXI and existential despair. So basically the two futures look very similar to the agent, except for the part where the screen says something different, and then it all comes down to whether the utility function has preferences over that particular fact.

[-]Manfred13y10

Agent 1 will have trouble modeling how its decision to change its utility function now will influence its own decisions later,

Ah, right, that abstraction thing. I'm still fairly confused by it. Maybe a simple game will help see what's going on.

The simple game can be something like a two-step choice. At time T1, the agent can send either A or B. Then at time T2, the agent can send A or B again, but its utility function might have changed in between.

For the original utility function, our payoff matrix looks like AA: 10, AB: -1, BA: 0, BB: 1. So if the utility function didn't change, the agent would just send A at time T1 and A at time T2, and get a reward of 10.

But suppose in between T1 and T2, a program predictably changes the agent's payoff matrix, as stored in memory, to AA: -1, AB: 10, BA: 0, BB: 1. Now if the agent sent A at time T1, it will send B at time T2, to claim the new payoff for AB of 10 units. Even though AB is lowest on the preference ordering of the agent at T1. So if our agent is clever, it sends B at time T1 rather than A, knowing that the future program will also pick B, leading to an outcome (BB, for a reward of 1) that the agent at T1 prefers to AB.
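The two-step game above can be sketched in a few lines of Python. This is only a toy illustration, not a model of AIXI: the function names (`second_move`, `naive_first_move`, `sophisticated_first_move`, `play`) are made up for this sketch, and it assumes the agent at T2 always acts on the modified matrix while outcomes are scored with the original one.

```python
# Payoff matrices from the game above; keys are (action at T1, action at T2).
ORIGINAL = {("A", "A"): 10, ("A", "B"): -1, ("B", "A"): 0, ("B", "B"): 1}
MODIFIED = {("A", "A"): -1, ("A", "B"): 10, ("B", "A"): 0, ("B", "B"): 1}
ACTIONS = ("A", "B")


def second_move(first, payoff):
    """Action chosen at T2 by an agent that maximizes `payoff` given the T1 action."""
    return max(ACTIONS, key=lambda a: payoff[(first, a)])


def naive_first_move():
    """T1 choice if the agent assumes its T2 self still uses ORIGINAL."""
    return max(ACTIONS, key=lambda a: ORIGINAL[(a, second_move(a, ORIGINAL))])


def sophisticated_first_move():
    """T1 choice if the agent predicts its T2 self under MODIFIED,
    but still scores the outcome with ORIGINAL (its current preferences)."""
    return max(ACTIONS, key=lambda a: ORIGINAL[(a, second_move(a, MODIFIED))])


def play(first):
    """What actually happens: the T2 self acts on MODIFIED; we report ORIGINAL utility."""
    second = second_move(first, MODIFIED)
    return first + second, ORIGINAL[(first, second)]


naive_result = play(naive_first_move())
sophisticated_result = play(sophisticated_first_move())
print(naive_result, sophisticated_result)
```

Under these assumptions the naive agent ends up at AB, its least preferred outcome, while the agent that anticipates the change settles for BB.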

So, is our AIXI Agent 1 clever enough to do that?

[-]Anja13y00

I would assume that it is not smart enough to foresee its own future actions and is therefore dynamically inconsistent. The original AIXI does not allow for the agent to be part of the environment. If we tried to relax the dualism then your question depends strongly on the approximation to AIXI we would use to make it computable. If this approximation can be scaled down in a way such that it is still a good estimator for the agent's future actions, then maybe an environment containing a scaled down, more abstract AIXI model will, after a lot of observations, become one of the consistent programs with lowest complexity. Maybe. That is about the only way I can imagine right now that we would not run into this problem.

[-]Manfred13y00

Thanks, that helps.

[-]timtyler13y10

Agent 1 will have trouble modeling how its decision to change its utility function now will influence its own decisions later, as described in AIXI and existential despair.

Be warned that that post made practically no sense - and surely isn't a good reference.

[-]AlexMennen13y00

Comparing the three proposed agents, we notice that Agent 1 is dynamically inconsistent: it will optimize for future opportunities, that it predictably will not take later.

This seems so flawed as to be pretty much useless. Specification for an agent that optimizes for its current utility function under the knowledge that its utility function will change:

First, replace the action-perception sequence with an action-perception-utility sequence u1,y1,x1,u2,y2,x2,etc. Let the action-generating function be represented by action(k), where k is the step. This will make use of a recursive helper function modeled_action(n, k), representing what it thinks it will do in the future, where n-k is the number of steps forward it looks.

action(k) = modeled_action(m_k, k).

modeled_action(k, k) = argmax(y_k) u_k(yx_<k, yx_k)*M(uyx_<k, uyx_k)

for n>k: modeled_action(n, k) = argmax(y_k) u_k(yx_<k, yx_k:n)*M(uyx_<k, uyx_k:n)

Apologies for the lack of LaTeX.

[-]Anja13y00

First, replace the action-perception sequence with an action-perception-utility sequence u1,y1,x1,u2,y2,x2,etc.

This seems unnecessary. The information u_i is already contained in x_i.

modeled_action(n, k) = argmax(y_k) u_k(yx_<k, yx_k:n)*M(uyx_<k, uyx_k:n)

This completely breaks the expectimax principle. I assume you actually mean something like

$\textrm{modeled\_action}(n,k) = \arg\max_{y_k}\sum_{x_k} u_k(\dot{y}\dot{x}_{<k}\,y\underline{x}_{k:n})\,M(\dot{y}\dot{x}_{<k}\,y\underline{x}_{k:n})$

which is just Agent 2 in disguise.

[-]AlexMennen13y00

Oops. Yes, that's what I meant. But it is not the same as Agent 2, because this (Agent 4?) uses its current utility function to evaluate the desirability of future observations and actions, even though it knows that it will use a different utility function to choose between them later. For example, Agent 4 will not take the Simpleton's Gambit because it cares about its current utility function getting satisfied in the future, not about its future utility function getting satisfied in the future.

Agent 4 can be seen as a set of agents, one for each possible utility function, that are using game theory with each other.

[-]Anja13y00

I second the general sentiment that it would be good for an agent to have these traits, but if I follow your equations I end up with Agent 2.

[-]AlexMennen13y30

No, you don't. If you tried to represent Agent 2 in that notation, you would get

modeled_action(n, k) = argmax(y_k) sum(x_k) [u_k(yx_k.

You were using u_k to represent the utility of the last step of its input, so that total utility is the sum of the utilities of its prefixes, while I was using u_k to represent the utility of the whole sequence. If I adapt Agent 4 to your use of u_k, I get

modeled_action(n, k) = argmax(y_k) sum(x_k) [u_k(yx_k.

[-]Anja13y40

I am starting to see what you mean. Let's stick with utility functions over histories of length m_k (whole sequences) like you proposed and denote them with a capital U to distinguish them from the prefix utilities. I think your Agent 4 runs into the following problem: modeled_action(n,m) actually depends on the actions and observations yx_{k:m-1} and needs to be calculated for each combination, so y_m is actually

$y_m(\dot{y}\dot{x}_{<k}\,y\underline{x}_{k:m-1})$

which clutters up the notation so much that I don't want to write it down anymore.

We also get into trouble with taking the expectation, the observations x_{k+1:n} are only considered in modeling the actions of the future agents, but not now. What is M(yx_<k,yx_k:n) even supposed to mean, where do the x's come from?

So let's torture some indices:

$\hat{y}_{n,k}(yx_{1:n-1}) = \arg\max_{y_n}\sum_{x_{n:m_k}} U_n(yx_{1:n}\,\hat{y}_{n+1,k}(yx_{1:n})\,x_{n+1}\dots$

$\dots\hat{y}_{m_k,k}(yx_{1:m_k-1})\,x_{m_k})\,M(\dot{y}\dot{x}_{<k}\,yx_{k:n-1}\,\hat{y}\underline{x}_{n:m_k})$

where n>=k and $\dot{y}_k=\hat{y}_{k,k}$.

This is not really AIXI anymore and I am not sure what to do with it, but I like it.
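To make the recursion a bit more concrete, here is a minimal Python sketch of the same idea on a tiny instance. It is a toy under loud assumptions, not the definition above: the horizon is fixed at m_k = 2 for every k (so the second index of the modeled action is dropped), M is a uniform distribution over a binary observation, and the whole-history utilities U_1 and U_2 reuse the payoff matrices of the two-step A/B game discussed earlier in the thread. All names (`PAYOFF`, `U`, `M`, `rollout_value`, `modeled_action`) are hypothetical.

```python
ACTIONS = ("A", "B")
OBSERVATIONS = (0, 1)
HORIZON = 2  # plays the role of m_k, assumed constant here

# Whole-history utilities: PAYOFF[n] is the utility function held at step n.
PAYOFF = {
    1: {"AA": 10, "AB": -1, "BA": 0, "BB": 1},
    2: {"AA": -1, "AB": 10, "BA": 0, "BB": 1},
}


def U(step, history):
    """Utility that the agent at `step` assigns to a complete history."""
    return PAYOFF[step]["".join(y for y, _ in history)]


def M(history):
    """Toy environment model: every observation is an independent fair coin."""
    return (1 / len(OBSERVATIONS)) ** len(history)


def rollout_value(owner, step, history):
    """Sum over future observations of U_owner(h) * M(h), where every later
    action is the one the agent at that later step is modeled to take."""
    if step > HORIZON:
        return U(owner, history) * M(history)
    y = modeled_action(step, history)
    return sum(rollout_value(owner, step + 1, history + ((y, x),))
               for x in OBSERVATIONS)


def modeled_action(n, history):
    """Action at step n: maximizes U_n, given how later agents are modeled to act."""
    def value(y):
        return sum(rollout_value(n, n + 1, history + ((y, x),))
                   for x in OBSERVATIONS)
    return max(ACTIONS, key=value)


first = modeled_action(1, ())
second = modeled_action(2, ((first, 0),))
print(first, second)
```

The point of the sketch is only the shape of the recursion: each step's action is chosen with that step's own utility function, while the current agent evaluates the resulting history with its current one.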

[-]AlexMennen13y20

so y_m is actually [...] which clutters up the notation so much that I don't want to write it down anymore.

Yes.

We also get into trouble with taking the expectation, the observations x_{k+1:n} are only considered in modeling the actions of the future agents, but not now. What is M(yx_<k,yx_k:n) even supposed to mean, where do the x's come from?

Oops, you are right. The sum should have been over x_{k:n}, not just over x_k.

So let's torture some indices: [...]

Yes, that is a cleaner and actually correct version of what I was trying to describe. Thanks.

[-]AlexMennen13y00

It looks like AIXI is already dynamically inconsistent, since it assumes that on step k+1, it will look m_k - (k+1) steps ahead, when it will in fact look m_(k+1) - (k+1) steps ahead. I suppose if the utility of a prefix of a string is a good heuristic for the utility of the whole string, this isn't a huge problem?
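A toy way to see this kind of inconsistency, sketched in Python. Everything here is an assumption made up for illustration and has nothing AIXI-specific in it: the environment is a pot that grows by 1 each time the agent waits and pays out once when the agent takes it, and the names `plan_value` and `run` are hypothetical. An agent whose lookahead slides forward with k always plans to take the pot at the end of its current window, and then never does.

```python
def plan_value(pot, steps_left):
    """Best total reward obtainable in `steps_left` steps, plus the first action."""
    if steps_left == 0:
        return 0, None
    take = pot                                      # cash in now, episode ends
    wait, _ = plan_value(pot + 1, steps_left - 1)   # let the pot grow one more step
    return (wait, "wait") if wait > take else (take, "take")


def run(total_steps, lookahead):
    """Replan at every step with the horizon given by lookahead(step, total_steps)."""
    pot = 0
    for step in range(total_steps):
        _, action = plan_value(pot, lookahead(step, total_steps))
        if action == "take":
            return pot
        pot += 1
    return 0  # never cashed in


# Sliding window: always looks 3 steps ahead, so m_(k+1) differs from m_k.
sliding = run(10, lambda step, total: 3)
# Fixed end point: the horizon shrinks as the final step approaches.
fixed = run(10, lambda step, total: total - step)
print(sliding, fixed)
```

In this toy the fixed-horizon agent follows through on its plan, while the sliding-window agent keeps deferring, which is the same pattern as planning for step m_k and then replanning with m_(k+1).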

[-]timtyler13y20

AIXI actually has a configurable horizon function. It's described on page 30 of AIXIgentle.

[-]Anja13y50

There is also a more detailed paper by Lattimore and Hutter (2011) on discounting and time consistency that is interesting in that context.

[-]mytyde13y-10

This is a very interesting paper. Reminds me of HIGHLANDER for some reason... those guys lived for thousands of years and weren't even rich? They hadn't usurped control of vast econo-political empires? No hundred-generations-long family of bodyguards?

[-][anonymous]13y00

I think people would get pretty antsy when it became clear that the guy running their town was an immortal. If I were a 13th century peasant with a hankering for revolt and a touch of the plague, I would do terrible, terrible things to someone who was both immortal and rich. Probably best not to get too showy.

[-]Eliezer Yudkowsky13y00

If a human line of descent can't do that, why should an immortal be able to do that?

[-]MugaSofer13y30

Consistency? And, in fairness, human lines of descent have become monarchies, which worked out pretty well for a while.

[-]Anja13y00

This generalizes to the horizon problem: If at time k you only look ahead to time step m_k but have unlimited life span you will make infinitely large mistakes.