zulupineapple - LessWrong

An Orthodox Case Against Utility Functions

Maybe I should just let you tell me what framework you are even using in the first place.

I'm looking at the Savage theory from your own https://plato.stanford.edu/entries/decision-theory/ and I see U(f)=∑u(f(si))P(si), so at least they have no problem with the domains (O and S) being different. Now I see the confusion is that to you Omega=S (and also O=S), but to me Omega=dom(u)=O.

Furthermore, if O={o0,o1}, then I can group the terms into u(o0)P("we're in a state where f evaluates to o0") + u(o1)P("we're in a state where f evaluates to o1"), I'm just moving all of the complexity out of EU and into P, which I assume to work by some magic (e.g. LI), that doesn't involve literally iterating over every possible S.

We can either start with a basic set of "worlds" (eg, ) and define our "propositions" or "events" as sets of worlds <...>

That's just math speak, you can define a lot of things as a lot of other things, but that doesn't mean that the agent is going to be literally iterating over infinite sets of infinite bit strings and evaluating something on each of them.

By the way, I might not see any more replies to this.

An Orthodox Case Against Utility Functions

zulupineapple9moΩ010

A classical probability distribution over with a utility function understood as a random variable can easily be converted to the Jeffrey-Bolker framework, by taking the JB algebra as the sigma-algebra, and V as the expected value of U.

Ok, you're saying that JB is just a set of axioms, and U already satisfies those axioms. And in this construction "event" really is a subset of Omega, and "updates" are just updates of P, right? Then of course U is not more general, I had the impression that JB is a more distinct and specific thing.

Regarding the other direction, my sense is that you will have a very hard time writing down these updates, and when it works, the code will look a lot like one with an utility function. But, again, the example in "Updates Are Computable" isn't detailed enough for me to argue anything. Although now that I look at it, it does look a lot like the U(p)=1-p("never press the button").

events (ie, propositions in the agent's internal language)

I think you should include this explanation of events in the post.

construct 'worlds' as maximal specifications of which propositions are true/false

It remains totally unclear to me why you demand the world to be such a thing.

I'm not sure why you say Omega can be the domain of U but not the entire ontology.

My point is that if U has two output values, then it only needs two possible inputs. Maybe you're saying that if |dom(U)|=2, then there is no point in having |dom(P)|>2, and maybe you're right, but I feel no need to make such claims. Even if the domains are different, they are not unrelated, Omega is still in some way contained in the ontology.

I agree that we can put even more stringent (and realistic) requirements on the computational power of the agent

We could and I think we should. I have no idea why we're talking math, and not writing code for some toy agents in some toy simulation. Math has a tendency to sweep all kinds of infinite and intractable problems under the rug.

An Orthodox Case Against Utility Functions

zulupineapple9moΩ010

Answering out of order:

<...> then I think the Jeffrey-Bolker setup is a reasonable formalization.

Jeffrey is a reasonable formalization, it was never my point to say that it isn't. My point is only that U is also reasonable, and possibly equivalent or more general. That there is no "case against" it. Although, if you find Jeffery more elegant or comfortable, there is nothing wrong with that.

do you believe that any plausible utility function on bit-strings can be re-represented as a computable function (perhaps on some other representation, rather than bit-strings)?

I don't know what "plausible" means, but no, that sounds like a very high bar. I believe that if there is at least one U that produces an intelligent agent, then utility functions are interesting and worth considering. Of course I believe that there are many such "good" functions, but I would not claim that I can describe the set of all of them. At the same time, I don't see why any "good" utility function should be uncomputable.

I think there is a good reason to imagine that the agent structures its ontology around its perceptions. The agent cannot observe whether-the-button-is-ever-pressed; it can only observe, on a given day, whether the button has been pressed on that day. |Omega|=2 is too small to even represent such perceptions.

I agree with the first sentence, however Omega is merely the domain of U, it does not need to be the entire ontology. In this case Omega={"button has been pressed", "button has not been pressed"} and P("button has been pressed" | "I'm pressing the button")~1. Obviously, there is also no problem with extending Omega with the perceptions, all the way up to |Omega|=4, or with adding some clocks.

We could expand the scenario so that every "day" is represented by an n-bit string.

If you want to force the agent to remember the entire history of the world, then you'll run out of storage space before you need to worry about computability. A real agent would have to start forgetting days, or keep some compressed summary of that history. It seems to me that Jeffrey would "update" the daily utilities into total expected utility; in that case, U can do something similar.

I can always "extend" a world with an extra, new fact which I had not previously included. IE, agents never "finish" imagining worlds; more detail can always be added

You defined U at the very beginning, so there is no need to send these new facts to U, it doesn't care. Instead, you are describing a problem with P, and it's a hard problem, but Jeffrey also uses P, so that doesn't solve it.

> ... set our model to be a list of "events" we've observed ...
I didn't understand this part.

If you "evaluate events", then events have some sort of bit representation in the agent, right? I don't clearly see the events in your "Updates Are Computable" example, so I can't say much and I may be confused, but I have a strong feeling that you could define U as a function on those bits, and get the same agent.

This is an interesting alternative, which I have never seen spelled out in axiomatic foundations.

The point would be to set U(p) = p("button has been pressed") and then decide to "press the button" by evaluating U(P conditioned on "I'm pressing the button") * P("I'm pressing the button" | "press the button"), where P is the agent's current belief, and p is a variable of the same type as P.

Superintelligence FAQ

zulupineapple9mo-11

If you actually do want to work on AI risk, but something is preventing you, you can just say "personal reasons", I'm not going to ask for details.

I understand that my style is annoying to some. Unfortunately, I have not observed polite and friendly people getting interesting answers, so I'll have to remain like that.

Superintelligence FAQ

zulupineapple9mo-21

OK, there are many people writing explanations, but if all of them are rehashing the same points from Superintelligence book, then there is not much value in that (and I'm tired of reading the same things over and over). Of course you don't need new arguments or new evidence, but it's still strange if there aren't any.

Anyone who has read this FAQ and others, but isn't a believer yet, will have some specific objections. But I don't think everyone's objections are unique, a better FAQ should be able to cover them, if their refutations exist to begin with.

Also, are you yourself working on AI risk? If not, why not? Is this not the most important problem of our time? Would EY not say that you should work on it? Could it be that you and him actually have wildly different estimates of P(AI doom), despite agreeing on the arguments?

As for Raemon, you're right, I probably misunderstood why he's unhappy with newer explanations.

Superintelligence FAQ

zulupineapple9mo10

Stampy seems pretty shallow, even more so than this FAQ. Is that what you meant by it not filling "this exact niche"?

By the way, I come from AGI safety from first principles, where I found your comment linking to this. Notably, that sequence says "My underlying argument is that agency is not just an emergent property of highly intelligent systems, but rather a set of capabilities which need to be developed during training, and which won’t arise without selection for it." which is reasonable and seems an order of magnitude more conservative than this FAQ, which doesn't really touch the question of agency at all.

Why We Disagree

zulupineapple9mo10

I'm talking specifically about discussions on LW. Of course in reality Alice ignores Bob's comment 90% of the time, and that's a problem in it's own right. It would be ideal if people who have distinct information would choose to exchange that information.

I picked a specific and reasonably grounded topic, "x-risk", or "the probability that we all die in the next 10 years", which is one number, so not hard to compare, unless you want to break it down by cause of death. In contrived philosophical discussions, it can certainly be hard to determine who agrees on what, but I have a hunch that this is the least of the problems in those discussions.

A lot of things have zero practical impact, and that's also a problem in it's own right. It seems to me that we're barely ever having "is working on this problem going to have practical impact?" type of discussions.

Superintelligence FAQ

zulupineapple9mo-30

I want neither. I observe that Raemon cannot find an up to date introduction that he's happy with, and I point out that this is really weird. What I want is an explanation to this bizarre situation.

Is your position that Raemon is blind, and good, convincing explanations are actually abundant? If so, I'd like to see them, it doesn't matter where from.

Superintelligence FAQ

zulupineapple9mo10

"The world is full of adversarial relationships" is pretty much the weakest possible argument and is not going to convince anyone.

Are you saying that MIRI website has convincing introductory explanation of AI risk, the kind that Raemon wishes he had? Surely he would have found them already? If there aren't, then, again, why not?

Superintelligence FAQ

zulupineapple9mo30

If our relationship to them is adversarial, we will lose. But you also need to argue that this relationship will (likely) be adversarial.

Also, I'm not asking you to make the case here, I'm asking why the case is not being made on front page of LW and on every other platform. Would that not help with advocacy and recruitment? No idea what "keeping up with current events" means.

LESSWRONG
LW

Posts

Wiki Contributions

Comments