Winning is Hard

by whpearson · 1 min read · 3rd Apr 2009 · 11 comments

-10

Decision Theory
Personal Blog

Let us say you are playing Steve Omohundro's meal-choosing game,1 but the negatives are a bit harsher and more realistic than just a dodgy soufflé. You are given two choices on the menu: oysters and fugu. Your goals are to avoid death and sickness, and to eat tasty food. You don't know much about either dish, although you do know that shellfish has made you ill in the past, so you give it a lower expected utility (pretend you don't know what fugu is).

Eating the poorly prepared fugu kills you dead every time: do not pass go, do not update the utility values of your options (although the utility would be 0, if you were allowed to update). Eating oysters gives you a utility of 1.

So how do we win in this situation? In a way it is easy: don't eat the fugu! But by what principled fashion should you choose not to eat the fugu? Microeconomics is not enough: with a negative expected utility for shellfish, you would pick the fugu! Nor do you get to update your utilities when you eat the fugu, so your expected utilities can't converge with experience. So we are in a bit of a pickle.
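The trap can be made concrete with a minimal sketch (the numbers are illustrative assumptions, not from the post): a naive expected-utility maximizer whose priors come only from past experience will prefer the unfamiliar lethal dish to the familiar, mildly disliked one.

```python
# Sketch of the dilemma, with made-up numbers: the agent's prior expected
# utilities come only from past experience, so an unfamiliar but lethal
# dish can look better than a familiar dish that has caused mild illness.

def choose(expected_utility):
    """Pick the option with the highest prior expected utility."""
    return max(expected_utility, key=expected_utility.get)

# Priors: shellfish has made us ill before, so oysters get a negative
# prior; fugu is a blank slate and defaults to 0.
prior = {"oysters": -0.5, "fugu": 0.0}

print(choose(prior))  # the naive maximizer picks "fugu" -- and dies,
                      # so it never gets to update the prior
```

The failure is not in the maximization step but in the priors: nothing in the agent's experience assigns any weight to death, and the one observation that would correct this is the one it cannot survive to make.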

Can humans solve these kinds of problems, and if so, how do we do it? The answer is: poorly, in a patchwork fashion, and we get information on fugu-type problems from our genome and culture. For example, we avoid bitter things, are scared of snakes, and are careful when we are up high, because our ancestors had to have these bits of information (and more) to avoid death. They got them by chance, which isn't exactly principled. But all of these are still needed for winning. We can also get the information culturally, but that can leave us open to taboos against harmless things, such as eating pork, which we might be foolish to test ourselves. It is hardly principled either.

So in this kind of scenario it is not sufficient to be economically rational to win: you also have to have a decent source of knowledge. Getting a decent source of knowledge is hard.

1 See the appendix of The Nature of Self-Improving Artificial Intelligence, starting at page 37.


11 comments

It seems that all you're saying is that we need information in order to make good decisions. I really don't think that's a controversial point, or one that needs making here.

In situations where there is not enough information to work from (even given perfect Bayesian updating), of course rationalists can make the wrong decision. But across the spectrum of similar possible decisions (where fugu is replaced with some other dish you've never tried), making rational use of what info you have should result in a positive expectation.

I'm making a few points:

1) We need information not to die horribly.

2) Not all utility functions are equal in determining how effective you are in the real world. If you had a utility function that disliked fugu (by its nature, not because you expected it to lead to death), then you would do better. For example, pebble sorters might all die out because they lack the knowledge that eating is required to keep functioning and thus to sort more pebbles (which is good). Evolved utility functions provide information about the world: we instinctively chew things, and find chewing things that taste nice and fill our bellies good, so we don't die out.

This has implications for what we should expect of AI, and for how powerful we should expect Bayesian updating to be.

As for the last line, I find That Alien Message pretty persuasive. We have plenty of data that we just lack the capacity to process efficiently, either because of bias (e.g. estimating the real-world effectiveness of political policies always turns into a game of selective reporting and analogy) or because of our limited hardware (e.g. protein folding: we have the sequences and the basic interactions, but we can't currently see the patterns in the data even well enough to pawn off the real work to our computers at any level beyond brute force). A Bayesian AI with superior hardware would have more than enough data already to crack these basic problems.

Is protein folding a basic problem? People have suggested it is NP-complete.

With regard to political policies, how will the AGI know which of the political data is good data for political planning? Also, I don't think we have enough data for making policy that improves what really matters. We probably have enough to make policies to improve GDP or somesuch, but enough to make sure that fathers can spend enough time with their kids to provide a decent role model? I'm sceptical.

Is protein folding a basic problem? People have suggested it is NP-complete.

I'm not qualified to judge on protein folding; but it seems extraordinarily likely that among all of the problems that currently appear too difficult for us, some of them can be solved quickly with better Bayesian processing of the mounds of data we currently have.

The great advances we've made, we've made by narrowing down the search space of hypotheses in an intuitive (quasi-Bayesian) manner; but certain things, like computer code, contain patterns our intuition isn't optimized to grok. (That's why programming languages help so much by letting us apply the verbal intuition we already have; but even so, we lose track of what goes where after only a very few steps and recursions.) That very general block to optimization is why I suspect a better Bayesian could vastly outperform the best current intuitive approaches.

I have to echo orthonormal: information, if processed without bias [availability bias, for example], should improve our decisions, and getting information is not always easy. I don't see how this raises any questions about the rational process, or as you say, principled fashion.

"But by what principled fashion should you choose not to eat the fugu?"

This seems like a situation where the simplest expected value calculation would give you the 'right' answer. In this case, the expected value of eating oysters is 1; the expected value of eating the fugu is the expected value of eating an unknown dish, which you'd probably base on your prior experiences with unknown dishes offered in restaurants of that type. [I assume you'd expect lower utility in some places than others.] In this case, that would kill you, but that is not a failure of rationality.
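The calculation this comment describes can be sketched as follows (the probabilities and utilities are invented for illustration): the expected value of the unknown dish is a probability-weighted average over the outcomes you have previously seen with unknown dishes.

```python
# Illustrative numbers only: expected value of an unknown dish computed
# from one's past experience with unknown restaurant dishes.

def expected_value(outcomes):
    """outcomes: list of (probability, utility) pairs summing to probability 1."""
    return sum(p * u for p, u in outcomes)

# Suppose past unknown dishes were: tasty (u = 2.0) 60% of the time,
# mediocre (u = 0.5) 35%, and mildly sickening (u = -3.0) 5%.  Death never
# appeared in the sample, so it gets no weight -- which is exactly the problem.
unknown_dish = [(0.60, 2.0), (0.35, 0.5), (0.05, -3.0)]

print(expected_value(unknown_dish))  # 1.225 > 1, so you'd choose the fugu
```

Because the catastrophic outcome has never been observed, it contributes nothing to the sum, and the calculation can come out above the oysters' value of 1 even though the true value of the fugu is death.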

In a situation without the constraints of the example, research on fugu would obviously provide you with the info you need. A web-enabled phone and Google would provide you with everything you need to know to make the right call.

Humans actually solve this type of problem all the time, though the scales are perhaps less. A driver on a road trip may settle for low-quality food [a fast food chain, representing the oysters] for the higher certainty of his expected value [convenience, uniform quality]. It's simply the best use of available information.

Sorry, I wasn't clear: the expected value of oysters is not 1; that is the value you discover after eating. It is unknown; you haven't had oysters before either. You have had other shellfish, which has been dodgy.

Whether getting killed by fugu is a failure of rationality or not, it is a failure: it is not hitting a small target in the optimization space.

If you want modern examples of these sorts of problems, ones not solvable by a web-enabled phone, consider things like whether we should switch on the LHC, or whether to create AI.

Whpearson: I think I do see some powerful points in your post that aren't getting fully appreciated by the comments so far. It looks to me like you're constructing a situation in which rationality won't help. I think such situations necessarily exist in the realm of platonic possibility. In other words, it appears you provably cannot always win across all possible math structures; that is, I think your observation can be considered one instance of a no-free-lunch theorem.

My advice to you is that No Free Lunch is a fact, and thus you must deal with it. You can't win in all worlds, but maybe you can win in the world you're in (assuming it's not specially designed to thwart your efforts, in which case you're screwed). So just because rationality has limits does not mean you shouldn't still try to be rational. (Though also note that I haven't proven that one should be rational by any of the above.)

Eli addressed the dilemma you're mentioning in Passing the Recursive Buck, and elsewhere on Overcoming Bias.

My point is slightly different from the NFL theorems. They say that for any way of exhaustively searching a problem space, there are problems on which that search will find the optimum last.

I'm trying to say there are problems where exhaustive search is something you don't want to do, e.g. seeing what happens when you stick a knife into your heart, or jumping into a bonfire. These problems clearly exist in real life, whereas for the NFL problems it is harder to make the case that they exist in real life for any specific agent.

Wh: I definitely agree with the point you're making about knives etc., though I think one interpretation of the NFL as applying not just to search but also to optimization makes your observation an instance of one type of NFL. Admittedly, there are some fine-print assumptions, which I think go under the term "almost no free lunch" when discussed.

Can humans solve these kinds of problems, if so how do we do it?

You could run an experiment: dose a bunch of lab animals to determine the LD50, or run Ames tests and that sort of thing. Pork is demonstrably safe; fugu may not be.