Fundamentally Flawed, or Fast and Frugal?

[-]Kaj_Sotala16y110

I don't, incidentally, think that our algorithms are anywhere close to optimal, but I nonetheless felt that the opposing point of view still merits a bit more attention than it has had here so far. They do have a point, even if they're not 100% correct.

[-]Gavin16y120

This could actually act as counterevidence against the claim that AI will surpass humans around the time that the processing speed of computers rivals that of the human brain.

It may be that running a non-jury-rigged rational system against the complexity of the real world requires another order of magnitude or more of processing power.

This brings up the likelihood that initial AIs will need to be jury-rigged, and will have their own set of cognitive biases.

[-]Roko16y80

a perfect Bayesian reasoner is computationally intractable, and our mental algorithms make for an excellent, possibly close to an optimal, use of the limited computational resources we happen to have available

Looking at Sandberg and Bostrom's The Wisdom of Nature: An Evolutionary Heuristic for Human Enhancement, we see that there are several reasons why the human brain's native algorithms are unlikely to be anything close to optimal, even given the limited computational resources we happen to have available inside our skulls:

Changed tradeoffs. Evoluti

... (read more)

6Kaj_Sotala16y

An important question - how changed is the environment, really? Yes, there are plenty of cases where a changed environment is obviously breaking our evolved reasoning algorithms, but I suspect many people might be overstating the difference. At the risk of falling into a purely semantic discussion, this doesn't mean the algorithms wouldn't be optimal. It just makes them optimized for some other purpose than the one we'd prefer.

[-]cousin_it16y100

An important question - how changed is the environment, really?

That's a great discussion to have. I'd say the biggest changes are that a modern person interacts with a lot of other people and receives a lot of symbolic information. Other "major" changes, like increased availability of food or better infant healthcare, look to me minor by comparison. Not sure how to weigh this stuff, though.

3[anonymous]16y

We now also have computers. I suspect the optimal evolved system in a modern environment (efficient and effective) is an idiot savant that can live long enough to spit out the source code for an AI guaranteed to increase the inclusive fitness of the genes of the host. Genetic engineering and sperm/egg donation are other modern inventions I don't think we are all exploiting to increase our fitness optimally.

6zero_call16y

One of the fundamental ways the environment has changed locally must be the level of information that we are now able to process. Namely, since writing was invented, we've been able to consume (I would suppose) far more knowledge from far more sources. But since writing just mimics the speech that we were originally "designed" for, I can't imagine the modern environment is all that different for the built-in algorithms we apply to writing. And similarly for many other "modern" aspects of life. Edit: Interestingly, I suppose books and written information have essentially developed in civilization as a response to the weaknesses of the evolved brain. Thus, many of the deficiencies in our cognitive operations have actually been attacked by civilization. Insofar as the brain was not properly designed, the modern environment has largely been a source of positive, external cognitive optimization/reorganization. One might propose that the environment has actually become far less challenging in modern times; certainly I haven't had to hunt and kill for food anytime in recent memory. Now, I can live far longer, with much less (positive) stress, I can smoke and drink and damage my mind at will, I have the express ability to become morbidly obese and mentally unhealthy, and so on. I can freely read and absorb widely disseminated propaganda from sources like Hitler, in maybe the worst case scenario. Perhaps the environment has been effectively weakening our internal algorithms through this kind of underuse and exploitation, rather than through any incidental non-optimization.

5Vladimir_Nesov16y

Good point. Civilization allows us to use the strengths of our native makeup more efficiently; thus, instead of being "disadjusted" by the changes since the EEA, in many areas we are more at home than we could ever be naturally.

5Roko16y

We have to do far more very-long-term planning than in the EEA; we are protected from starvation by easy job markets and stable food sources like food shops; and we have access to healthcare, both mental and physical. Most prominently, our explicit beliefs matter more for decision theory than for signalling, whereas in the EEA the opposite was true.

5Kaj_Sotala16y

As societies, perhaps. As individuals, probably not. I find it a bit odd that you mention a decreased risk of starvation at the same time as this item; needing to look forward a year or preferably several to make sure you didn't run out of food during the winter (or the winter after that) has been a major factor in the past. Even if you lived in a warm country, it seems like there would have been more long-term dangers than there are now, when we have a variety of safety networks and a much safer society. Existential risks excluded, I'm not sure if this is true.

0Roko16y

Example: deciding to study at school rather than slack off.

0Kaj_Sotala16y

Granted.

0Roko16y

Did hunter gatherers really look forward several winters ahead?

6Kaj_Sotala16y

Hunter-gatherers, possibly not, but we've had agriculture around for 10,000 years. That has been enough time for other selection effects (for instance, the persistent domestication of cattle, and the associated dairying activities, did alter the selective environments of some human populations for sufficient generations to select for genes that today confer greater adult lactose tolerance), so I'd be cautious about putting too much weight on the hunter-gatherer environment.

5Roko16y

Interesting. So in fact those adaptations that could be implemented in just 10,000/20 = 500 generations are probably more skewed towards rationality. We can probably see the difference that those 500 generations made by the differences in life outcomes between those with aboriginal Australian DNA and white European DNA.

0MichaelVassar16y

Why be needlessly inflammatory?

1Larks16y

It provides a test for the theory?

1Roko16y

hmmm well I was actually considering the point purely from an academic POV - it occurred to me that the aboriginals were a near-perfect example. But now that you point it out, I guess that comment could be construed as "in bad taste" or "racist" or something.

0Nick_Tarleton16y

Cultural differences are hard to factor out, too.

3ChristianKl16y

The fact that human reasoning isn't optimal implies in no way that the intelligently designed algorithm of Bayesian reasoning is better.

1[anonymous]16y

If you mean optimal as in "maximizing accuracy given the processing power", then yes. But in terms of "maximizing accuracy given the data", then Bayesian reasoning is optimal from the definition of conditional probability.

0ChristianKl16y

Maximizing accuracy given available processing power and available data is the core problem when it comes to finding a good decision theory. We don't ask what decision theory God should use but what decision theory humans should use. Both go and chess are NP-hard and can't be fully processed even if you have a computer built from all the atoms in the universe.

2[anonymous]16y

You're confusing optimality in terms of results and efficiency in terms of computing power with your use of "NP-hard". Something like the travelling salesman problem is NP-hard in that there's no known way to solve it beyond a certain efficiency in terms of computing power (doing optimally on it in terms of results is easy). It doesn't apply to chess or go in that there is no known way to get optimal results no matter how much computing power you have. These are two completely different things.

2JohannesDahlstrom16y

Surely there is a known way to play chess and go optimally (in the sense of always either winning or forcing a draw). You just search through the entire game tree, instead of a sub-tree, using the standard minimax algorithm to choose the best move each turn. This is obviously completely computationally infeasible, but possible in principle. See Solved game.
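As a rough illustration of the full-tree minimax search described above, here is a minimal sketch on tic-tac-toe, a game small enough to search exhaustively. The board encoding (a 9-character string) and the function names are assumptions made for this example; the same idea applied to chess or go is what makes the search infeasible in practice.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board indexed 0..8 (row-major).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Game value from X's point of view: +1 X wins, 0 draw, -1 O wins.

    Searches the entire game tree below `board`, with `player` to move.
    """
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if "." not in board:
        return 0  # board full, nobody won
    nxt = "O" if player == "X" else "X"
    values = [minimax(board[:i] + player + board[i + 1:], nxt)
              for i, cell in enumerate(board) if cell == "."]
    # X picks the maximum value, O picks the minimum.
    return max(values) if player == "X" else min(values)


def best_move(board, player):
    """Return the index of an optimal move for `player`."""
    nxt = "O" if player == "X" else "X"
    moves = [i for i, cell in enumerate(board) if cell == "."]

    def score(i):
        return minimax(board[:i] + player + board[i + 1:], nxt)

    return max(moves, key=score) if player == "X" else min(moves, key=score)


if __name__ == "__main__":
    print(minimax("." * 9, "X"))        # value of the empty board
    print(best_move("XX.OO....", "X"))  # X completes the top row
```

The memoization via `lru_cache` is only a convenience; it does not change the fact that the number of positions to visit grows far beyond reach for chess or go.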

0Roko16y

Correct. It would be extraordinary if the algorithm that is optimal given infinite computational resource is also optimal given limited resource. I suspect that by framing this as a battle between Bayesian inference and actual evolved human algorithms, we are missing the third alternative: algorithm X, which is the optimal algorithm for decision-making given the resources and options that we have in the society that we find ourselves in.

[-]Madbadger16y50

It is worth remembering that human computation is a limited resource - we just don't have the ability to subject everything to Bayesian analysis. So, save our best rationality for what's important, and use heuristics to decide what kind of chips to buy at the grocery store.

7CronoDAS16y

I decided what college to go to by rolling a die. ;)

4billswift16y

A random choice has long been considered a good tool to prevent dithering when you have equivalently valued alternatives.

1Madbadger16y

Yeah, sometimes you don't get the tools and information you need to make the best decision until after you've made it. 8-)

3CronoDAS16y

I wasn't disappointed in my choice of college, but I was disappointed in my choice of major. (I followed my father's advice, and, in this case, although his advice sounded reasonable, it turned out to be just plain wrong.)

[-]Roko16y40

It would be extraordinary if the algorithm that is optimal given infinite computational resource is also optimal given limited resource.

I suspect that by framing this as a battle between Bayesian inference and actual evolved human algorithms, we are missing the third alternative: algorithm X, which is the optimal algorithm for decision-making given the resources and options that we have in the society that we find ourselves in.

-1quanticle16y

Well, it may be that this ideal algorithm you're looking for is NP-hard, and thus cannot ever be executed in a short amount of time over a non-trivial problem space. Have you considered the possibility that this bounded rationality model is algorithm X?

1Cyan16y

Computing time is a resource, so "optimal algorithm for decision-making given the resources... we have" rules out impractical algorithms.

[-]CronoDAS16y40

Incidentally, if I don't have a good answer to a "guessing" problem immediately, I find it faster to just Google the relevant facts than to try to struggle to find a distinction between them that I can latch onto.

As for Hamburg vs. Cologne, my recognition heuristic is more familiar with Hamburg as a city than Cologne as a city (I know Hamburg is in Germany, I suspect that Cologne is in France). On the other hand, I know that I recognize Hamburg because I often eat hamburgers, which doesn't seem like it says much about the city. Nevertheless, if ... (read more)

0Jawaka16y

The German name for Cologne is Köln.

[-]Roko16y40

The demonstration that a fast and frugal satisficing algorithm won the competition defeats the widespread view that only “rational” algorithms can be accurate.

I am suspicious of work that attempts to provide evidence for a counterintuitive result in a way that could fairly obviously have been rigged. In this case, the key question is how "generic" their competition really was. It might be more convincing if arguments could be made about a plausible "real-world" distribution of problem instances, then a set of sample competitions drawn from that distribution and various decision algorithms run on those instances.

1billswift16y

There is a lot more work on this point, not all of it focused on the point. What else, for example, would you call Robert Axelrod's "Tit-for-Tat" than a "fast and frugal satisficing algorithm"? In fact, there has been enough other work on this and related points that I would not refer to it as a counter-intuitive result.

3Nick_Tarleton16y

Tit-for-tat doesn't win because it's computationally efficient.

1Technologos16y

Now that would be a cool extension of Axelrod's test: include a penalty per round or per pairing as a function of algorithm length.
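One way the proposed extension might look, as a hedged sketch rather than anything Axelrod actually ran: an iterated prisoner's dilemma with the usual payoffs (3 for mutual cooperation, 1 for mutual defection, 5 and 0 for defecting against a cooperator), plus a per-round cost proportional to strategy size. The `size` function, which counts compiled bytecode bytes, and the `cost_per_byte` parameter are assumptions for illustration; bytecode length is a crude stand-in for source length and is not Kolmogorov complexity.

```python
# Standard prisoner's dilemma payoffs: (row player's payoff, column player's payoff).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(own_history, other_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not other_history else other_history[-1]


def always_defect(own_history, other_history):
    """Defect unconditionally."""
    return "D"


def play(s1, s2, rounds):
    """Play `rounds` rounds and return the two raw total scores."""
    h1, h2 = [], []
    t1 = t2 = 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        t1 += p1
        t2 += p2
        h1.append(m1)
        h2.append(m2)
    return t1, t2


def size(strategy):
    """Hypothetical complexity measure: bytes of compiled bytecode."""
    return len(strategy.__code__.co_code)


def penalized(score, strategy, rounds, cost_per_byte=0.01):
    """Subtract a per-round cost proportional to the strategy's size."""
    return score - cost_per_byte * size(strategy) * rounds


if __name__ == "__main__":
    raw_tft, raw_ad = play(tit_for_tat, always_defect, 10)
    print(raw_tft, raw_ad)
    print(penalized(raw_tft, tit_for_tat, 10), penalized(raw_ad, always_defect, 10))
```

Charging per pairing instead of per round would only change the `rounds` factor in `penalized`; a running-time penalty would need a different measure altogether.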

1[anonymous]16y

Length of source code, or running time? (How the heck did English end up with the same word for measuring time and a 3D axis?)

1Technologos16y

I had been thinking source code length, such that it would correspond to Kolmogorov complexity. Both would actually work, testing different things. And perhaps the English question makes more sense if we consider things with a fourth time dimension ;)

1Kaj_Sotala16y

Later work seems to support the notion of fast and frugal algorithms performing evenly with more complicated ones. See e.g. Fast and Frugal Heuristics: The Tools of Bounded Rationality for references to later experiments. (It's unfortunately too late here for me to write a proper summary of it, especially since I haven't read the referenced later studies.)

0Roko16y

ok, sure. So there has been some thought put in beyond "here's this one challenge that we rigged to make the fast and frugal algorithm look good!"

[-]Daniel_Burfoot16y30

The demonstration that a fast and frugal satisficing algorithm won the competition defeats the widespread view that only “rational” algorithms can be accurate.

While this demonstration is interesting in some sense, it's pretty obvious that for any algorithm one can find an example problem at which the algorithm excels. Does the paper state how many example problems were tried?

2Kaj_Sotala16y

Only this problem. Later research has studied the algorithm's performance for other problems, though.

[-]Vladimir_Nesov16y30

(Not directly related, but may be interesting to someone.)

In a certain technical sense, "satisficing" is formally equivalent to expected utility maximization. Specifically, consider an interval on the real line (e.g. the amount of money that could be made), and a continuous and monotonic utility function on that interval. Expected utility maximization for that utility function u (i.e. the choice of a random variable X with codomain in the amounts of money) is then equivalent to maximization of probability Pr(X>V), where V is a random variable ... (read more)

0Technologos16y

That utility function would have a very interesting second derivative, though... Also, the example appears to depend on simultaneous consideration of the options; with sequential consideration, might not a small standard deviation for V induce a situation where many options will have high Pr(X>V) and only EU maximization would support rejection of early options?

0Vladimir_Nesov16y

Could you say that more explicitly? What sequential consideration? Where do Pr(X>V) and EU(X) disagree, given that they are equal?

0Technologos16y

To be clear, I am not disagreeing with your analysis of the model you presented; I am arguing that satisficing and EU maximization are not equivalent in general, but rather only when certain conditions are satisfied. Imagine, for instance, that there was no uncertainty in V; then two distributions of X could both have Pr(X>V) = 1 with different EUs. I was thinking of sequential consideration as essentially introducing uncertainty about the set of possible X distributions, but on reflection it's clear that this would be inadequate by itself to change C&L's result. The above modification--or any variant where satisficing includes a threshold requirement for Pr(X>V) rather than trying to maximize that quantity--would have to be integrated to make sequential consideration matter. Finally, if V depends only on money, rather than utility, then having a utility function with positive second derivative could make EU maximizers pick an X distribution with higher mean and standard deviation than satisficers might.

0Vladimir_Nesov16y

I can't make sense of this statement.

0Technologos16y

The way you set up the model, V was a threshold of utility. Thus, anything that increased one's expected utility also increased one's expected probability of being above that threshold. If, however, V was a threshold of money (distributed, say, as N($100,$10)), then look at these two X-distributions, given a utility function U(x) = x (the case of a function with positive second derivative just makes the following more extreme):

1) 100% probability of $200
2) 90% probability of $100 and 10% probability of $2100

Expected utilities:

1) 200
2) 300

Probabilities of meeting threshold:

1) 1 - the probability of being 10 standard deviations above the mean, or "very damn close to 1"
2) 90% × 50% + 10% × "even closer to 1 than the above" = 55%

So EU-maxers will take the latter choice, where satisficers will take the former. Note that if U(x) = x^2, then the disparity is even stronger.
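The arithmetic in this example can be checked with a short script. This is only a sketch under the comment's stated assumptions: V is normally distributed with mean $100 and standard deviation $10, and U(x) = x. The function names and the representation of a lottery as a list of (probability, payoff) pairs are choices made for illustration.

```python
import math


def normal_cdf(x, mean, sd):
    """Pr(V < x) for V ~ Normal(mean, sd)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def expected_utility(lottery, utility):
    """Expected utility of a lottery given as [(probability, payoff), ...]."""
    return sum(p * utility(x) for p, x in lottery)


def prob_above_threshold(lottery, mean, sd):
    """Pr(X > V) when the threshold V ~ Normal(mean, sd) is independent of X."""
    return sum(p * normal_cdf(x, mean, sd) for p, x in lottery)


# The two X-distributions from the comment.
sure_thing = [(1.0, 200)]
gamble = [(0.9, 100), (0.1, 2100)]


def linear(x):
    """U(x) = x."""
    return x


if __name__ == "__main__":
    for name, lottery in [("sure thing", sure_thing), ("gamble", gamble)]:
        print(name,
              expected_utility(lottery, linear),
              prob_above_threshold(lottery, 100, 10))
```

With these inputs the gamble has the higher expected utility and the sure thing the higher probability of clearing the threshold, which is the disagreement the comment describes.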

0Vladimir_Nesov16y

I believe you are confused, but I can't pinpoint at first glance in what way exactly. V is not a "threshold of utility"; it is a random variable of the same kind as X. I don't see what you mean by setting V to be normally distributed and U(x)=x, given that by construction they determine each other by the rule Pr(x>V)=u(x). If you redefine the concepts, you should do so more clearly.

0Technologos16y

If the decision agent is trying to maximize the probability of its utility being greater than a draw from the random variable V (where V is specified in utility) then it is trying to maximize the probability of being above some (yet-unknown) threshold value, no? The departure from your model that I was clarifying (unsuccessfully) in the last comment was for V to be a random variable not of utility but of money, distributed normally in this example. U(x) = x is the utility function for the EU-maxing agent, because when V is specified in money, the satisficing agent no longer needs to worry about utility. The rule you gave is only true when the satisficer defines the threshold level in terms of utility.

2Vladimir_Nesov16y

No luck. You should write everything in math, specifying types (domains/codomains) of all functions/random variables. It'll really be easier, and the confusion (mine or yours) will be instantly resolved.

0Technologos16y

Also, thanks for posting the original comment--it's actually useful to some research I'm doing, now that I actually understand it!

0Technologos16y

Ah, I think I found it. I took V to have a codomain in utilons in your example (that was my interpretation of "V is a random variable that depends only on u"). Reinterpreting the subsequent comments in that context, I can see that I was responding to "formally equivalent" in the original comment as if it meant "expected utility maximization of the traditional sort, where each outcome x is itself assigned a value by a function on x that does not involve V, will produce the same decisions as satisficing of the type described under these conditions." Interestingly, the latter may be true if V did have a codomain in utilons (or at least, I was unable to come up with a consistent counterexample).

[-]vinayak16y10

I think one thing that evolution could have easily done with our existing hardware is to at least allow us to use rational algorithms whenever it's not intractable to do so. This would have easily eliminated things such as Akrasia, where our rational thoughts do give a solution, but our instincts do not allow us to use them.

2wedrifid16y

It tried that with your great^x uncle. But he actually spent his time doing the things he said he wanted to do instead of what was best for him in his circumstances and had enough willpower to not cheat on his mate with the girls who were giving him looks.

[-][anonymous]16y10

Heh, this reminds me of something I saw a while ago. http://plover.net/~bonds/shibboleths.html

[-]Madbadger16y10

Here is an example of an amusing "Fast and Frugal" heuristic for evaluating claims with a lot of missing knowledge and required computation: http://xkcd.com/678/

[-]zero_call16y10

Outstanding post and clearly written. I'd like to see more posts of this nature on here. The results definitely seem to make sense, and seem pleasing to my intuition, but I feel kind of skeptical about such a simplified account of the cognitive process. I suppose you have to start somewhere though, and I'm not really at all familiar with this kind of science.

From personal experience, encountering a lot of excellent mathematicians in University, I have often felt that some of the best mathematicians are people who simply have the best computational resource... (read more)

0Kaj_Sotala16y

We had a university course on heuristics and biases a while back. This article was among the required reading. (Unfortunately I couldn't make the time to complete that course back then, so I'm only now reading the articles.)

[-]PhilGoetz16y10

This point is important if one is constructing a theory about how future AIs will think, and assumes that they will reach Aumann agreement because they are Bayesians.

[-]CronoDAS16y10

The "recognition heuristic" tends to work surprisingly well for stock picking, or so I've heard.

Find a bunch of "ordinary" people who have no special knowledge of stock picking, give them a list of companies, and ask them to say which ones they've heard of. Stocks of companies people have heard of tend to do better than stocks that people haven't heard of.

4Alicorn16y

My guess is that this would stop working if you did it with people who do have special stock related knowledge. The recognition heuristic is most effective when you've heard of comparatively few things. For instance, on the "which city is bigger" test, Germans were shown to do better for American cities than Americans did, and vice-versa, because Americans are more likely than Germans to have heard of small American cities and vice-versa.
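The recognition heuristic discussed here can be written down in a few lines. The following is a minimal sketch, with a hypothetical function name and a plain set standing in for "things the person has heard of"; it also shows why the heuristic says nothing once both or neither option is recognized.

```python
def recognition_choice(option_a, option_b, recognized):
    """Pick the recognized option if exactly one of the two is recognized.

    Returns None when the heuristic cannot discriminate, i.e. when both
    options or neither option is in `recognized`.
    """
    known_a = option_a in recognized
    known_b = option_b in recognized
    if known_a and not known_b:
        return option_a
    if known_b and not known_a:
        return option_b
    return None  # fall back to some other cue or a guess


if __name__ == "__main__":
    # Someone who has heard of only one of the two cities.
    print(recognition_choice("Hamburg", "Cologne", {"Hamburg"}))
    # Someone who has heard of both gets no guidance from recognition alone.
    print(recognition_choice("Hamburg", "Cologne", {"Hamburg", "Cologne"}))
```

The more items a person recognizes, the more often the function returns None, which matches the point that the heuristic is most useful to people who have heard of comparatively few things.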

2Bo10201016y

Note also that "tend to do better than" does not mean "tend to outperform the market as a whole," an important point.

2magfrump16y

For any well-defined sense of "tend to do better than" it has to; otherwise it isn't tending to do better (since any stock someone has heard of is "tending to do better than" the set of stocks people haven't heard of). Unless the statement was intended to be "stocks of companies people have heard of tend to do better than stocks of SIMILAR COMPANIES people haven't heard of."

1CronoDAS16y

Indeed.

0ABranco16y

You're right. It works better if the group interviewed is composed of neither experts nor completely isolated news-averse schizoids.

[-][anonymous]13y00

There is a very clear cluster of people working in cognitive science with Bayesian and machine learning savvy, centered around Tenenbaum, Griffiths, Kemp, Goodman, Chater, Oaksford, Perfors, Steyvers, et cetera. They often coauthor papers and have something of a unified perspective on The Way to do things (more unified and more coauthory even restricting the field to other Bayesian and machine learning savvy folk, like Hinton, Gigerenzer, Friston, MD Lee). It seems like they should have a name. Tengrikemgoochoakpersteyvetcet perhaps? But then, perhaps not.

A... (read more)

[This comment is no longer endorsed by its author]

[-][anonymous]14y00

What if the question required picking the smaller city? Then, if you've only heard of one, it would seem you should pick the unknown city, as you are more likely to know of larger than smaller cities. Doesn't the take-the-best algorithm, by specifying taking the one you know as a general fast-and-frugal tactic, lead you astray? Do you know whether subjects still choose the known city?

[-][anonymous]16y00

Just a question for MWI advocates.

If this world W1 has a parallel world W2, which in turn has a parallel world W3 that W1 lacks (this being the very difference between W1 and W2), is W3 a second-order parallel to us?

[-]ChristianKl16y00

There's no person who plays chess at a good level while employing Bayesian reasoning.

In Go Bayesian reasoning performs even worse. A good Go player makes some of his moves simply because he appreciates their beauty, without having "rational" reasons for them. Our brain is capable of doing very complex pattern matching that allows the best humans to be better at a large variety of tasks than computers which use rule-based algorithms.

2MichaelVassar16y

In chess or go idealized Bayesians just make the right move because they are logically omniscient.

3wedrifid16y

Logical omniscience comes close to the perfect move but understanding the imperfections of the opponent can alter what the ideal move is slightly. This requires prior information that can not be derived logically (from the rules of the game).

2Nick Hay16y

Idealized Bayesians don't have to be logically omniscient -- they can have a prior which assigns probability to logically impossible worlds.

-2ChristianKl16y

If you argue that Bayesianism is only a good way to reason when you are omniscient and a bad idea for people who aren't omniscient I can agree with your argument. If you are however omniscient you don't need much decision theory anyway.

4[anonymous]16y

There's a bit of a difference between logical omniscience and vanilla omniscience: with logical omniscience, you can perfectly work out all the implications of all of the evidence you find, and with the other sort, you get to look at a printout of the universe's state.

-2ChristianKl16y

But you don't have any of those in the real world and therefore they shouldn't factor into a discussion about effective decision making strategies.

0[anonymous]16y

You'll never find perfect equality in the real world, so let's abandon math.

-3ChristianKl16y

You will never find evidence for the existence of God, so let's abandon religion...

1Richard_Kennaway16y

Yes! Already did!

-1ChristianKl16y

Where's the difference between believing in nonexistent logical omniscience and believing in nonexistent Gods?

0[anonymous]16y

I'd imagine Deep Blue is more approximately Bayesian than a human (search trees vs. giant crazy neural net).

2Nick_Tarleton16y

I think you mean "cleanly constructed" or something like that. Minimax search doesn't deal with uncertainty at all, whereas good human chess players presumably do so, causally model their opponents, and the like.

[-]brazil8416y-10

It seems to me that the problems with human rationality really start to come out when our sense of self is somehow on the line.

It's one thing to guess at which of two foreign cities is bigger. It's another to guess at which child is smarter -- our own child or someone else's.

So perhaps we as humans have hardware and software which is pretty good, except that we sometimes use our brainpower to fool ourselves.
