I have a small theory which strongly implies that getting less biased is likely to make "winning" more difficult.

Imagine some sort of evolving agents that follow vaguely Bayesianish logic. They don't have infinite resources, so they use a lot of heuristics rather than applying Bayes' rule directly with priors based on Kolmogorov complexity. Still, they employ a procedure A to estimate what the world is like based on the data available, and a procedure D to make decisions based on their estimations, both of a vaguely Bayesian kind.

Let's be kind to our agents and grant that for every possible piece of data and every possible decision they might have encountered in their ancestral environment, they make exactly the same decision as an ideal Bayesian agent would. A and D have been fine-tuned to work perfectly together.

That doesn't mean that either A or D is perfect, even within this limited domain. Evolution wouldn't care about that at all. Perhaps different biases within A cancel each other. For example, an agent might overestimate snakes' dangerousness and also overestimate his snake-dodging skills, resulting in exactly the right amount of fear of snakes.

Or perhaps a bias in A cancels another bias in D. For example, an agent might overestimate his chance of success at influencing tribal policy, which neatly cancels his unreasonably high threshold for trying to do so.

And then our agents left their ancestral environment, and found out that for some of the new situations their decisions aren't that great. They thought about it a lot, noticed how biased they are, and started a website on which they teach each other how to make their A more like a perfect Bayesian's A. They even got quite good at it.

Unfortunately they have no way of changing their D. So biases in their decisions which used to neatly counteract biases in their estimation of the world now make them commit a lot of mistakes even in situations where naive agents do perfectly well.

The problem is that for virtually every A and D pair that could have possibly evolved, no matter how good the pair is together, neither A nor D would be perfect in isolation. In all likelihood both A and D are ridiculously wrong, just in a special way that never hurts. Improving one without improving the other, or improving just part of either A or D, will lead to much worse decisions, even if your idea of what the world is like gets better.
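As a toy illustration of the tribal-policy example, here is a minimal sketch in Python. Everything in it is a made-up assumption for illustration: the multiplicative form of the bias in A, the specific thresholds in D, and the list of situations. It only shows how a biased A and a biased D can reproduce the ideal decisions, and how fixing A alone breaks that.

```python
def estimate(p_true, a_bias):
    """Procedure A: estimated chance of success, distorted by a multiplicative bias."""
    return min(1.0, p_true * a_bias)


def decide(p_estimated, threshold):
    """Procedure D: act only if the estimate clears the threshold."""
    return p_estimated > threshold


IDEAL_THRESHOLD = 0.2   # hypothetical threshold an ideal agent would use
BIASED_THRESHOLD = 0.4  # hypothetical evolved threshold, unreasonably high


def ideal_agent(p_true):
    # Unbiased A, unbiased D.
    return decide(estimate(p_true, 1.0), IDEAL_THRESHOLD)


def evolved_agent(p_true):
    # A overestimates success twofold; D demands twice the evidence.
    return decide(estimate(p_true, 2.0), BIASED_THRESHOLD)


def debiased_a_agent(p_true):
    # A has been corrected, but D still carries its old threshold.
    return decide(estimate(p_true, 1.0), BIASED_THRESHOLD)


# Hypothetical true success probabilities for a handful of situations.
situations = [0.05, 0.1, 0.15, 0.25, 0.3, 0.45]

evolved_errors = sum(evolved_agent(p) != ideal_agent(p) for p in situations)
debiased_errors = sum(debiased_a_agent(p) != ideal_agent(p) for p in situations)

print(evolved_errors, debiased_errors)  # prints "0 2"
```

In this sketch the evolved agent never disagrees with the ideal one, while the agent with the corrected A passes up the two opportunities whose true chance of success lies between the two thresholds.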

I think humans might be a lot like that. As an artifact of evolution we make incorrect guesses about the world, and choices that would be incorrect given our guesses - just in a way that worked really well in the ancestral environment, and works well enough most of the time even now. Depressive realism is a special case of this effect, but the problem is much more general.


The way to mitigate this risk is to focus your rationalist energies on the areas where you are consistently losing.

While collateral damage is always a danger, if you're already losing you know that you're not upsetting a finely tuned balance.

But if you look at the heuristics and biases literature, there's plenty of material on decision biases and how to correct them.

And the other side of it is that A+D is not adding up to Bayesian. Sometimes it sorta works, sometimes it adds up to derangement, and of course it adds up to derangement much more often outside the ancestral environment. I don't think we really have the option of leaving both sides alone. You've just got to try and fix both sides simultaneously.

Upvoted because you touched a topic I wanted to write about, now I don't have to. :-)

Relevant links: The (Bayesian) Advantage of Youth by Clay Shirky, Generalist and specialist species on Wikipedia, efficiency-flexibility tradeoff.

On the other hand, what should we do instead of improving ourselves? Stay as we are? Self-improve in some non-Bayesian manner?

The first response to a trade-off is to optimize it; it's unlikely that each of us is at the exact right tradeoff between efficiency/flexibility. But we're trying to do that here, I think, and the returns seem minimal.

The second response should be to 'make the pie higher'. (To put this into programmer terms, once you've gotten the algorithm as good as it's going to get, you look into what micro-optimizations are available and how to throw more resources at the problem.) Optimizing the tradeoff becomes less important if we have more learning or thinking capacity. Here I think LW-types aren't doing so well. Are there really so few worthwhile cognitive prostheses that we should have had as few posts on them as we have had?

(I know of 0, but perhaps I'm wrong - perhaps someone has written a good post about spaced repetition systems, say, or has calculated out how worthwhile drugs like modafinil are - and I simply missed or've forgotten about them.)

Speaking for myself, I do my best to include this worry in my methodology for improving my decisionmaking.

When I notice things I instinctively want, or things many others seem to want, which differ from my reflective purposes, I spend some effort trying to figure out why. Many times, I'm convinced that the instinctive A and D are just optimized for a different environment and set of goals than I currently expect to have. Other times, I do in fact notice that there are benefits to the instinctive or default reaction, and I incorporate that into my improved A and D.

All of these corrections are faulty and have failure modes related to incompleteness (esp. loss over time) and incorrectness (over- or under-correction based on incomplete knowledge or wrong calculation/heuristic). But they do happen.

I think my main disagreement is with "Unfortunately they have no way of changing their D." I don't think A and D are anywhere near as separate as you seem to. They seem deeply correlated to me, to the point that there may be no real distinction between them. And even if there is a distinction, similar mechanisms of change (study, planning, practice) are available to both.

"As an artifact of evolution we make incorrect guesses about the world, and choices that would be incorrect given our guesses - just in a way that worked really well in the ancestral environment, and works well enough most of the time even now."

Yes, well, very well indeed, more than you probably suspect, once you average over all situations. For example, look at how much information the brain is able to squeeze out of the light hitting the retina. It's able to use the cues of shadows, color, light gradients, etc., to construct a coherent, accurate picture of the world around you.

Its only "holes" (badly wrong priors) are those cases we know as "optical illusions". And as far as retinal-interpretation algorithms go, it's pretty amazing that it only returns the wrong results in these very rare, unnatural cases.

Ditto for proprioception (sensation of your body parts' orientation). You can immediately, intuitively infer where your hand is just from muscle sensations.

Or language: once you learn a language, you have a very accurate predictive model that allows you to quickly disambiguate similar sounds based on what kinds of words are likely to come next, so you never hear "touch the blue block" as "Dutch the blue blog", to use an example from a paper someone linked here once.

I confess to some surprise that this post has 18 upvotes while taw's next post has only 6 (as of right now). In my personal ranking, those numbers should be switched: the other post has a raft of excellent data, whereas this post is basically unsupported theorizing, however intriguing. (Not to say that support doesn't exist somewhere, just not in the post where it could do some good.)

Humans can effectively change the decision theory they use. They can educate themselves, modify themselves, and they can pre-process their sensory inputs and post-process their motor outputs using machines and computers. The process is known as intelligence augmentation. So: in conclusion, the subject line is misleading.

"and a procedure D to make decisions based on their estimations"

So D is a utility function, or something like a utility function? But then how can D be "wrong"?

By being internally inconsistent, and only saved by your mistakes in A.

For example, it can be argued that a proper D should treat the risk of dying in all possible ways the same way. If a person's D considers dying of a shark attack worse than dying of infection (given a similar level of suffering etc.), and their A has a completely wrong idea of how likely shark attacks and infections are, they might take precautions about sharks and infections that are exactly correct. If they find out what an accurate A looks like, and start using it, their decisions suddenly become inconsistent.
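To make the shark example concrete, here is a minimal sketch in Python. All the numbers are hypothetical, as is the assumption that the biased A underestimates shark attacks tenfold while the biased D treats a shark death as ten times worse, and that precaution effort is split in proportion to probability times badness.

```python
from fractions import Fraction

# Hypothetical true probabilities of dying from each cause.
TRUE_P = {"shark": Fraction(1, 1_000_000), "infection": Fraction(1, 1_000)}

# Assumed biased A: underestimates shark attacks tenfold.
BIASED_P = {"shark": Fraction(1, 10_000_000), "infection": Fraction(1, 1_000)}

# A consistent D weights every death equally;
# the assumed biased D treats a shark death as ten times worse.
CONSISTENT_W = {"shark": 1, "infection": 1}
BIASED_W = {"shark": 10, "infection": 1}


def precaution_shares(probabilities, weights):
    """Split precaution effort in proportion to probability * badness."""
    expected = {k: probabilities[k] * weights[k] for k in probabilities}
    total = sum(expected.values())
    return {k: v / total for k, v in expected.items()}


ideal = precaution_shares(TRUE_P, CONSISTENT_W)    # accurate A, consistent D
evolved = precaution_shares(BIASED_P, BIASED_W)    # biased A, biased D
corrected_a = precaution_shares(TRUE_P, BIASED_W)  # accurate A, biased D

print(evolved == ideal)      # prints "True"
print(ideal["shark"])        # prints "1/1001"
print(corrected_a["shark"])  # prints "1/101"
```

Under these assumptions the two biases cancel exactly, and correcting A alone shifts roughly ten times too much effort toward sharks.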

Of course you can argue from a fundamentalist position that a utility function is "never wrong", but if you can be trivially Dutch booked, or have ridiculously inconsistent preferences between essentially equivalent outcomes (like dying), then it's "wrong" as far as I'm concerned.

"For example, it can be argued that a proper D should treat the risk of dying in all possible ways the same way"

It is not logically inconsistent to prefer dying in one way over another.

If you buy into fundamentalist interpretations of utility functions then it's not. If you don't, then it is - to me there should be some difference in something "meaningful" for there to be a difference in preferences, otherwise it's not a good utility function.

Even with a fundamentalist interpretation you get known inconsistencies with probabilities, so it doesn't save you.

I think that the strongest critique of D is that most people choose things that they later honestly claim were not "what they actually wanted", i.e. D acts something like a stable utility function D_u with a time- and mood-dependent error term D_error added to it. It causes many people much suffering that their own actions don't live up to the standards of what they consider to be their true goals.

Probabilistic inconsistencies in action are probably less of a problem for humans, though not completely absent.

Even more to the point, imagine D to be split into two parts, a utility function and a goal-seeking function. Then even if the utility function is never "wrong," per se, the goal-seeking function could suboptimally use A to pursue the goals. Our D-functions routinely make poor decisions of the second sort, e.g. akrasia.