Curt_Welch — LessWrong

If the brain were naturally a universal learner, then surely we wouldn't have to learn universal learning (e.g. we wouldn't have to learn to overcome cognitive biases, Bayesian reasoning wouldn't be a recent discovery, etc.)? The system seems too gappy and glitchy, too full of quick judgement and prejudice, to have been designed as a universal learner from the ground up.

You are conflating the ideas of universal learning and rational thinking. They are not the same thing.

I'm a strong believer in the idea that human intelligence emerges from a strong general-purpose reinforcement learning (RL) algorithm. If that's true, then it's entirely consistent with our problems of cognitive bias.

If the RL idea is correct, then thinking is best understood as a learned behavior, just as the words we speak with our lips are a learned behavior, and just as how we move our arms and legs is a learned behavior. Under the principle that we are an RL learning machine, what we learn is ANY behavior that helps us maximize our reward signal.

We don't learn rational behavior; we learn whatever behavior the learning system has rationally computed is needed to produce the most rewards. And in this case, our prime rewards are just those things which give us pleasure and which reduce pain.

If we live in an environment that gives us rewards when we say "I believe God is real, and the Bible is the book of God, and the Earth is 10,000 years old", then we will say those words. We will do ANYTHING that works to maximize rewards in our environment. We will not only say them, we will believe them in our core. If we are conditioned by our environment to believe these things, that is what we will believe.

If we live in an environment that trains us to look at the data and draw conclusions based on what the data tells us (to follow the behavior of a rational scientist), then we will act that way instead.

A universal learner can learn to act in any way it needs to in order to maximize rewards.
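To make that concrete, here is a minimal sketch in Python of the same simple learner dropped into two different environments. It is a toy two-action bandit and not a model of the brain; the names (`train`, `dogma_env`, `data_env`, `ACTIONS`) and the parameter values are assumptions chosen purely for illustration.

```python
import random

# Hypothetical action labels, invented for this illustration.
ACTIONS = ["assert_dogma", "follow_data"]

def train(reward_fn, actions, episodes=2000, epsilon=0.1, lr=0.1, seed=0):
    """Epsilon-greedy value learner: returns the action it ends up preferring."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in actions}
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)           # occasionally explore
        else:
            action = max(actions, key=value.get)   # otherwise exploit
        reward = reward_fn(action)
        # Nudge the estimate for this action toward the reward just received.
        value[action] += lr * (reward - value[action])
    return max(actions, key=value.get)

def dogma_env(action):
    # Assumed environment that rewards asserting the approved belief.
    return 1.0 if action == "assert_dogma" else 0.0

def data_env(action):
    # Assumed environment that rewards following the evidence.
    return 1.0 if action == "follow_data" else 0.0

learned_in_dogma_env = train(dogma_env, ACTIONS)
learned_in_data_env = train(data_env, ACTIONS)
print(learned_in_dogma_env, learned_in_data_env)
```

The learner is identical in both runs. Only the reward function differs, and that alone decides which behavior it settles on.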

That's what our cognitive bias is -- our brain's desire to act as our past experience has trained us to act, not to act rationally.

To learn to act rationally, we must be carefully trained to act rationally -- which is why the ideas of Less Wrong are needed to overcome our bias.

Also keep in mind that the purpose of the human brain is to control our actions, and for controlling actions, speed is critical. Our brain is best understood not as a "thinking machine" but as a reaction machine: a machine that must choose a course of action in a very short time frame (like 0.1 seconds), so that when needed we can quickly react to an external danger that is trying to kill us, from a bear attacking us to a gust of wind that almost pushes us over the edge of a cliff.

So what the brain needs to learn, as a universal learner, is an internal "program" of quick heuristics for responding instantly to any environmental stimulus. We learn (universally) how to react, not how to "think rationally".

A process like thinking rationally is a large set of learned micro reactions, one that takes a long time to assemble and perfect. To be a good rational thinker, we have to overcome all the learned reactions that have helped us gain rewards in the past but which have been shown not to be the actions of a rational thinker. We have to help train each other to spot false behaviors, and train each person to have only rational behaviors, at least when we are trying to engage in rational behavior.

Most of our life, we don't need rational behavior; we need accurate reward-maximizing behavior. But when we choose to engage in a rational thought and analysis process, we want to do our best to be rational, and not let our learned behaviors (our cognitive biases) trick us into believing we are being rational when in fact we are just reward seeking.

So our universal learning could be a reward-maximising process, and if it is, that explains why we have strong cognitive biases; the biases are not an argument against universal learning. This is because our reward function is not wired to make us maximize rationality -- it's wired to make us act in any way needed to maximize pleasure and minimize pain. Only if we immerse ourselves in an environment that rewards us for rational thinking behaviors do those behaviors emerge in us.
