Biased AI heuristics

by Stuart_Armstrong
14th Sep 2015
1 min read

6 comments, sorted by top scoring

DanArmak

An idealized or fully correct agent's behavior is too hard to predict (=implement) in a complex world. That's why you introduce the heuristics: they are easier to calculate. Can't that be used to also make them easier to predict by a third party?

Separately from this, the agent might learn or self-modify to have new heuristics. But what does the word "heuristic" mean here? What's special about it that doesn't apply to all self modifications and all learning models, if you can't predict their behavior without actually running them?

Stuart_Armstrong

Can't that be used to also make them easier to predict by a third party?

Possibly. We need to be closer to the implementation for this.

Houshalter

Does it matter if we aren't able to recognize its biases? Humans are able to function with biases.

We are also able to recognize and correct for our own biases. And we can't even look at, let alone rewrite, our own source code.

Stuart_Armstrong

I'm assuming that it can function at a high level despite/because of its biases. And the problem is not that it might not work effectively, but that our job of ensuring it behaves well just got harder, because we just got worse at predicting its decisions.

[anonymous]

If we programmed it with human heuristics, wouldn't we assume that it would have similar biases?

Stuart_Armstrong

We may not have programmed these in at all - it could just be efficient machine learning. And even if it started with human heuristics, it might modify these away rapidly.


Heuristics have a bad rep on Less Wrong, but some people are keen to point out how useful they can sometimes be. One major critique of the "Superintelligence" thesis is that it presents an abstract, Bayesian view of intelligence that ignores the practicalities of bounded rationality.

This trend of thought raises some other concerns, though. What if we could produce an AI of extremely high capabilities, but riven with huge numbers of heuristics? If these were human heuristics, then we might have a chance of understanding and addressing them, but what if they weren't? What if the AI had an underconfidence bias, and tended to change its views too fast? Now, that one is probably quite easy to detect (unlike many that we would not have a clue about), but what if it wasn't consistent across areas and types of new information?
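
As a toy illustration of the kind of bias meant here, consider an agent that is underconfident in its current view and so treats each new observation as if it were several independent observations - it changes its views too fast - with the degree of over-weighting differing by domain. The domains and over-weighting factors in the sketch below are purely illustrative, not anything from a real system.

```python
# Toy sketch: an agent that over-weights new evidence relative to an ideal
# Bayesian updater, with the over-weighting varying by domain.

def bayes_update(prior: float, likelihood_ratio: float) -> float:
    """Ideal Bayesian update of P(H) given likelihood ratio P(E|H)/P(E|not H)."""
    odds = prior / (1.0 - prior)
    posterior_odds = odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

def underconfident_update(prior: float, likelihood_ratio: float,
                          overweight: float) -> float:
    """Heuristic update for an agent underconfident in its current view:
    one observation is treated as if it were `overweight` observations."""
    return bayes_update(prior, likelihood_ratio ** overweight)

evidence = [2.0, 0.5, 3.0, 2.0]  # likelihood ratios of successive observations

# Hypothetical domain-specific over-weighting factors (1.0 = ideal Bayesian).
domains = {"physics": 1.0, "politics": 2.5, "novel situations": 4.0}

for domain, overweight in domains.items():
    ideal, biased = 0.5, 0.5  # both agents start from the same 50% prior
    for lr in evidence:
        ideal = bayes_update(ideal, lr)
        biased = underconfident_update(biased, lr, overweight)
    print(f"{domain:16s}  ideal P(H) = {ideal:.2f}   biased P(H) = {biased:.2f}")
```

Here the bias is a single legible exponent per domain; a real system's learned heuristics would not be anything like this easy to read off.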

In that case, our ability to predict or control what the AI does may be very limited. We can understand human biases and heuristics pretty well, and we can understand idealised agents, but differently biased agents might be a big problem.