There are some common biases that are often used to discount AI progress. We should keep these in mind, as they can prevent us from having an objective understanding of progress in the field.

I'm going to use AI here instead of ML because usually these biases are relevant to any AI technique, not just ML. But in practice, ML is usually what I'm referring to.

1. AI uses the easiest solution it can find. Often people argue that a system isn't intelligent because it did some task in a simpler way than humans would have done it. This is especially applicable if there is some simple heuristic the AI found that did well enough. If finding the word "death" in a sentence is sufficient to do perfect classification, the AI will probably only learn to do that, no matter how intelligent it is. If your test set includes a case where that rule isn't sufficient, but the training set does not, the AI will probably fail on the test set case, because it has no way of knowing that "looking for the word death" wasn't the rule you wanted. AI will only do more complicated things once that simple thing is no longer sufficient, and it'll keep around the dumb heuristics whenever they work.
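To make point 1 concrete, here's a minimal toy sketch (entirely made up, not from any real system): a learner that scores every single-word rule on the training data and keeps the best one. Because "death" perfectly separates the training classes, that shortcut is exactly what it learns, and it then fails on a test sentence where the shortcut no longer applies.

```python
# Toy illustration: a learner that considers only "is word W present?" rules
# and keeps whichever one scores best on the training set.

def best_word_rule(sentences, labels):
    """Return the word whose presence best predicts label 1 on the training set."""
    vocab = {w for s in sentences for w in s.lower().split()}

    def accuracy(word):
        preds = [1 if word in s.lower().split() else 0 for s in sentences]
        return sum(p == y for p, y in zip(preds, labels)) / len(labels)

    return max(vocab, key=accuracy)

train_sentences = [
    "the hero met his death in battle",    # label 1: about dying
    "a sudden death shocked the village",  # label 1
    "the market opened early today",       # label 0
    "children played near the river",      # label 0
]
train_labels = [1, 1, 0, 0]

rule = best_word_rule(train_sentences, train_labels)
print(rule)  # "death" -- the simplest heuristic that works on this training set

# A test case the heuristic can't handle: dying described without the word.
test = "the hero perished in battle"
print(1 if rule in test.lower().split() else 0)  # predicts 0, but the true label is 1
```

The learner isn't "failing to understand" anything: nothing in the training data distinguishes "looking for the word death" from the rule you actually wanted.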

2. AI is different from us. Sometimes people argue that a system isn't intelligent because it does something in a more complicated way than is necessary (or just in a different way than a human would, not necessarily more or less complicated). Just because a specific rule is intuitive to us humans doesn't mean it's the easiest one for an AI to find. A more complicated rule that also works and is easier for the AI to find will be found and used, no matter how intelligent the AI is. Regularization could in theory help with this, but only if it pushes towards the natural "human" approach, and only if the researchers continue training after its performance is already very good. It's also not guaranteed that the rules that seem sensible to us are as good as what the AI does; our approach could be worse.

3. Many models will be better than humans in some ways and worse in others, and which aspects those are often won't be correlated with what we view as "difficult". 

There are certain things that are easy for us to do (continuing to use the same name to refer to someone over a long story, preserving world details like the size of a horse) and things that are difficult for us to do (advanced reasoning, logic involving detailed steps, stories that involve 30 or more characters, etc.). We have a bias that the things that are easy for us to do should be easy for an AI to do, and the things that are hard for us to do should be hard for an AI to do. This leads to two important assumptions, both of which are false:

If an AI cannot do those easy things, it cannot do those difficult things.

If an AI can do those difficult things, it can easily do the easy things.

These days the failure of this bias is more apparent: visual recognition turned out to be much more difficult than playing chess or evaluating complicated arithmetic expressions. But this point has deeper implications that I feel some people miss. A model may become superhuman in all ways but one, and in that one way remain subhuman. If that subhuman aspect is something that is easy for humans to do, we will write off the model as "not intelligent". The key bias here is that our internal intuition for the difficulty of tasks should not be used to judge how intelligent something is; it can only be used to judge how "humanlike" something is in its way of thinking. We can still use that intuition as a rough guide for when to be surprised that AI does well at some task, but the point is that intelligence is multifaceted and the order in which AI solves tasks is not a given. A model could be better at theorem proving than any mathematician on earth and simultaneously struggle to keep a character's name consistent for more than a few paragraphs.

This is a really, really important point. Superintelligent agents will probably still be dumb about something for a long time, even after they are dangerous. It's unreasonable to expect them to converge to our way of thinking: they function differently, so the things that are easy and hard for them are not the things that are easy and hard for us.

4. AI models can be dumb or smart depending on the prompt. Generative models need to model all of human behaviour, including the stupid mistakes we all sometimes make. I've seen this stated most concisely as "sampling cannot be used to prove the absence of knowledge or ability, only the presence of it". If your prompt doesn't give you what you want, it's possible the problem sits between keyboard and chair (PEBCAK): consider trying a different prompt or different sampling settings.
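A toy sketch of the sampling point, with made-up numbers: imagine a model whose next-token distribution puts most of its probability on the right answer. Greedy decoding always returns it, but sampling at a high temperature sometimes returns a wrong answer, so a single bad sample doesn't prove the knowledge is absent.

```python
# Toy "model": scores (logits) for possible answers to "2 + 2 = ?".
# The numbers are invented purely for illustration.
import math
import random

logits = {"4": 4.0, "5": 1.0, "22": 0.5}

def sample(logits, temperature, rng):
    if temperature == 0:  # greedy decoding: just take the argmax
        return max(logits, key=logits.get)
    # softmax with temperature, then draw one answer
    weights = [math.exp(v / temperature) for v in logits.values()]
    return rng.choices(list(logits), weights=weights, k=1)[0]

rng = random.Random(0)
print(sample(logits, temperature=0, rng=rng))  # always "4"

# At a high temperature, wrong answers show up in a sizeable fraction of samples,
# even though the distribution clearly "knows" the right one.
answers = [sample(logits, temperature=2.0, rng=rng) for _ in range(1000)]
print(answers.count("4") / 1000)
```

One wrong sample here tells you nothing about what the underlying distribution knows; it only tells you how you sampled from it.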

5. World knowledge isn't intelligence. A superintelligent alien that landed on earth would not know whether a horse is larger than a toaster. Testing world knowledge is an interesting task, and being able to pick up world knowledge is an important sign of intelligence. But lack of world knowledge should not be used to discount intelligence or reasoning ability. 

6. Exponential growth is counterintuitive. Somehow AI progress seems to be exponential in the same way that Moore's law is. This means everything always seems too early to do anything about, right up until it is too late. In practice, most things are probably S-curves that level off eventually, but where they level off may be far past the relevant danger points, so we should still try to keep this in mind.
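A small numeric sketch of why the S-curve caveat is hard to act on: a logistic curve (with a made-up ceiling K and growth rate r) is nearly indistinguishable from a pure exponential in its early phase, and only bends once it approaches the ceiling.

```python
# Compare early-phase growth of an exponential vs. a logistic (S-curve).
# K and r are arbitrary illustrative values, not estimates of anything real.
import math

K, r = 1000.0, 1.0  # carrying capacity (ceiling) and growth rate

def exponential(t):
    return math.exp(r * t)

def logistic(t):
    # logistic curve starting at 1, saturating at K
    return K / (1 + (K - 1) * math.exp(-r * t))

for t in range(0, 6):
    print(t, round(exponential(t), 1), round(logistic(t), 1))
# For small t the two columns track each other closely; the logistic
# only visibly bends as it nears K, long after the early data is in.
```

From early data alone you can't tell which curve you're on, which is exactly why "it'll level off" is cold comfort if the leveling-off point is past the danger zone.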

7. Advertising/Celebrities. Significant progress made by small labs or independent researchers may not get nearly as much attention as progress made by big organizations or well-known researchers. This is a difficult problem caused in part by the sheer number of papers. In theory it could be helped by recommendation and exploration software that improves paper discovery, but either way the bias is important to keep in mind.

Let me know if you think of other biases and I'll add them to the list.

PS: When people claim that an AI is "memorizing and pattern matching" rather than "truly understanding", in practice I find it usually comes down to point 1 or 2, or to having seen the model make a mistake a human wouldn't normally make (point 3).



2 comments

This helped me to distill a lot of things that I already "knew" on a semi-verbal/intuitive level but had not put into words.  I expect it will help me discuss these concepts with people unfamiliar with the ins and outs of AI. Thank you for doing the mental work of synthesizing and then translating it into plain English with only the necessary jargon.  

Yes, as supposedlyfun said, it's really well written, and many of us would have struggled to explain it as clearly as you did. I'm constantly working with AI systems (they're already doing quite a good job of predicting trends and future developments: for example, setting aside the COVID crisis, an AI program given the data for 2000-2010 produced results similar to what actually happened to the prices of this Budapest real estate over the 2010-2020 period) and I'm always impressed at what's being achieved!

Sure, I may be a bit scared of what would happen if such programs were used with malevolent goals in mind, but it's like nuclear energy or the Internet: you can't stop progress, so you'd better do everything you can to ensure it makes everyone's life better.
