Basically, the Orthogonality Thesis argues that intelligence and goals do not correlate, at least in the limit of infinite compute and data. That is, minds and goals are very separate.

Now the question is, is the Orthogonality Thesis at least approximately true in humans?

Specifically, are behaviors and goals systematically correlated with intelligence, or is there very little correlation between goals/behavior and intelligence?

Also, what would the Orthogonality Thesis being true or false in humans imply about X-risk and technology?

5 Answers

I don't know the answer, but here's two potentially related pieces of information from differential psychology:

  • Smarter people do not have nicer personality traits; they are not more compassionate. In fact general cognitive ability is mostly uncorrelated with personality.
  • Smarter people are less likely to be convicted of crimes. Differential psychologists seem to believe this is because smarter people are less criminal, so there are fewer crimes they can be convicted of; I have not seen any data that explicitly contradicts this assumption that smarter people are less criminal, but I also have not seen any data that explicitly supports it (rather than, e.g., smarter people simply being better at avoiding getting caught).

A confounding factor for criminality is that smarter people usually have less need to commit crimes. Society values conventional intelligence and usually pays well for it, so someone who is smarter will tend to get better jobs and make more money, and won't need to resort to crime (especially petty crime).

It could also be that smarter people get caught less often, for any given level of criminality.
Additionally, if you have a problem which can be solved by either (a) crime or (b) doing something complicated to fix it, your ability to do (b) is higher the smarter you are.

Confused about something -- about smart people not being nicer. That fits with my theory of how the world works, but not with my observation of children and teenagers. The smart kids are (usually) way nicer to each other. My 12-y-o observed this as he went from (nasty) 2nd grade to (nice) gifted program to middle school, with the middle school being a mix of nicer smart kids and more poorly behaved, poorly performing students. This also matches my personal experience, my wife's experience, and what we see in pop culture.

Now, you could say smart kids just f...

I don't know. I usually hear the opposite stereotype, of smart people being edgy and mean. I wonder to what extent people's stereotypes on this are due to noise, selection bias, or similar, but it seems hard to figure out. In this specific case, I would wonder how much the true correlation is obscured by the school environment. Schooling institutions are supposed to value learning, intellect, etc., so smart people and conformist/authority-trusting people might be hard to distinguish in schools? I don't think the USA is an outlier in this respect, though I think most differential psychology studies are done in the US.

Where can I learn more about this?

For your last question - I think there are very few, if any, implications. Humans arguably occupy an extremely tiny region in the space of possible intelligent agent designs, while the orthogonality thesis applies to the space as a whole. If it was the case that goals and intelligence were correlated in humans, I'd expect it would be more reflective of how humans happen to be distributed in that space of possible designs, and not telling us much about the properties of the space itself.

IME, obviously not? I don't have data, but general social interactions strongly suggest that smart people are nicer.

My model of how this works also suggests they would be nicer. Like, most people are nice to at least some people, so being not-nice is due either to a belief that being not-nice makes sense, or to a lack of self-control. Both of those are probably less common among smarter people. I don't think the correlation is super strong, but it's there.

Also, I don't think you defined the orthogonality thesis correctly. AFAIK, Bostrom said that any combination of intelligence and goals is possible; this is not the same as saying that they're not correlated.
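To see why these two claims come apart, here is a toy simulation (all numbers are made up purely for illustration, not data about actual humans): two traits can be clearly correlated while every combination of high and low values still occurs. "Correlated" and "not every combination is possible" are different claims.

```python
import random

random.seed(0)

# Toy model: "intelligence" and "niceness" share a common factor,
# so they are correlated, yet any combination remains possible.
def sample_agent():
    common = random.gauss(0, 1)
    intelligence = common + random.gauss(0, 1)
    niceness = common + random.gauss(0, 1)
    return intelligence, niceness

agents = [sample_agent() for _ in range(100_000)]

# Pearson correlation, computed by hand to stay dependency-free.
def pearson(pairs):
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    vx = sum((x - mx) ** 2 for x, _ in pairs) / n
    vy = sum((y - my) ** 2 for _, y in pairs) / n
    return cov / (vx * vy) ** 0.5

r = pearson(agents)  # theoretically 0.5: a substantial correlation

# Yet all four combinations (smart/nice, smart/mean,
# dull/nice, dull/mean) occur in the sample:
quadrants = {(x > 0, y > 0) for x, y in agents}
print(round(r, 2), len(quadrants))
```

So even a strong observed correlation between intelligence and values in humans wouldn't, by itself, contradict Bostrom's claim that any combination is possible.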

AFAIK there isn’t enough correlation between intelligence and goals in humans to disprove the orthogonality thesis, or even offer significant evidence against it.

However, there’s at least one dynamic in humans that should lead to some degree of value convergence among smarter humans and that wouldn’t be present in AIs: a lot of human values result from a failure to reflect on our actual desires and our model of the world. Smarter people are likely to suffer less from conditions like blind faith. Since blind faith and tribal ideology have more degrees of freedom than optimization of the baseline desires evolution gave us, one would expect more similarity in values among smarter humans (or perhaps among more rational humans; it’s a known phenomenon that there can be incredibly intelligent people with massive blind spots, Robert Aumann being the traditional example).

Obviously that wouldn’t be the case among AIs, as they’re not all drawn from the same distribution the way humans largely are.

There is definitely a correlation! I have a handicapped child. His goals involve snacks and entertainments. His neurotypical brother's goals involve friends and getting schoolwork done (and snacks and entertainments). My goals involve making the world a better place, relating to God, loving my family, and -- snacks and entertainments. :)

And a more severely mentally handicapped person may have a goal simply of "I don't like to be upset." I'm thinking of a particular person, but I don't know her that well.

Having a handicapped family member helps me break through some ways of thinking that seem reasonable if I assume everyone's like me. 

9 comments

I don't think it's fair to characterize the Orthogonality Thesis as saying that there is no correlation. Instead it is saying that there isn't a perfect correlation, or maybe (stronger version) that there isn't a strong enough correlation that we can count on superhuman AIs probably having similar-to-human values by default.

That's the main problem with the orthogonality thesis: it's so vague. The thesis that there isn't a perfect correlation is extremely weak and uninteresting.

Nevertheless, some people still needed to hear it! I have personally talked with probably a dozen people who were like "But if it's so smart, won't it understand what we meant / what its purpose is?" or "But if it's so smart, won't it realize that killing humans is wrong, and that instead it should cooperate and share the wealth?"

Yes, most people seem to reject the stronger version; they think a superintelligent AI is unlikely to kill all humans. Given the context of the original question here, this is understandable: in humans, higher IQ is correlated with less antisocial and criminal behavior and less violence, things we typically judge to be immoral. I agree there are good philosophical reasons supporting the strong orthogonality thesis for artificial intelligence, but I think we have so far not sufficiently engaged with the literature from criminology and IQ research, which provides evidence in the opposite direction.

It doesn't seem worth engaging with to me. Yes, there's a correlation between IQ and antisocial and criminal behavior. If anyone seriously thinks we should just extrapolate that correlation all the way up to machine superintelligence (and from antisocial-and-criminal-behavior to human-values-more-generally) & then call it a day, they should really put that idea down in writing and defend it, and in the course of doing so they'll probably notice the various holes in it.

Analogy: There's a correlation between how big rockets are and how safe rockets are. The bigger ones like Saturn 5 tend to blow up less than the smaller rockets made by scrappy startups, and really small rockets used in warfare blow up all the time. So should we just slap together a suuuuper big rocket, a hundred times bigger than Saturn 5, and trust that it'll be safe? Hell no, that's a bad idea not worth engaging with. IMO the suggestion that criminology and IQ research should make us optimistic about machine superintelligence is similarly bad, for similar reasons.

I guess larger rockets are safer because more money is invested in testing them, since an explosion gets more expensive the larger the rocket is. But there seems to be no analogous argument which explains why smarter human brains are safer. It doesn't seem they are tested better. If the strong orthogonality thesis is true for artificial intelligence, there should be a positive explanation for why it is apparently not true for human intelligence.

Anecdotally I don't see much correlation between goals and intelligence in humans.

Some caveats:

There are goals only intelligent people are likely to have, because they're simply not reachable (or even not understandable) for unintelligent people, such as proving Fermat's Last Theorem.

There are intermediate goals only unintelligent people will have, because they don't realise they won't achieve their aims. E.g. an intelligent person is less likely to suggest a utopia where everyone does everything for free, because they realise it's unlikely to work.

Your second caveat is the point I’m making.

I don't think most humans have clear and legible goals that could be analyzed for this correlation. Even if we did, there's so little variance in training and environment that we probably wouldn't see much difference in goals, regardless of correlation with intelligence. In other words, human goals are hard to measure, and overdetermined.