If rationality means optimizing expected selfish utility

This is a convenient word swap. Simplifying slightly, and playing a little taboo, we get:

"If you have a strictly selfish utility function, and you have a system of thinking that is especially good at satisfying this function, people will never trust you where your interests may coincide."

Well, yes. Duh.

But if people actually liked your utility function, they'd want you to be more, not less, rational. That is, if both my lover and I value each other's utility about as much as our own, we...

Rationalists lose when others choose

by PhilGoetz | 4 min read | 16th Jun 2009 | 58 comments


At various times, we've argued over whether rationalists always win.  I posed Augustine's paradox of optimal repentance to argue that, in some situations, rationalists lose.  One criticism of that paradox is that its strongest forms posit a God who penalizes people for being rational.  My response was, So what?  Who ever said that nature, or people, don't penalize rationality?

There are instances where nature penalizes the rational.  For instance, revenge is irrational, but being thought of as someone who would take revenge gives advantages.1

EDIT:  Many, many people immediately jumped on this, because revenge is rational in repeated interactions.  Sure.  Note the "There are instances" at the start of the sentence.  If you admit that someone, somewhere, once faced a one-shot revenge problem, then concede the point and move on.  It's just an example anyway.

Here's another instance that more closely resembles the God who punishes rationalism, in which people deliberately punish rational behavior:

If rationality means optimizing expected utility, then both social pressures and evolutionary pressures tend, on average, to bias us towards altruism.  (I'm going to assume you know this literature rather than explain it here.)  An employer or a lover would both rather have someone who is irrationally altruistic.  This means that, on this particular (and important) dimension of preference, rationality correlates with undesirability.2

<ADDED>: I originally wrote "optimizing expected selfish utility", merely to emphasize that an agent, rational or not, tries to maximize its own utility function.  I do not mean that a rational agent appears selfish by social standards.  A utility-maximizing agent is selfish by definition, because its utility function is its own.  Any altruistic behavior that results, happens only out of self-interest.  You may argue that pragmatics argue against this use of the word "selfish" because it thus adds no meaning.  Fine.  I have removed the word "selfish".

However, it really doesn't matter.  Sure, it is possible to make a rational agent that acts in ways that seem unselfish. Irrelevant.  Why would the big boss settle for "unselfish" when he can get "self-sacrificing"?  It is often possible to find an irrational agent that acts more in your interests, than any rational agent will.  The rational agent aims for equitable utility deals.  The irrational agent can be inequitable in your favor.

This whole barrage of attacks on using the word 'selfish' is yet again missing the point.  If you read the entire post, you'll see that it doesn't matter if you think that rational agents are selfish, or that they can reciprocate.  You just have to admit that most persons A would rather deal with an agent B having an altruistic bias, or a bias towards A's utilities, than an agent having no such bias.  The level of selfishness/altruism of the posited rational agent is irrelevant, because adding a bias towards person A's utility is always better for person A.  Comparing "rational unbiased person" to "altruistic idiot" is not the relevant comparison here.  Compare instead "person using decision function F with no bias" vs. "person using decision function F with excess altruism".3
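To make the "same F, with and without bias" comparison concrete, here is a minimal sketch in Python.  It assumes, purely for illustration, that F picks the option maximizing the agent's own payoff plus a weighted share of person A's payoff.  The option names, the payoff numbers, and the `decide` function are all hypothetical; nothing here is meant as the one true model of a rational agent.

```python
from typing import NamedTuple


class Option(NamedTuple):
    name: str
    own_payoff: float    # utility to agent B, who makes the choice
    other_payoff: float  # utility to person A, who chose to deal with B


# Hypothetical payoffs, invented only for this illustration.
OPTIONS = [
    Option("shirk", 10.0, 0.0),
    Option("fair_effort", 8.0, 5.0),
    Option("overtime", 3.0, 12.0),
]


def decide(options, altruism_weight, bias=0.0):
    """Decision function F (an assumed form): pick the option maximizing
    own_payoff + (altruism_weight + bias) * other_payoff."""
    w = altruism_weight + bias
    return max(options, key=lambda o: o.own_payoff + w * o.other_payoff)


# Same F and same underlying weight; the only difference is the added bias.
unbiased = decide(OPTIONS, altruism_weight=0.5)
biased = decide(OPTIONS, altruism_weight=0.5, bias=0.5)

print("no bias:        ", unbiased.name, "-> A receives", unbiased.other_payoff)
print("excess altruism:", biased.name, "-> A receives", biased.other_payoff)
```

With these made-up numbers the unbiased agent settles on the equitable option, while the biased agent picks the one that is best for A and worst for itself.  Whatever baseline weight you think a rational agent would have, A has no reason to object to the extra bias.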

(Also note that, in the fMRI example, people don't get to see your utility function.  They can't tell that you have a wonderful Yudkowskian utility function that will make you reliable.  They can see only that you lack the bias that makes most people better employees.)

The real tricky point of this argument is whether you can define "irrational altruism" in a way that doesn't simply mean "utility function that values altruism".  You could rephrase "Choice by others encourages bias toward altruism" as "Choice by others selects for utility functions that value altruism highly".

Does an ant have an irrationally high bias towards altruism?  It may make more sense to say that an ant is less of an individual, and more of a subroutine, than a human is.  So it is perfectly all right with me if you prefer to say that these forces select for valuing altruism, rather than saying that they select for bias.  The outcome is the same either way:  When one agent gets to choose what other agents succeed, and that agent can observe their biases and/or decision functions, those other agents are under selection pressure to become less like individuals and more like subroutines of the choosing agent.  You can call this "altruistic bias" or you can call it "less individuality".


There are a lot of other situations where one person chooses another person, and they would rather choose someone who is biased, in ways encouraged by society or by genetics, than someone more rational.  When giving a security clearance, for example, you would rather give it to someone who loved his country emotionally, than to someone who loved his country rationally; the former is more reliable, while the rational person may suddenly reach an opposite conclusion on learning one new fact.

It's hard to tell how altruistic someone is.  But the May 29, 2009 issue of Science has an article called "The Computation of Social Behavior".  It's extremely skimpy on details, especially for a 5-page article; but the gist of it is that they can use functional magnetic resonance imaging to monitor someone making decisions, and extract some of that person's basic decision-making parameters.  For example (they mention this, although it isn't clear whether they can extract this particular parameter), their degree of altruism (the value they place on someone else's utility vs. their own utility).  Unlike a written exam, the fMRI exam can't be faked; your brain will reveal your true parameters even if you try to lie and game the exam.

So, in the future, being rational may make you unemployable and unlovable, because you'll be unable to hide your rationality.

Or maybe it already does?


Here is the big picture:  The trend in the future is likely to be one of greater and greater transparency of every agent's internal operations, whether this is via fMRI or via exchanging source code.  Rationality means acting to achieve your goals.  There will almost always be other people who are more powerful than you and who have resources that you need, and they don't want you to achieve your goals.  They want you to achieve their goals.  They will have the power and the motive to select against rationality (or to avoid building it in, in the first place).

All our experience is with economic and behavioral models that assume independent self-interested agents.  In a world where powerful people can examine the utility functions of less-powerful people, and reward them for rewriting their utility functions (or just select ones with utility functions that are favorable to the powerful people, and hence irrational), then having rational, self-interested agents is not the equilibrium outcome.

In a world in which agents like you or me are manufactured to meet the needs of more powerful agents, even more so.

You may claim that an agent can be 'rational' while trying to attain the goals of another agent.  I would instead say that it isn't an agent anymore; it's just a subroutine.

The forces I am discussing in this post try to turn agents into subroutines.  And they are getting stronger.
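The selection pressure described above can be sketched as a toy simulation.  The assumptions are loud ones: each agent is reduced to a single number (the weight it places on the chooser's utility), the chooser can observe that number perfectly, and each round the chooser keeps the highest-weighted half and copies them to refill the population.  The `select` function, the starting weights, and the keep fraction are all hypothetical.

```python
def select(population, keep_fraction=0.5):
    """One round of choice by a more powerful agent (assumed rule):
    keep the agents with the highest weight on the chooser's utility,
    then copy the survivors to refill the population."""
    keep = max(1, int(len(population) * keep_fraction))
    survivors = sorted(population, reverse=True)[:keep]
    return [survivors[i % keep] for i in range(len(population))]


def mean(values):
    return sum(values) / len(values)


# Hypothetical starting weights; 0.0 is a purely self-interested agent.
population = [0.0, 0.1, 0.2, 0.3, 0.5, 0.8, 1.0, 1.5]
history = [mean(population)]

for _ in range(3):
    population = select(population)
    history.append(mean(population))

print("mean weight on chooser's utility per round:", history)
print("final population:", population)
```

Under these assumptions the self-interested agents are gone after the first round, and after three rounds every remaining agent is a copy of the one that weighted the chooser's utility most heavily.  Real selection is noisier and less transparent, but the direction of the pressure is the same.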


1 Newcomb's paradox is, strangely, more familiar to LW readers.  I suggest replacing discussions of one-boxing by discussions of taking revenge; I think the paradoxes are very similar, but the former is more confusing and further-removed from reality.  Its main advantage is that it prevents people from being distracted by discussing ways of fooling people about your intentions - which is not the solution evolution chose to that problem.

2 I'm making basically the same argument that Christians make when they say that atheists can't be trusted.  Empirical rejection of that argument does not apply to mine, for two reasons:

  1. Religions operate on pure rewards-based incentives, and hence destroy the altruistic instinct; therefore, I intuit that religious people have a disadvantage rather than an advantage compared to altruists WRT altruism.
  2. Religious people can sometimes be trusted more than atheists; the problem is that some of the things they can be trusted to do are crazy.

3 This is something LW readers do all the time:  Start reading a post, then stop in the middle and write a critical response addressing one perceived error whose truth or falsity is actually irrelevant to the logic of the post.