Keith Stanovich is a leading expert on the cogsci of rationality, but he has also written on a problem related to CEV: the "rational integration" of our preferences. Here he is on pages 81-86 of Rationality and the Reflective Mind (currently my single favorite book on rationality, out of the dozens I've read):

All multiple-process models of mind capture a phenomenal aspect of human decision making that is of profound importance — that humans often feel alienated from their choices. We display what folk psychology and philosophers term weakness of will. For example, we continue to smoke when we know that it is a harmful habit. We order a sweet after a large meal, merely an hour after pledging to ourselves that we would not. In fact, we display alienation from our responses even in situations that do not involve weakness of will — we find ourselves recoiling from the sight of a disfigured person even after a lifetime of dedication to diversity and inclusion.

This feeling of alienation — although emotionally discomfiting when it occurs — is actually a reflection of a unique aspect of human cognition: the use of Type 2 metarepresentational abilities to enable a cognitive critique of our beliefs and our desires. Beliefs about how well we are forming beliefs become possible because of such metarepresentation, as does the ability to evaluate one's desires — to desire to desire differently...

...There is a philosophical literature on the notion of higher-order evaluation of desires... For example, in a classic paper on second-order desires, Frankfurt (1971) speculated that only humans have such metarepresentational states. He evocatively termed creatures without second-order desires (other animals, human babies) wantons... A wanton simply does not reflect on his/her goals. Wantons want — but they do not care what they want.

Nonwantons, however, can represent a model of an idealized preference structure — perhaps, for example, a model based on a superordinate judgment of long-term lifespan considerations... So a human can say: I would prefer to prefer not to smoke. This second-order preference can then become a motivational competitor to the first-order preference. At the level of second-order preferences, I prefer to prefer to not smoke; nevertheless, as a first-order preference, I prefer to smoke. The resulting conflict signals that I lack what Nozick (1993) terms rational integration in my preference structures. Such a mismatched first-/second-order preference structure is one reason why humans are often less rational than bees in an axiomatic sense (see Stanovich 2004, pp. 243-247). This is because the struggle to achieve rational integration can destabilize first-order preferences in ways that make them more prone to the context effects that lead to the violation of the basic axioms of utility theory (see Lee, Amir, & Ariely 2009).

The struggle for rational integration is also what contributes to the feeling of alienation that people in the modern world often feel when contemplating the choices that they have made. People easily detect when their higher-order preferences conflict with the choices actually made.

Of course, there is no limit to the hierarchy of higher-order desires that might be constructed. But the representational abilities of humans may set some limits — certainly three levels above seems a realistic limit for most people in the nonsocial domain (Dworkin 1988). However, third-order judgments can be called upon to help achieve rational integration at lower levels. So, for example, imagine that John is a smoker. He might realize the following when he probes his feelings: He prefers his preference to prefer not to smoke over his preference for smoking.

We might in this case say that John's third-order judgment has ratified his second-order evaluation. Presumably this ratification of his second-order judgment adds to the cognitive pressure to change the first-order preference by taking behavioral measures that will make change more likely (entering a smoking cessation program, consulting his physician, staying out of smoky bars, etc.).

On the other hand, a third-order judgment might undermine the second-order preference by failing to ratify it: John might prefer to smoke more than he prefers his preference to prefer not to smoke.

In this case, although John wishes he did not want to smoke, the preference for this preference is not as strong as his preference for smoking itself. We might suspect that this third-order judgment might not only prevent John from taking strong behavioral steps to rid himself of his addiction, but that over time it might erode his conviction in his second-order preference itself, thus bringing rational integration to all three levels.

Typically, philosophers have tended to bias their analyses toward the highest level desire that is constructed — privileging the highest point in the regress of higher-order evaluations, using that as the foundation, and defining it as the true self. Modern cognitive science would suggest instead a Neurathian project in which no level of analysis is uniquely privileged. Philosopher Otto Neurath... employed the metaphor of a boat having some rotten planks. The best way to repair the planks would be to bring the boat ashore, stand on firm ground, and replace the planks. But what if the boat could not be brought ashore? Actually, the boat could still be repaired but at some risk. We could repair the planks at sea by standing on some of the planks while repairing others. The project could work — we could repair the boat without being on the firm foundation of ground. The Neurathian project is not guaranteed, however, because we might choose to stand on a rotten plank. For example, nothing in Frankfurt's (1971) notion of higher-order desires guarantees against higher-order judgments being infected by memes... that are personally damaging.
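The excerpt packs three levels of preference into dense prose, so here is one way to make the structure concrete. This is a minimal sketch of my own, not a formalism from Stanovich, Frankfurt, or Nozick; the `Preference` class, the conflict rule, and `is_rationally_integrated` are all invented for illustration.

```python
# Illustrative sketch of first-/second-/third-order preferences and
# Nozick-style "rational integration". All names are my inventions.

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Preference:
    """An agent prefers `preferred` over `dispreferred`.

    At order 1 the objects are outcomes ("smoke" vs. "not smoke");
    at order n > 1 they are preferences of lower order.
    """
    order: int
    preferred: Any
    dispreferred: Any


def reverses(a: Preference, b: Preference) -> bool:
    """True when `a` is the mirror image of `b` at the same order."""
    return (a.order == b.order
            and a.preferred == b.dispreferred
            and a.dispreferred == b.preferred)


def conflicts(lower: Preference, higher: Preference) -> bool:
    """A higher-order preference conflicts with a lower-order one when
    it ranks the *reversal* of that lower preference above it."""
    return (higher.dispreferred == lower
            and isinstance(higher.preferred, Preference)
            and reverses(higher.preferred, lower))


def is_rationally_integrated(prefs: list) -> bool:
    """Rational integration: no level of the hierarchy endorses the
    reversal of the level just below it."""
    ordered = sorted(prefs, key=lambda p: p.order)
    return not any(conflicts(lo, hi)
                   for lo, hi in zip(ordered, ordered[1:]))


# John's first-order preference: he prefers smoking.
p1 = Preference(1, "smoke", "not smoke")

# His second-order preference: he prefers to prefer not to smoke.
not_smoking = Preference(1, "not smoke", "smoke")
p2 = Preference(2, not_smoking, p1)

# Third-order ratification: he prefers his preference-to-prefer-not-
# to-smoke over his preference for smoking.
p3 = Preference(3, p2, p1)

print(is_rationally_integrated([p1, p2]))      # False: levels clash
print(is_rationally_integrated([p1, p2, p3]))  # False: ratification
                                               # alone doesn't resolve it
# If the second-order preference eroded (the undermining case),
# integration would be restored around smoking:
p2_eroded = Preference(2, p1, not_smoking)
print(is_rationally_integrated([p1, p2_eroded]))  # True
```

Note how the sketch mirrors the Neurathian point: the check treats no level as foundational; it only asks whether adjacent levels cohere, so integration can be restored either by reforming the first-order preference or by eroding the second-order one.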

Also see: The Robot's Rebellion, Higher-order preferences and the master rationality motive, Wanting to Want, The Human's Hidden Utility Function (Maybe), Indirect Normativity

6 comments

He evocatively termed creatures without second-order desires (other animals, human babies) wantons

I'm wondering if someone can link any research on when and how human babies develop second-order wants?

Jack:

Also, see Alicorn's "Wanting to Want" and the accompanying comments.

Added, thanks.

First- and second-order wants strike me as having to do with near and far modes. (I also suspect near and far explain hyperbolic discounting/procrastination, and maybe also why the Wason selection task is hard (abstractness being associated with far mode), and maybe the endowment effect.)

Nonwantons, however, can represent a model of an idealized preference structure — perhaps, for example, a model based on a superordinate judgment of long-term lifespan considerations... So a human can say: I would prefer to prefer not to smoke. This second-order preference can then become a motivational competitor to the first-order preference. At the level of second-order preferences, I prefer to prefer to not smoke; nevertheless, as a first-order preference, I prefer to smoke.

One problem: How do we distinguish actual second-order preferences ("I would prefer to prefer not to smoke") from improper beliefs about one's own preferences, e.g. belief in belief ("It is good to think that smoking is bad")?

It seems to me that the obvious answer is to ask, "Well, is smoking actually bad?" In other words, we shouldn't expect to find out how good our reflective preferences are without actually asking what sort of world we live in, and whether agents with those preferences tend to do well in that sort of world.

"Actually bad" and "do well" depend on values, right? So that seems like the start to a better approach, but isn't enough.