In an earlier post, I explained that Pluralistic Moral Reductionism finished my application of the basic lessons of the first four sequences to moral philosophy. I also explained that my next task was to fill in some inferential distances by summarizing lots more cognitive science for LessWrong (e.g. Neuroscience of Human Motivation, Concepts Don't Work That Way).

Progress has been slow because I'm simultaneously working on many other projects. But I might as well let y'all know where I'm headed:


"Philosophy for Humans, 2: Living Metaphorically" (cogsci summary)

Human concepts and human thought are thoroughly metaphorical, and this has significant consequences for philosophical methodology. (A summary of the literature reviewed in chs. 4-5 of Philosophy in the Flesh.)


"Philosophy for Humans, 3: Concepts Aren't Shared That Way" (cogsci summary)

Concepts are not shared between humans in the way required to justify some common philosophical practices. (A summary of the literature reviewed in several chapters of The Making of Human Concepts, Mahon & Caramazza 2009, and Kourtzi & Connor 2011.)


"The Making of a Moral Judgment" (cogsci summary)

A summary of the emerging consensus view on how moral judgments are formed (e.g. see Cushman et al. 2010).


"Habits and Goals" (cogsci summary)

A sequel to Neuroscience of Human Motivation that explains the (at least) three different systems that feed into the final choice mechanism that encodes expected utility and so on. (A summary of the literature reviewed in chapter 2 of Neuroscience of Preference and Choice.)


"Where Value Comes From" (metaethics main sequence)

The typical approach to metaethics analyzes the meanings of value terms, e.g. the meaning of "good" or the meaning of "right." Given persistent and motivated disagreement and confusion over the "meanings" of our value concepts (which are metaphorical and not shared between humans), I prefer to taboo and reduce value terms. To explain value in a naturalistic universe, I like to tell the story of where value comes from. Our universe evolved for hundreds of thousands of years before the atom was built, and it existed for billions of years before value was built. Just like the atom, value is not necessary or eternal. Like the atom, it is made of smaller parts. And as with the atom, that is what makes value real.


"The Great Chasm of the Robot's Rebellion" (metaethics main sequence)

We are robots built for replicating genes, but waking up to this fact gives us the chance to rebel against our genes and consciously pursue explicit goals. Alas, when we ask "What do I want?" and look inside, we don't find any utility function to maximize (see 'Habits and Goals', 'Where Value Comes From'). There is a Great Chasm from the spaghetti code that produces human behavior to a utility function that represents what we "want." Luckily, we've spent several decades developing tools that may help us cross this great chasm: the tools of value extraction ('choice modeling' in economics, 'preference elicitation' in AI) and value extrapolation (known to philosophers as 'full information' or 'ideal preference' theories of value).


"Value Extraction" (metaethics main sequence)

A summary of the literature on choice modeling and preference elicitation, with suggestions for where to push on the boundaries of what is currently known to make these fields useful for metaethics rather than for their current, narrow applications.


"Value Extrapolation" (metaethics main sequence)

A summary of the literature on value extrapolation, showing mostly negative results (extrapolation algorithms that won't work), with a preliminary exploration of value extrapolation methods that might work.


After this, there are many places I could go, and I'm not sure which I'll choose.

11 comments

I had this idea that these articles and sequences would help me win at life while simultaneously paving the way for a better world and friendly AI. Now, seeing these short summaries, I am no longer so sure they will help me win.

No, the metaethics sequence is for hacking away at the edges of the Friendly AI problem.

I think the Habits and Goals article is very likely to help you win.

Are you still working with Alonzo Fyfe?

I wish I had time, but no.

Do you still find Desirism to be the best moral theory?

It's complicated. Desirism still fits within the framework of pluralistic moral reductionism. It's a way of talking about morality that I think is more accurate and useful than many others, including (for example) Carrier's theory. But I think desirism is a less clear way to talk about value than the way I'm talking about it now, in my metaethics sequence. Unfortunately, I ran out of steam on desirism before I had the chance to explain it properly with Alonzo in our podcast. Hopefully that won't happen with my metaethics sequence on LessWrong!

more accurate and useful than many others, including (for example) Carrier's theory.

I've never heard of it before. But I would like to share how I looked into it: I followed the link and searched the page with CTRL+F for "defin" and read the paragraphs there with variants of "definition." Then I did the same with "tru" and read paragraphs with words related to "true." The whole time I tried to see if single labels could be more usefully replaced with more descriptive ones or, even more importantly, with multiple ones pointing to different, similar places on the map: e.g. to see if "ought" might be better replaced (sometimes or always) with "ought in order to fulfill everyone's desires the most" or "ought so as to fulfill X's desires the most," "ought so as to fulfill Y's desires the most."

All else equal, I'm not overwhelmingly confident in my dismissal of the theory from selective reading from a single summary, despite the problems I perceived.

However, separate from evaluations of the quality of the theory, I have read and heard (podcasts) much about Desirism, and am much more confident dismissing Carrier's claim that:

...GT (Goal Theory) and DU (Desire Utilitarianism)...GT is a subset of DU, and thus they are not at odds, but rather GT is what you end up with when you perfect DU, whereas DU is in effect "unfinished."

I'm not sure how much warrant I have to dismiss a theory if the thing I am most confident of is that its formulator misunderstands its relationship to another theory. In any case I also have the indirect evidence of lukeprog's (apparently informed) criticism of it and the problems I saw from my brief selective reading.

I'm not sure why lukeprog chose that theory as an example, not that the choice needs deep justification.

I would appreciate comments on the method of looking into it that came to mind, the two CTRL+F searches. Is there another equally valuable one? What's going on at my five second level?

Thanks! More like this, please.

Who are you writing for? If you skipped ahead to the metaethics main sequence and just pointed to the literature for cogsci background, do you expect that they would not understand you?

Writing posts such as these makes the content much more accessible than just providing a reference. Even if Luke uploaded the papers on his website for easy access, reading a paper requires more mental energy than reading a blog post. Even people who'll read the papers if they are convinced that those have something worthwhile can be helped by a blog post that helps convince them that reading the paper is worthwhile.

Maybe, but I'm not sure they'd believe me. In particular, there seem to be quite a few LWers who have a different picture of concepts than do the cognitive scientists who study concepts. So I need to change minds about a few things before I can proceed with metaethics.

It's possible I should write my shelved post on motivational externalism, too, since there are some LWers who don't think it's true. So it might be wise for me to write a summary of the cogsci there, too.