
In his TED Talk, Bostrom proposes a solution to the alignment problem: build in, at the ground level, the goal of predicting what we will approve of, so that no matter what other goals the AI is given (cure cancer, colonize Mars, ...), it will pursue them only in ways that align with our values. (I see some challenges. One is: who exactly is the "we"? Another is that humans can approve of some horrendous things, including, under certain conditions, their own subjugation.)

Where can I find objections or challenges to Bostrom's proposal? In particular, what do Yudkowsky and others who share his outlook say about it? I can make guesses based on Yudkowsky's writing, but so far I've found no explicit discussion by him of Bostrom's proposal. More generally, where should I look for objections to or evaluations of it?
