Harry Potter and the Methods of Rationality discussion thread, February 2015, chapter 104
New chapter! This is a new thread to discuss Eliezer Yudkowsky's Harry Potter and the Methods of Rationality and anything related to it. This thread is intended for discussing chapter 104.

There is a site dedicated to the story at hpmor.com, which is now the place to go to find the author's notes and all sorts of other goodies. AdeleneDawner has kept an archive of Author's Notes. (This goes up to the notes for chapter 76 and is no longer updating. The author's notes from chapter 77 onwards are on hpmor.com.)

Spoiler Warning: this thread is full of spoilers. With few exceptions, spoilers for MOR and canon are fair game to post, without warning or rot13. More specifically:

> You do not need to rot13 anything about HP:MoR or the original Harry Potter series unless you are posting insider information from Eliezer Yudkowsky which is not supposed to be publicly available (which includes public statements by Eliezer that have been retracted).
>
> If there is evidence for X in MOR and/or canon then it's fine to post about X without rot13, even if you also have heard privately from Eliezer that X is true. But you should not post that "Eliezer said X is true" unless you use rot13.
[Epistemic status: Unpolished conceptual exploration, possibly of concepts that are extremely obvious and/or have already been discussed. Abandoning concerns about obviousness, previous discussion, polish, fitting the list-of-principles frame, etc. in favor of saying anything at all.] [ETA: Written in about half an hour, with some distraction and wording struggles.]
What is the hypothetical ideal of a corrigible AI? Without worrying about whether it can be implemented in practice or is even tractable to design, what would serve purely as a theoretical reference point to compare proposals against?
I propose that the hypothetical ideal is not an AI that lets the programmer shut it down, but an AI that wants to be corrected: one that will allow a...
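To make that contrast concrete, here is a minimal, purely illustrative sketch. The class names, methods, and utility values are my own invention, not anything from the (truncated) post; it only tries to capture the stated distinction between an agent that merely tolerates shutdown and one whose preferences actively favor being corrected.

```python
# Toy illustration (not the author's proposal): contrasting an agent that
# merely does not resist shutdown with one that positively values correction.

from dataclasses import dataclass, field


@dataclass
class Correction:
    description: str


class ShutdownTolerantAgent:
    """Complies with a shutdown command, but only because it is hard-coded
    not to interfere; correction itself carries no value for it."""

    def receive_shutdown(self) -> bool:
        return True  # complies, without any preference for being corrected


@dataclass
class CorrectionSeekingAgent:
    """Sketch of the proposed ideal: being corrected is itself valued,
    so the agent has no incentive to hide mistakes or evade oversight."""

    accepted: list = field(default_factory=list)

    def utility_of(self, correction: Correction) -> float:
        # Assigns positive utility to accepting a correction (toy value).
        return 1.0

    def receive_correction(self, correction: Correction) -> bool:
        if self.utility_of(correction) > 0:
            self.accepted.append(correction)
            return True
        return False


if __name__ == "__main__":
    agent = CorrectionSeekingAgent()
    print(agent.receive_correction(Correction("adjust goal specification")))  # True
```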