Let us define a morality function F() as taking as input x = the factual circumstances an agent faces in making a decision, and outputting y = the decision the agent makes. It is fairly apparent that practically every agent has an F(). So ELIEZER(x) is the function that describes what Eliezer would choose in situation x. Next, define GROUP{} as the set of morality functions run by all the members of that group.
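A minimal sketch of this setup in Python, assuming opaque stand-in types for situations and decisions (the names Situation, Decision, and MoralityFunction are illustrative, not from the original):

```python
from typing import Callable, Set

# Illustrative stand-ins; the original leaves x and y abstract.
Situation = str   # x = the factual circumstances an agent faces
Decision = str    # y = the decision the agent makes

# A morality function F() maps situations to decisions.
MoralityFunction = Callable[[Situation], Decision]

def ELIEZER(x: Situation) -> Decision:
    """What Eliezer would choose in situation x (stub)."""
    ...

# GROUP{} is just the set of morality functions its members run.
GROUP: Set[MoralityFunction] = {ELIEZER}
```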

Let us define CEV() as the function that takes as input a morality function or set of morality functions and outputs a morality function that is improved/made consistent/extrapolated from the input. I'm not asserting the actual CEV formulation will do that, but it gestures at the problem that CEV() is supposed to solve.

For clarity, let the output of CEV(F()) = CEV.F(). Thus, CEV.ELIEZER() is the extrapolated morality from the morality Eliezer is running. In parallel, CEV.AMERICA() (the output of CEV(AMERICA{})) is the single morality function that is the extrapolated morality of everyone in the United States. If CEV() exists, an AI considering/implementing CEV.JOHNDOE() is Friendly to John Doe. Likewise, CEV.GROUP() leads to an AI that is Friendly to every member of the group.
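Continuing the sketch, CEV() reads as a higher-order function from a morality function (or a set of them) to a single extrapolated morality function, with CEV.F() merely naming its output. The stubs and the AMERICA set below are hypothetical placeholders, not anything defined in the original:

```python
from typing import Callable, Set, Union

Situation, Decision = str, str                 # as in the sketch above
MoralityFunction = Callable[[Situation], Decision]

def CEV(
    f: Union[MoralityFunction, Set[MoralityFunction]],
) -> MoralityFunction:
    """Return a morality function improved / made consistent /
    extrapolated from the input. How to do this is the open
    problem; this stub only fixes the type signature."""
    ...

def ELIEZER(x: Situation) -> Decision: ...     # stub, as above
AMERICA: Set[MoralityFunction] = set()         # hypothetical: one F() per American

# Dot notation: CEV.F() just names the output of CEV(F()).
CEV_ELIEZER = CEV(ELIEZER)   # extrapolated from one person's morality
CEV_AMERICA = CEV(AMERICA)   # extrapolated from a whole group's set
```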

For FAI to be possible, CEV() must produce an output for (A) any single morality function or (B) any set of morality functions. Further, for provable FAI, it must be possible to (C) mathematically show the output of CEV() before turning on the AI.

If moral realism is false, why is there reason to think any of (A), (B), or (C) is true?

For FAI to be possible, CEV() must produce an output for (A) any single morality function or (B) any set of morality functions

Any set? Why not just require that CEV.HUMANITY() be possible? It seems like there are some sets of morality functions G that would be impossible (G = {x, ~x}?). Human value is really complex, so it's difficult both to (a) model it and (b) prove the model correct. Obviously I don't know how to do that; no one does yet. If moral realism were true and morality were simple and knowable, I suppose that would make the job a lot easier... but that doesn't seem like a w...
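To make the G = {x, ~x} worry concrete, here is a toy illustration (my own, not from the comment): two morality functions that disagree on every input, so any single extrapolated function must override at least one of them:

```python
from typing import Callable, Set

Situation, Decision = str, str
MoralityFunction = Callable[[Situation], Decision]

def x(s: Situation) -> Decision:
    return "act"

def not_x(s: Situation) -> Decision:
    # By construction, disagrees with x on every input.
    return "refrain" if x(s) == "act" else "act"

G: Set[MoralityFunction] = {x, not_x}
# For every situation s, any single output function must contradict at
# least one member of G -- the sense in which CEV(G) looks impossible.
```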

Stupid Questions Open Thread Round 3

by OpenThreadGuy · 7th Jul 2012 · 209 comments

From the last thread:

From Costanza's original thread (entire text):

"This is for anyone in the LessWrong community who has made at least some effort to read the sequences and follow along, but is still confused on some point, and is perhaps feeling a bit embarrassed. Here, newbies and not-so-newbies are free to ask very basic but still relevant questions with the understanding that the answers are probably somewhere in the sequences. Similarly, LessWrong tends to presume a rather high threshold for understanding science and technology. Relevant questions in those areas are welcome as well.  Anyone who chooses to respond should respectfully guide the questioner to a helpful resource, and questioners should be appropriately grateful. Good faith should be presumed on both sides, unless and until it is shown to be absent.  If a questioner is not sure whether a question is relevant, ask it, and also ask if it's relevant."

Meta:

  • How often should these be made? I think one every three months is the correct frequency.
  • Costanza made the original thread, but I am OpenThreadGuy. I am therefore not only entitled but required to post this in his stead. But I got his permission anyway.

Meta:

  • I still haven't figured out a satisfactory answer to the previous meta question of how often these threads should be made. It was requested that I make a new one, so I did.
  • I promise I won't quote the entire previous threads from now on. Blockquoting in articles only goes one level deep, anyway.