Coherent Extrapolated Volition

Yoav Ravid	v1.6.0	Dec 13th 2023	(+411)
Christopher King	v1.5.0	May 13th 2023	(+7/-8)
Filipe Marchesini	v1.4.0	Mar 11th 2022	(+6/-5)
Ruby	v1.3.0	Sep 16th 2020	(+2627/-258)
Multicore	v1.2.0	Aug 27th 2020
Multicore	v1.1.0	Aug 27th 2020	(+88)
Gyrodiot	v1.0.0	Aug 22nd 2020	(+62/-2165)
Tyrrell_McAllister	v0.0.13	Feb 21st 2019	/* Further Reading & References */ Replaced dead link to Anissimov article with an archive.org archive.
plex	v0.0.12	Aug 27th 2016
greenrd	v0.0.11	Mar 12th 2016	(+268) /* See also */ added roko's basilisk, and a short summary of it - yes, it sounds bizarre and flaky - it is meant to


Yoav Ravid v1.6.0 Dec 13th 2023 (+411)

"The "Coherent" in "Coherent Extrapolated Volition" does not indicate the idea that an extrapolated volition is necessarily coherent. The "Coherent" part indicates the idea that if you build an FAI and run it on an extrapolated human, the FAI should only act on the coherent parts. Where there are multiple attractors, the FAI should hold satisficing avenues open, not try to decide itself." - Eliezer Yudkowsky


Christopher King v1.5.0 May 13th 2023 (+7/-8)

Now imagine someone else – Fred – is faced with the same task and you want to help him in his decision by giving the box he chose, box A. Since you know where the diamond is, simply ~~handling~~handing him the box isn’t helping. As such, you mentally extrapolate a volition for Fred, based on a version of him that knows where the diamond is, and imagine he actually wants box B.


Filipe Marchesini v1.4.0 Mar 11th 2022 (+6/-5)

As an example of the classical concept of volition, the author develops a simple thought experiment: imagine you’re facing two boxes, A and B. One of these boxes, and only one, has a diamond in it – box B. You are now asked to make a guess, whether to ~~chose~~choose box A or B, and you chose to open box A. It was your decision to take box A, but your volition was to choose box B, since you wanted the diamond in the first place.


Ruby v1.3.0 Sep 16th 2020 (+2627/-258)

~~In developing aligned AI, one acting for our best interests, we would have to take care~~Coherent Extrapolated Volition was a term developed by Eliezer Yudkowsky while discussing Friendly AI development. It’s meant as an argument that it would ~~have implemented, from the beginning,~~not be sufficient to explicitly program what we think our desires and motivations are into an AI, instead, we should find a ~~coherent extrapolated volition~~way to program it in a way that it would act in our best interests – what we want it to do and not what we tell it to.

In calculating CEV, an AI would predict what an idealized version of us would want, "if we knew more, thought faster, were more the people we wished we were, had grown up farther together". It would recursively iterate this prediction for humanity as a whole, and determine the desires which converge. This initial dynamic would be used to generate the AI's utility function.

CEV is also often used more generally to refer to what the idealized version of a person would want, separate from the context of building aligned AIs.

What is volition?

As an example of the classical concept of volition, the author develops a simple thought experiment: imagine you’re facing two boxes, A and B. One of these boxes, and only one, has a diamond in it – box B. You are now asked to make a guess, whether to chose box A or B, and you chose to open box A. It was your decision to take box A, but your volition was to choose box B, since you wanted the diamond in the first place.

Now imagine someone else – Fred – is faced with the same task and you want to help him in his decision by giving the box he chose, box A. Since you know where the diamond is, simply handling him the box isn’t helping. As such, you mentally extrapolate a volition for Fred, based on a version of him that knows where the diamond is, and imagine he actually wants box B.

Coherent Extrapolated Volition

In developing friendly AI, one acting for our best interests, we would have to take care that it would have implemented, from the beginning, a coherent extrapolated volition of humankind. In calculating CEV, an AI would predict what an idealized version of us would want, "if we knew more, thought faster, were more the people we wished we were, had grown up farther together". It would recursively iterate this prediction for humanity as a whole, and determine the desires which converge. This initial dynamic would be used to generate the AI's utility function.

The main problems with CEV include, firstly, the great difficulty...

Volition

As an example of the classical concept of volition, the author develops a simple thought experiment: imagine you’re facing two boxes, A and B. One of these boxes, and only one, has a diamond in it – box B. You are now asked to make a guess, whether to chose box A or B, and you chose to open box A. It was your ~~decision to take box A, but your volition was to choose box B, since you wanted the diamond in the first place.~~

Coherent Extrapolated Volition

In developing ~~friendly~~aligned AI, one acting for our best interests, we would have to take care that it would have implemented, from the beginning, a coherent extrapolated volition of humankind. In calculating CEV, an AI would predict what an idealized version of us would want, "if we knew more, thought faster, were more the people we wished we were, had grown up farther together". It would recursively iterate this prediction for humanity as a whole, and determine the desires which converge. This initial dynamic would be used to generate the AI's utility function.

The main problems with CEV include, firstly, the great difficulty of implementing such a program - “If one attempted to write an ordinary computer program using ordinary computer programming skills, the task would be a thousand lightyears beyond hopeless.” Secondly, the possibility that human values may not converge. Yudkowsky considered CEV obsolete almost immediately after its publication in 2004. He states that there's a "principled distinction between discussing CEV as an initial dynamic of Friendliness, and discussing CEV as a Nice Place to Live" and his essay was essentially conflating the two definitions.
