CEV is (imo) a good concept for what we ultimately want and/or what humanity should try to achieve, but I've always found it hard to talk pithily about the intermediate world states we should aim for now if we want CEV eventually.
I've heard the practical goal discussed as "a world from which we can be as sure as possible that we will achieve CEV". Doesn't really roll off the tongue. It would be nice to have a cleaner shorthand.
The term "viatopia" seems meant to capture the same idea: https://newsletter.forethought.org/p/viatopia
This also seems like the sort of thing Bostrom might have coined a term for fifteen years ago in some obscure paper.
I'd be interested in hearing any other terms or phrases that make it easier to talk about an intermediate goal state from which CEV is very likely (or as likely as possible).
The two important conversations I'd like to be able to have are "what are the features of a realistic <state>?" and "how can we achieve <state>?" with participants having a shared understanding of what we're talking about with <state>.
bostrom uses "existential security" to refer to this intermediate goal state IIRC -- referring to a state where civilization is no longer facing significant risk of extinction or things like stable totalitarianism. this phrase connotes sort of a chill, minimum-viable utopia (just stop people from engineering super-smallpox and everything else stays the same, m'kay?), but I wonder if actual "existential security" might be essentially equivalent to locking in a very specific and as-yet-undiscovered form of governance conducive to suppressing certain dangerous technologies without falling into broader anti-tech stagnation, avoiding various dangers of totalitarianism and fanaticism, etc... https://forum.effectivealtruism.org/posts/NpYjajbCeLmjMRGvZ/human-empowerment-versus-the-longtermist-imperium
yudkowsky might have had a term (perhaps in his fun-theory sequence?) referring to a kind of intermediate utopia where humanity has covered "the basics" of things like existential security plus also some obvious moral goods like individual people no longer die + extreme suffering has been abolished + some basic level of intelligence enhancement for everybody + etc
some people talk about the "long reflection" which is similar to the concept of viatopia, albeit with more of a "pause everything" vibe that seems less practical for a bunch of reasons
it seems like it would be pretty useful for somebody to be thinking ahead about the detailed mechanics of different idealization processes (since maybe such processes do not "converge", and doing things in a slightly different way / slightly different order might send you to very different ultimate destinations: https://joecarlsmith.com/2021/06/21/on-the-limits-of-idealized-values), even though this is probably not super tractable until it becomes clearer what kinds of "idealization technologies" will actually exist when, and what their possible uses will be (brain-computer interfaces, nootropic drugs or genetic enhancement procedures, AI advisors, "Jhourney"-esque spiritual-attainment-assistance technologies, improved collective decisionmaking technologies / institutions, etc)
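To make the non-convergence worry concrete, here's a minimal toy sketch (my own construction, nothing from Carlsmith's post): two made-up "idealization steps" that don't commute, so the same starting level of endorsement of a value settles into very different equilibria depending only on the order in which the steps are applied.

```python
# Toy model: an "idealization process" as repeated application of two
# hypothetical steps to one's degree of endorsement of some value (a number
# in [0, 1]). The steps don't commute, so the order of application alone
# determines which equilibrium you end up in.

def reflect(v: float) -> float:
    """Skeptical-scrutiny step: shaves a fixed amount off the endorsement."""
    return max(0.0, v - 0.25)

def enhance(v: float) -> float:
    """Amplification step: scales the endorsement up, capped at 1."""
    return min(1.0, 1.5 * v)

def idealize(v: float, schedule, rounds: int = 100) -> float:
    """Apply the schedule of steps repeatedly until the value settles."""
    for _ in range(rounds):
        for step in schedule:
            v = step(v)
    return round(v, 3)

v0 = 0.6  # initial endorsement of some value
print(idealize(v0, [enhance, reflect]))  # -> 0.75: the value survives reflection
print(idealize(v0, [reflect, enhance]))  # -> 0.0:  the same value is extinguished
```

Obviously real idealization won't look like iterating two fixed maps, but even this toy has an unstable threshold, so a small difference in ordering (or in the starting point) flips which attractor you land in.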
CEV is not meant to depend on the state of human society. It is supposed to be derived from "human nature", e.g. the genetically determined needs, dispositions, norms, and so forth that are characteristic of our species as a whole. The quality of the extrapolation process is what matters, not the social initial conditions. You could be in "viatopia", and if your extrapolation theory is wrong, the output will be wrong. Conversely, you could be in a severe dystopia, and so long as you have the biological facts and the extrapolation method correct, you're supposed to arrive at the right answer.
I have previously made the related point that the outcome of CEV should not be different, whether you start with a saint or a sinner. So long as the person in question is normal Homo sapiens, that's supposed to be enough.
Similarly, CEV is not supposed to be about identifying and reconciling all the random things that the people of the world may want at any given time. It is supposed to identify a value system or decision procedure which is the abstract kernel of how the smarter and better informed version of the human race would want important decisions to be made, regardless of the details of circumstance.
This is, I argue, all consistent with the original intent of CEV. The problem is that neither the relevant facts defining human nature, nor the extrapolation procedure, are known or specified with any rigor. If we look at the broader realm of possible Value Extrapolation Procedures, there are definitely some "VEPs" in which the outcome depends crucially on the state of society, the individuals who are your prototypes, and/or even the whims of those individuals at the moment of extrapolation.
Furthermore, it is likely that individual genotypic variation, and also the state of culture, really can affect the outcome, even if you have identified the "right" VEP. Culture can impact human nature significantly, and so can genetic variation.
I think it's probably for the best that the original manifesto for CEV was expressed in these idealistic terms - that it was about extrapolating a universal human nature. But if "CEV theory" is ever to get anywhere, it must be able to deal with all these concrete questions.
(For examples of CEV-like alignment proposals that include dependence on neurobiological facts, see PRISM and metaethical.ai.)
I used to worry that people taking life extension seriously, particularly around these parts, was a bad thing for AI risk. If the people working to build AI believe that death is very bad, that death is technically solvable, that they and their loved ones will die by default, and that building superintelligence in their lifetime is their best shot at longevity escape velocity, then they have a strong incentive to move as quickly as possible. Getting AI researchers to believe more strongly in any of these four ideas has always seemed like a dubious plan at best.
Recently I've changed my mind somewhat, and I now think that longevity research and life extension as an ideology might end up being important for efforts to slow down AI development.
If life extension looks promising and is taken seriously, relevant decision makers might be more willing to slow down or pause AI -- both because they personally face less time pressure and because if they think they're going to be around for a while, they have more of a personal stake in not having the future end soon.
Two ways this could become relevant:
(1) We see real progress in life extension research. There are some exciting things happening (IMO) in anti-aging research today, and there's massively more popular interest (hello Bryan Johnson) and funding available. This research makes me optimistic that we can get radical anti-aging tech before superintelligence, even absent magical AI breakthroughs.
(2) In messaging a pause or slowdown of AI research, there is a specific exception made for certain kinds of medical and life-extension related research.
There's an obvious tradeoff in (2). We can't simply specify "we will only use AI for X". It's not possible to guarantee that any medical research we do will not contribute at all to general superintelligence research.
However, "pause except we still do AI-based life-extension research" might end up being much more palatable than "pause". And if that makes pause much more politically viable, it might be worth the risk.
This would require relevant parties to believe that serious anti-aging technology is possible by means of current research and whitelisted narrow AI. That in turn might mean that proselytizing for life extension is in fact one of the more useful things we can do for the development of safe AI.
(I'm assuming this is not new ground, but I haven't read anything on exactly this topic, at least not that I can remember and not in a long time. Links appreciated!)