
Part 4 of AI, Alignment, and Ethics. This will probably make more sense if you start with Part 1.

TL;DR In Parts 1 and 2 I argued that aligned AIs will not want and should not get assigned moral worth or the sorts of rights that human citizens get. In Part 3 I suggested that uploads should (with some specifics and rather strong constraints). Next I'd like to suggest a guiding principle for the largest group who should get these things.

A society can use any criteria it likes for membership/citizenship. The human moral intuition for fairness only really extends to "members of the same primate troupe as me". Modern high-tech societies have economic and stability incentives to extend this to every human in the entire global economic system, and the obvious limiting case of that is the entire planet, or, once we go interplanetary, the entire solar system.

However, there is a concern that obviously arbitrary rules for membership in a society might not be stable under challenges like self-reflection or self-improvement by an advanced AI. One might like to think that if someone attempted to instill in all the AIs the rule that the criterion for being a citizen with rights was, say, descent from William Rockefeller Sr. (and hadn't actually installed this as part of a terminal goal, just injected it into the AI's human-value-learning process with a high prior), then sooner or later a sufficiently smart AI might tell them "that's extremely convenient for your ruling dynasty, but it doesn't fit the rest of human values, or history, or jurisprudence, or biology. So no."

So it would be nice to have a criterion that makes some logical sense. Not necessarily a "True Name" of citizenship, but at least a solid rationally-defensible position with as little wiggle room as possible.

I'd like to propose what I think is one: "an intelligent agent should be assigned moral worth only if it is (or primarily is, or is a functionally-equivalent very-high-accuracy emulation of) a member of a sapient species whose drives were produced by natural selection. (This moral worth may vary if its drives or capabilities have been significantly modified from their evolved versions, details TBD.)"
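For anyone who finds a toy formalization helpful, here is a minimal sketch of the structure of that criterion, written as a Python predicate. It is purely illustrative (not a proposal to actually implement ethics as code), and the AgentProfile fields and the assigned_moral_worth function are hypothetical names of my own, not anything defined earlier in this sequence.

```python
from dataclasses import dataclass

@dataclass
class AgentProfile:
    """Toy description of an intelligent agent, for illustration only."""
    evolved_sapient: bool                # member of a sapient species whose drives were produced by natural selection
    primarily_evolved_sapient: bool      # mostly such a member (say, a lightly-augmented cyborg)
    high_fidelity_emulation: bool        # a functionally-equivalent very-high-accuracy emulation (an upload)
    drives_significantly_modified: bool  # evolved drives/capabilities substantially altered

def assigned_moral_worth(agent: AgentProfile) -> str:
    """Return 'full', 'qualified' (details TBD), or 'none' under the proposed criterion."""
    qualifies = (agent.evolved_sapient
                 or agent.primarily_evolved_sapient
                 or agent.high_fidelity_emulation)
    if not qualifies:
        return "none"       # e.g. an aligned AI, per Parts 1 and 2
    if agent.drives_significantly_modified:
        return "qualified"  # moral worth may vary; details TBD
    return "full"
```

On this sketch, an upload whose drives haven't been tampered with gets "full", while an aligned AI that is none of these things gets "none".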

The political argument defending this is as follows:

Living organisms have homeostasis mechanisms: they seek to maintain aspects of their bodies and environment in certain states, even when (as is often the case) those are not thermodynamic equilibria. Unlike something weakly agentic like a thermostat, they are self-propagating complex dynamic systems, and natural selection ensures that the equilibria they maintain are ones important to that process: they're not arbitrary, easily modified, or externally imposed, as a thermostat's are. If you disturb any of these equilibria they suffer, and if you disturb them too much, they die. ("Suffer" and "die" here should be regarded as technical terms in Biology, not as moral terms.) So the things that evolved organisms care about are things that genuinely matter to them, not something that can be easily redesigned any way you like. Living things have a lot of interesting properties (which is why Biology is a separate scientific field): for example, they're complex, self-sustaining, dynamic processes shaped by evolutionary design algorithms. Also, humans generally think they're neat (at least unless the organism is prone to causing humans suffering).

The word 'sapient' is doing a lot of work in that definition, and it's not currently a very well-defined scientific term. A short version of the definition that I mean here might be "having the same important social/technological properties that on Earth are currently unique to Homo sapiens, but are not inherently so". A more detailed definition would be "a species with the potential capability to transmit a lot more information from one generation to the next by cultural means than just by genetic means". This is basically the necessary requirement for a species to become technological. A species that hasn't yet developed technology, but already has this capability, still deserves moral worth. For comparison, we've tried teaching human (sign) languages to chimps, gorillas, and even dogs, and while they're not that bad at this, they clearly lack the level of mental/linguistic/social capacity required to use it for significant amounts of cultural transmission.

It's clear from the fossil and historical record that Homo sapiens' technological base has been advancing pretty continuously, on what looks like a J-shaped superexponential curve, since somewhere around the origin of our species, ~300,000 years ago. In contrast, during the ~500,000 years that Neanderthals were around, their stone-tool-making technology didn't perceptibly advance: tools from half a million years apart are functionally equivalent and basically indistinguishable. So their craft skills were clearly constrained by some form of biologically-imposed transmission bottleneck, and were limited at that capacity the whole time. Our technology, on the other hand, has ramped all the way past nuclear power, space travel, and artificial intelligence, way outside the ecological niche we evolved in, and we still haven't hit any clear capacity limits. So while there have been arguments among paleontologists about whether Homo sapiens and Homo neanderthalensis should be treated as just different subspecies of the same species, since they could, and to a non-trivial extent did, interbreed, I regard this change in transmission capacity as a speciation event worthy of a whole new Linnaean species name (in fact, I might even argue it's worth a new genus). It is not a coincidence that within 300,000 years of our species appearing with this unprecedented capacity, we're sufficiently dominant in every land ecosystem on the planet that we're causing a mass-extinction event, and geologists are now calling this the Anthropocene.

We don't really understand why this dramatic, sudden capability breakthrough is possible. (My personal theory is that it's likely to be a phenomenon analogous to Turing completeness, arising in some combination of the things involved in the social transmission of skills between generations: perhaps a capability to handle a richer language syntax/vocabulary, and perhaps also better mirror neurons.) Nevertheless, it's a striking phenomenon, which so far has happened on Earth only once. However, there's no obvious reason not to expect it to be feasible for other (likely social) intelligent species, such as ones on other planets.

So, why should sapience be a criterion for citizenship/full moral worth? Because if a species has it, and you don't grant them citizenship (and also don't render them extinct), sooner or later they will complain about this, likely using weapons of mass destruction. So if you meet a sapient alien species, take a good look at them and figure out whether you can cooperate with them successfully, because there are (presumably) only two eventual options: friendly cooperation, or a war of annihilation. This is of course a "might makes right" argument, but it fits well with my criterion 1 for ethical systems given in A Sense of Fairness: design ethical systems for societies that aren't going to lead to nuclear war or similar x-risks.

Also, clearly, this fits fairly well with human values. Obviously we think that we should have rights/votes/moral worth. Evolved sapient beings are about the largest set to which I think we might be able to safely expand this. [In Part 5, I'll discuss why I think it would be very difficult and x-risky to expand this further, to all sentient living creatures.]

My use of the term "chauvinism" in the title of this post was intentionally self-deprecatory humor (a play on carbon-chauvinism from Part 1 and Part 3); however, now that I'm seriously proposing building a society where this is a fundamental moral principle, we probably need a less tongue-in-cheek-pejorative name for it. How about calling it the principle of "evolved-sapient moral community"? Or, if you prefer a shorter neologism, 'sapient-biovalorism'?
