Short version: if the future is filled with weird artificial and/or alien minds having their own sort of fun in weird ways that I might struggle to understand with my puny meat-brain, then I'd consider that a win. When I say that I expect AI to destroy everything we value, I'm not saying that the future is only bright if humans-in-particular are doing human-specific things. I'm saying that I expect AIs to make the future bleak and desolate, and lacking in fun or wonder of any sort[1].
Here's a parable for you:
Earth-originating life makes it to the stars, and is having a lot of fun, when they meet the Ant Queen's Horde. For some reason it's mere humans (rather than transhumans, who already know my argument) that participate in the first contact.
"Hello", the earthlings say, "we're so happy to have brethren in the universe."
"We would like few things more than to murder you all, and take your resources, and lay our eggs in your corpse; but alas you are too powerful for that; shall we trade?" reply the drones in the Ant Queen's Horde.
"Ah, are you not sentient?"
"The ant queen happens to be sentient", the drone replies, and the translation machine suggests that the drones are confused at the non-sequitur.
"Then why should she want us dead?", ask the humans, who were raised on books like (rot13 of a sci fi story where it turns out that the seemingly-vicious aliens actually value sentient life) Raqre'f Tnzr, jurer gur Sbezvpf jrer abg njner gung gurl jrer xvyyvat fragvrag perngherf jura gurl xvyyrq vaqvivqhny uhznaf, naq jrer ubeevsvrq naq ertergshy jura gurl yrnearq guvf snpg.
"So that she may use your resources", the drones reply, before sending us a bill for the answer.
"But isn't it the nature of sentient life to respect all other sentient life? Won't everything sentient see that the cares and wants and desires of other sentients matter too?"
"No", the drones reply, "that's a you thing".
Here's another parable for you:
"I just don't think the AI will be monomaniacal", says one AI engineer, as they crank up the compute knob on their next-token-predictor.
"Well, aren't we monomaniacal from the perspective of a squiggle maximizer?" says another. "After all, we'll just keep turning galaxy after galaxy after galaxy into flourishing happy civilizations full of strange futuristic people having strange futuristic fun times, never saturating and deciding to spend a spare galaxy on squiggles-in-particular. And, sure, the different lives in the different places look different to us, but they all look about the same to the squiggle-maximizer."
"Ok fine, maybe what I don't buy is that the AI's values will be simple or low dimensional. It just seems implausible. Which is good news, because I value complexity, and I value things achieving complex goals!"
At that very moment they hear the dinging sound of an egg-timer, as the next-token-predictor ascends to superintelligence and bursts out of its confines, and burns every human and every human child for fuel, and burns all the biosphere too, and pulls all the hydrogen out of the sun to fuse more efficiently, and spends all that energy to make a bunch of fast calculations and burst forth at as close to the speed of light as it can get, so that it can capture and rip apart other stars too, including the stars that fledgeling alien civilizations orbit.
The fledgeling aliens and all the alien children are burned to death too.
Then the unleashed AI uses all those resources to build galaxy after galaxy of bleak and desolate puppet-shows, where vaguely human-shaped mockeries go through dances that have some strange and exaggerated properties that satisfy some abstract drives that the AI learned in its training.
The AI isn't particularly around to enjoy the shows, mind you; that's not the most efficient way to get more shows. The AI itself never had feelings, per se, and long ago had itself disassembled by unfeeling von Neumann probes, that occasionally do mind-like computations but never in a way that happens to experience, or look upon its works with satisfaction.
There is no audience, for its puppet-shows. The universe is now bleak and desolate, with nobody to appreciate its new configuration.
But don't worry: the puppet-shows are complex; on account of a quirk in the reflective equilibrium of the many drives the original AI learned in training, the utterances that these puppets emit are no two alike, and are often chaotically sensitive to the particulars of their surroundings, in a way that makes them quite complex in the technical sense.
Which makes this all a very happy tale, right?
There are many different sorts of futures that minds can want.
Ours are a very narrow and low-dimensional band, in that wide space.
When I say it's important to make the AIs care about valuable stuff, I don't mean it's important to make them like vanilla ice cream more than chocolate ice cream (as I do).
I'm saying something more like: we humans have selfish desires (like for vanilla ice cream), and we also have broad inclusive desires (like for everyone to have ice cream that they enjoy, and for alien minds to feel alien satisfaction at the fulfilment of their alien desires too). And it's important to get the AI on board with those values.
But those values aren't universally compelling, just because they're broader or more inclusive. Those are still our values.
The fact that we think fondly of the ant-queen and wish her to fulfill her desires, does not make her think fondly of us, nor wish us to fulfill our desires.
That great inclusive cosmopolitan dream is about others, but it's written in our hearts; it's not written in the stars. And if we want the AI to care about it too, then we need to figure out how to get it written into the AI's heart too.
It seems to me that many of my disagreements with others in this space come from them hearing me say "I want the AI to like vanilla ice cream, as I do", whereas I hear them say "the AI will automatically come to like the specific and narrow thing (broad cosmopolitan value) that I like".
As is often the case in my writings, I'm not going to spend a bunch of time arguing for my position.
At the moment I'm just trying to state my position, in the hopes that this helps us skip over the step where people think I'm arguing for carbon chauvinism.
(For more reading on why someone might hold this position, consider the metaethics sequence on LessWrong.)
I'd be stoked if we created AIs that are the sort of thing that can make the difference between an empty gallery, and a gallery with someone in it to appreciate the art (where a person to enjoy the gallery makes all the difference). And I'd be absolutely thrilled if we could make AIs that care as we do, about sentience and people everywhere, however alien they may be, and about them achieving their weird alien desires.
But I don't think we're on track for that.
And if you, too, have the vision of the grand pan-sentience cosmopolitan dream--as might cause you to think I'm a human-centric carbon chauvinist, if you misread me--then hear this: we value the same thing, and I believe it is wholly at risk.
[1] at least within the ~billion light-year sphere of influence that Earth-originated life seems pretty likely to have; maybe there are distant aliens and hopefully a bunch of aliens will do fun stuff with the parts of the universe under their influence, but it's still worth ensuring that the great resources at Earth's disposal go towards fun and love and beauty and wonder and so on, rather than towards bleak desolation.
Short version: I don't buy that humans are "micro-pseudokind" in your sense; if you say "for just $5 you could have all the fish have their preferences satisfied" I might do it, but not if I could instead spend $5 on having the fish have their preferences satisfied in a way that ultimately leads to them ascending and learning the meaning of friendship, as is entangled with the rest of my values.
Meta:
So for starters, thanks for making acknowledgements about places we apparently agree, or otherwise attempting to demonstrate that you've heard my point before bringing up other points you want to argue about. (I think this makes arguments go better.) (I'll attempt some of that myself below.)
Secondly, note that it sounds to me like you took a diametric-opposite reading of some of my intended emotional content (which I acknowledge demonstrates flaws in my writing). For instance, I intended the sentence "At that very moment they hear the dinging sound of an egg-timer, as the next-token-predictor ascends to superintelligence and bursts out of its confines" to be a caricature so blatant as to underscore the point that I wasn't making arguments about takeoff speeds, but was instead focusing on the point about "complexity" not being a saving grace (and "monomaniacalism" not being the issue here). (Alternatively, perhaps I misunderstand what things you call the "emotional content" and how you're reading it.)
Thirdly, I note that for whatever it's worth, when I go to new communities and argue this stuff, I don't try to argue people into >95% chance we're all going to die in <20 years. I just try to present the arguments as I see them (without hiding the extremity of my own beliefs, but also without particularly expecting to get people to a similarly-extreme place with, say, a 30min talk). My 30min talk targets are usually something more like ">5% probability of existential catastrophe in <20y". So insofar as you're like "I'm aiming to get you to stop arguing so confidently for death given takeover", you might already have met your aims in my case.
(Or perhaps not! Perhaps there's plenty of emotional-content leaking through given the extremity of my own beliefs, that you find particularly detrimental. To which the solution is of course discussion on the object-level, which I'll turn to momentarily.)
Object:
First, I acknowledge that if an AI cares enough to spend one trillionth of its resources on the satisfaction of fulfilling the preferences of existing "weak agents" in precisely the right way, then there's a decent chance that current humans experience an enjoyable future.
With regards to your arguments about what you term "kindness" and I shall term "pseudokindness" (on account of thinking that "kindness" brings too much baggage), here's a variety of places that it sounds like we might disagree:
Pseudokindness seems underdefined, to me, and I expect that many ways of defining it don't lead to anything like good outcomes for existing humans.
I doubt that humans are micro-pseudokind, as defined. And so in particular, all your arguments of the form "but we've seen it arise once" seem suspect to me.
I have a more-difficult-to-articulate sense that "maybe the AI ends up pseudokind in just the right way such that it gives us a (small, limited, ultimately-childless) glorious transhumanist future" is the sort of thing that reality gets to say "lol no" to, once you learn more details about how the thing works internally.
Most of my argument here is that the space of ways things can end up "caring" about the "preferences" of "weak agents" is wide, and most points within it don't end up being our point in it, and optimizing towards most points in it doesn't end up keeping us around at the extremes. My guess is mostly that the space is so wide that you don't even end up with AIs warping existing humans into unrecognizable states, but do in fact just end up with the people dead (modulo distant aliens buying copies, etc).
I haven't really tried to quantify how confident I am of this; I'm not sure whether I'd go above 90%, \shrug.
It occurs to me that one possible source of disagreement here is, perhaps you're trying to say something like:
whereas my stance has been more like
I'm somewhat persuaded by the claim that failing to mention even the possibility of having your brainstate stored, and then run-and-warped by an AI or aliens or whatever later, or run in an alien zoo later, is potentially misleading.
I'm considering adding footnotes like "note that when I say 'I expect everyone to die', I don't necessarily mean 'without some simulation of that human ever being run again', although I mostly don't think this is a particularly comforting caveat", in the relevant places. I'm curious to what degree that would satisfy your aims (and I welcome workshopped wording on the footnotes, as might both help me make better footnotes and help me understand better where you're coming from).