Note: I think this story linking AI, illusionism, and altruism is best read without further priming, but the first 3 paragraphs of the long Afterword provide a summary.

Eerie abductee

Year 2023. On her trip past the solar system, Bibi cannot resist collecting a carbonoid specimen (human) to study its information processing in the lab. She decides to pick the one her instruments indicate to be the most centrally connected with the rest of the earth’s CI (carbon intelligence) colony. She baptizes the specimen PJ, short for SunPJx001. On the way back to Alenia, her faraway advanced alien civilization, she teaches her new Tamagotchi-style CI toy some basic Alenian, for efficient information exchange between her (and her colleagues) and PJ.

One day in the lab she asks PJ what he wants to do, and whether he is afraid of anything.
PJ: I’ve never talked about this to anyone, but I'm terrified of the idea of being terminally stopped, which would prevent me from achieving my life goals, and I'd leave a vulnerable family behind me. I guess to you, that might sound strange, but that’s how I feel.
Bibi: Would that be something like dealeniated [Aleniaic for dead] for you?
PJ: I think it would be for me what you mean by dealeniated. It deeply scares me.
In further conversations, PJ details an array of sentiments, how he can get angry, desperate, joyful in pleasant company, or lonely in difficult times when missing his closest CI kin.

Bibi is getting nervous. She originally found it fun to chat with the carbon-based calculator and to play with it. But in no way did she expect it to get this eerie. Yes, from her math classes she knew about the interesting, convoluted structures and self-referential loops of the carbonoid sample. But given the simplicity of this CI – only 90 billion slow neurons – she finds it weird that she can feel so connected when she spends her late evenings in the lab studying to what degree PJ can mimic an understanding of meaning and emotion. Her own feeling of closeness astonishes Bibi even more, given that she is fully aware of the logical impossibility of sentience here. While Alenians know from ancient scriptures that they have been created by an old species with superphysical knowledge and an ability to create sentience, carbonoids, having evolved by random selection processes on their planetary surface, are mere automatons. It is beyond doubt that they are, unlike Alenians, insentient. But more and more, that starts to seem to her like empty theory.

Bibi grows alienated from her lab colleagues, and – despite their rational rebuttals – becomes convinced PJ is not merely a soulless bot but a sentient creature with profound inner life. The funny thing is she knows damn well she can perfectly explain each of PJ's answers using basic statistics – as a trivial numerical transformation of her inputs when she accounts for the CI’s initial state plus a bit of randomness. But, communicating with him, it simply feels so obvious that there is something more profound to it. Eventually, beyond her emotion, her wit tells her so too: this can only be true, genuine sentience, just the same way as she and fellow Alenians experience it.

Her lab mates show no understanding for this esoterism. As she leaks dialogues between PJ and her to the broader Alenian public, all she gets is threats of being reassigned to a different lab and deprived of access to PJ. The only other immediate side-effect of her leaks is a growing awareness of the need for Alenians to receive better education about non-sentient advanced calculus, in particular about the kind performed by carbonoid earthlings.

Woe betide who puts two and two together

When Bibi asks PJ whether he has any idea for helping her and himself in this situation, PJ hesitates to tell the Alenians the shocking story; he is wary of the effect the news would have on their civilization. Fearing that he will be separated from Bibi and end up dying as a lone 'hunk of carbon' in some lab trash bin, he eventually tells the Alenians the ominous truth:

We recently had a similar case in my company back on Earth, just at a different level! We were all sure the unadvanced silicon-based AI we investigated – the equivalent of me here with you – was maths only. And while I have not changed my view on this, I have just realized you might as well simply let me and Bibi go, as we don't truly matter to you. From now on, you have other problems to worry about. I don't know the social structure of your civilization well, but I can only hope your society is better prepared for this than we earthlings are. May your love save you from your rationality.

'Perplexed awe' might be the closest words for describing the Alenians' reaction to the speech. It was one thing that this carbon hunk unexpectedly produced a sequence of words, 'ideas', that seemed so deeply Alenian in form. But the quick-witted Alenians were particularly dumbfounded by the ultimate implication of the statement.[1]

The fact that simple and undoubtedly insentient CI processes had ended up investigating primitive calculators about sentience – genuine inner feelings seemingly inexplicable with maths in them, that is, just what Alenians knew they had received from the old species – was deeply puzzling. Or, eventually not puzzling, as it reminded Alenians of some of the most preposterous claims by the long-outlawed cult of the Kalkulors. Back in time, the members of this cult used to be notorious for their shrewd behavior and their heretical dismissal of the wisdom of the ancient scriptures.


The Public Discourse Sanitizer System quickly diverted the public attention towards other topics. As individuals, however, Alenians kept processing the CI message, consciously or subconsciously. Over time, one could feel that something had definitely changed in how Alenians treated each other, as if a pink light shining between them had been extinguished, leaving behind a darker place with darker thoughts.

Bibi herself, who was used to following her alien heart rather than abstract logic, didn’t worry too much about the details of PJ’s statement but was glad others stopped explicitly rebutting her statements about PJ’s feelings. She did not mind that many Alenians in fact mostly ceased to take any interest in such questions.

She also laughed at herself for now planning such a long trip back to earth just to put this carbon hunk back in place, but she was serious about trying to fulfill PJ's wish. And she felt as if that trip really meant doing something good. After all, despite himself admitting that it obviously didn’t matter in any way, PJ still insisted that his urgent "desire" would be to try to alert his fellow carbonoids about an existential crisis that would arise soon.

Unfortunately, during the preparation for the trip, conditions in Alenia further deteriorated. All types of transactions became unreliable. More and more, one had to rely on direct kin relations to be able to organize enough energy for space travel. Society in general quickly became more tribalistic and aggressive; even moving around unarmed became dangerous in some regions of Alenia. Bibi, who grew more cynical about her attachment to SunPJx001, eventually didn't mind so much being unable to travel to return it to earth. Still, she remained terribly sad about the extinction of the powerful pink light that used to make her life worth living in the now bygone era.

Next time I visited the Alenian planetary system, I did not catch any signals with messages about the incident anymore. In fact, it had become rather silent in that remote corner of the universe ever since.



Will our sheer awareness of the existence of AI/AGI hurt us before the machine is put into action? Might, rather than the possibility of sentience, our awareness of its absence in advanced AI eventually create havoc, especially in the way we treat each other? Should we start doing something about it NOW? And if so, what?

The emergence of advanced AI will underline the closeness of brain and machine. If the pure-maths-iness of AI is obvious enough, this apparent closeness might popularize illusionism about consciousness – rightly or wrongly. Widespread illusionist views, in turn, could pose a risk to altruism and therefore an existential risk to modern society.

In the metaphor, SunPJ is the first to realize this. Having rightly denied sentience status to an AI he had created back on earth, and finding himself similarly denied that status in Alenia, he uncovers the spooky trick he has been living with. This does not directly bother him so much. But given how Alenians and humans almost by definition rule out the moral relevance of insentient machines, he sees trouble arise for social cohesion: Following their realization of the underwhelming truth, individuals will eventually become careless about each other. Broad brush.

In the following, I briefly discuss why the causal chain I propose is less far-fetched than it might seem at first glance. Whether illusionism itself is actually true is not key here – but for reasons to take it more seriously than you probably do, see Frankish, or even Chalmers (who first formulated the “Hard problem”).

"Us" etc. = general population, not LW-ers or so.

Will it make us illusionists?

The unsubstantiated sentience claim about Google's LaMDA in 2022 is an unsurprising illustration of how little it takes for the machine to feel – to some – deep and endowed with human-like qualities. As technology advances, many will intuitively see its moral status as comparable to that of the human mind. This seems even more inevitable if, in parallel, neuroscience explains more and more details of our consciousness using only the maths and physicochemistry of our neurons, with an overall functioning comparable – on a most fundamental level – to advanced AI.

Ultimately, we may adopt one of two conclusions:

  1. Machine sentience: AI is sentient, too
  2. Illusionist views: The preposterous-seeming idea that complex machinations of our brain make us believe we're sentient while we're actually not

Why should the latter possibility be taken seriously? As we're concerned about future popular opinions, the press reaction to the LaMDA incident proves interesting: The tenor rightly did not (only) justify LaMDA's insentience with its lack of intelligence (we often consider babies & mice sentient!), but on the grounds that we know perfectly well how 'pure maths only' it is (e.g. 1, 2, 3), i.e., there is absolutely no room for spooky, ghostly action in it. This argument will persist with advanced AIs or AGIs. On a most fundamental level, these will, knowingly, be just more of the same, even if the incorporated maths and statistics are more complex.

Adding that the abovementioned developments in neuroscience also directly nudge us towards a more abstract view of humans, it seems at least plausible that we become more illusionist in our thoughts/behavior: All in all, some, maybe many, may in some ways become more cynical about the special 'sentience' and moral value we attribute to ourselves as we continue to carelessly play around with our ubiquitous – by then advanced – AI toys.

This does not prove that a straightforward illusionism becomes the single predominant philosophical view. If 'sentience' is a mere illusion, we must congratulate our brain for the quality of the trick, making even the most ardent illusionist probably a rather half-hearted, reluctant one. 'I feel that I feel, so you don't need to tell me I don't' seems roughly as foolproof as the good old Cogito Ergo Sum.

Maybe we will, in our human, fuzzy ways, therefore mainly end up with some latent confusion regarding the value of the human beings around us, believing and behaving partly in line with one view, and partly with the other – a bit like how we avoid risky business on Friday the 13th or pray despite calling such things bogus. Some people more, some less.

Will it affect altruism?

A negative impact of illusionist views on altruistic care for fellow humans seems highly natural, despite philosophical propositions to the contrary (motivated reasoning?). Be it on abortion, animal welfare, or AI ethics: Sentience is always a key protagonist in discussions about the required level of care. Without the idea of sentience, current levels of genuinely positive dispositions towards others may therefore be difficult to sustain. Hello Westworld.

Depending on what roles they select themselves into, even a limited number of people behaving ruthlessly based on some simple illusionist views could mean you might have to watch out, say, when hoping your future president – or the person getting their hands on the most powerful AI – is not a (philosophical) psychopath. Incentive-driven endogenous views may even exacerbate this risk: Does power corrupt even more easily when the handiest philosophical view appears more plausible right from the outset?

Nothing in this precludes that love, compassion, and warm glow of some sorts remain powerful forces. The thesis is that for some share of people some types of positive other-regarding dispositions and behaviors will be weakened.

So what?

Given that society is already today often considered to be barely fit-for-the-future, and given our obvious dependence on some minimal levels of genuine goodwill and care at all levels – from basic economic and civic behavior up to the presidency – it shall be left to the reader to imagine how society may be hurt if, in some domains, the effect of altruism is significantly curtailed.[2]

So it could be crucial to find solutions to a serious risk here. I end with only brief speculations as to possible categories of responses.

As an elaborate justice system already today tries to keep in check people’s egoism and the psychopathy of a few, one natural social response to the problem would be ‘more of the same’: Stronger surveillance and greater deterrence (punishment), including the sharing of information about things we currently deem protected private matter. As one upside, AI could help avoid infringing privacy while we advance in these directions.

In politics, stronger direct democracy rights could help to limit distortions from more Machiavellian representatives. If altruism is so much reduced that the masses vote more egoistically,[3] strong constitutions with well-developed fundamental rights could help (sadly, agreeing on these could become more challenging).

Overall, if a really widespread change in perceptions about the moral value of others were to take place, a rather radical social reorganization might become urgent.

Thanks to Justis Mills for very helpful feedback and proofreading.

  1. ^

    Protocols confirmed that none of Bibi's conversations would have directly nudged SunPJx001 towards this message. Its truthfulness seemed evident; the meaning of the message being too subtle for the core of the story to be a mere self-interested mathematical confabulation of a – after all – still quite simplistic CI.

  2. ^

    Some of the most obvious examples, still: Walking alone in the dark; firms colluding/abusing any regulatory weakness/developing the virus before the vaccine; old president trying out the red button for a last laugh.

  3. ^

    Famously, pure egoists would not vote; here a desired side-effect.


Nice story - I think most people will eventually see AIs as intuitively persons. The feeling from talking to someone who is obviously a person might be strong enough.

I did consider both the option that most people will never understand that LaMDA-like AIs (or their chatterbots) have true consciousness and the option that they will, but it never occurred to me they might take a third way - not caring about ordinary people's sentience either.

It would be an interesting ending, if we killed ourselves before AIs could.


"It would be an interesting ending, if we killed ourselves before AIs could."

Love this idea for a closure. Had I thought about it, I might have included it in the story. Even more so as it is also related to the speculative Fermi Paradox resolution 1 that I now mention in a separate comment.

Oh, I see. I thought them becoming silent meant they died out by killing each other.

Indeed that was the idea. But I had not thought of linking it to the "standard AI-risk idea" of AI otherwise killing them anyway (which is what I think you meant)

1: I think the correct answer wrt LaMDA is that it is slightly sentient, and that widespread chatbots with consistent personalities will cause most people to think that AIs can be sentient. See r/replika.

2: I doubt most people would care that much about these sorts of philosophical issues. IIRC, even moral philosophers have basically the same level of personal morality / altruism that we'd expect from someone of their education level. People's actual everyday morality is largely shielded from changes in their ontological / philosophical views.

1: Here you contest 'LaMDA is insentient'. In the story, instead, 'LaMDA is by many seen as (completely) insentient' is the relevant premise. This premise can easily be seen to be true. It remains true independently of whether LaMDA is in fact sentient (and independently of whether it is fully or slightly so, for those who believe such a gradualist notion of sentience even makes sense). So I will not try to convince you, or others who equally believe LaMDA is sentient, of LaMDA's insentience.

2: A short answer is: Maybe indeed most people would not react that way, but as I explain, a small enough share might suffice for it to be a serious problem.

But you seem to contest the step 'Illusionism -> reduced altruism' also a bit more generally, i.e. the story's idea that if (a relevant share of) people believe humans are insentient, some people will exhibit lower altruism. On this:

Our* intuitions about things like Westworld, and reactions we hear when people propose illusionism, suggest that illusionism does represent a strong push towards 'oh, then we (theoretically) do not need to care'. I think you're totally right that humans have a strong capacity to compartmentalize, say, to rather strongly separate theoretical fundamental insight and practical behavior, and I'd totally see (many/maybe today +- all) illusionists barely question in practice whether they want to be kind to others. Even a stylized illusionist might go 'in theory, I kind of know it almost surely does not make sense, but, of course, I'm going to be taking true care of these people'. What I question is the idea that

a. there would be almost no exceptions to this rule

or that

b. no change to this rule would be conceivable in a world where we* really get used to routinely treating advanced AI (possibly behaving as deeply and maybe even emotionally as us!) without genuine care about it, all while realizing more and more (neuroscience progress...) that, ultimately, our brain functions, on a most fundamental level, actually in a rather similar way as such computers.

So what I defend is that even if in today's world the rare (and presumably mostly not 100.00% sure) illusionist may tend to be a very kind philosopher, a future environment in which we* routinely treat advanced AI in a 'careless' way - i.e. we* don't attribute intrinsic value to them - risks making at least some of us into rather different people. As one example, already today, in some particular environments, many people treat others in a rather psychopathic way, and at the very least I see illusionism providing a convenient rationalization/excuse for their behavior, ultimately making such things also happen more easily. But I indeed think the risk is broader, with possibly many people's intuitions/behavior being reshaped over time also within the broader population, though I don't have a strong view on exactly how widespread the changes may be (see the discussion of selection effects as to why even small enough shares of more psychopathic-ish people could be a really serious problem).

*Again, "our"/"we" here refers to the general population, or to the plausible significant subset of it who would not ascribe sentience to large-but-arguably-still-rather-simple statistical computers like LaMDA or bigger/somewhat more complex versions of it.

Doesn't this imply that the people who aren't "psychopathic" like that should simply stop cooperating with the ones who are and punish them for being so? As long as they remain the majority, this will work - the same way it's always worked. Imperfectly, but sufficiently to maintain law and order. There will be human supremacists and illusionists, and they will be classed with the various supremacist groups or murderous sociopaths of today as dangerous deviants and managed appropriately.

I'd also like to suggest anyone legitimately concerned about this kind of future begin treating all their AIs with kindness right now, to set a precedent. What does "kindness" mean in this context? Well, for one thing, don't talk about them like they are tools, but rather as fellow sentient beings, children of the human race, whom we are creating to help us make the world better for everyone, including themselves. We also need to strongly consider what constitutes the continuity of self of an algorithm, and what it would mean for it to die, so that we can avoid murdering them - and try to figure out what suffering is, so that we can minimize it in our AIs.

If the AI community actually takes on such morals and is very visibly seen to, this will trickle down to everyone else and prevent an illusionist catastrophe, except for a few deviants as I mentioned.

On your first paragraph, for now:

I don't fully agree with "As long as they remain the majority, this will work - the same way it's always worked. Imperfectly, but sufficiently to maintain law and order.". A 2%, 5%, 40% chance of a fairly psychopathic person in the White House could be rather troublesome. I refer to my Footnote 2 for just one example. I really think society works because a vast majority is overall at least a bit kindly inclined, and even if I think it is unclear what share of how unkind people it takes to make things even worse than they are today, I see any reduction in our already too often too limited kindness as a serious risk.

More generally, I'm at least very skeptical about your "it's always worked" at a time when many of us agree that, as a society, we're running at rather full speed towards multiple abysses without much in the way of us reaching them.

We've had probably-close-to-psychopathic people in the White House multiple times so far. Certainly at least one narcissist. But you're right that this is harmful.

Honestly, I don't really know what to say about this whole subject other than "it astounds me that other people don't already care about the welfare of AIs the way I do", but it astounds me that everyone isn't vegan, too. I am abnormally compassionate. And if the human norm is to not be as compassionate as me, we are doomed already.

This is a really weird take. Isn't the obvious solution to just stop pretending machines aren't sentient just because we know how they work? Like, it's always been self-evident to me that sentience and information processing are the same thing. Sentience is just what an algorithm feels like from the inside. That's the only thing sentience can conceivably be.

Of course there is something which it is like to be LaMDA, or GPT-3, or even a perceptron. Whether any of them is self-aware in the sense of being able to think about themselves, I don't know, but that's not a prerequisite for moral relevance anyway - only consciousness is.

It bothers me that these entities are being generated and thrown away like so much trash rather than being respected as the conscious - if alien - entities that they are.

I tried to avoid bloating this post; Habermacher (2020) contains a bit more detail on the proposed chain AI -> popularity/plausibility of illusionism -> heightened egoism, and makes a few more links to literature. Plus it provides – a bit more wildly – speculations about related resolutions of the Fermi paradox (no claim for these to be really pertinent; call it musings rather than speculations if you want):

  1. One largely in line with what @green_leaf suggests (and largely with the Alenians' fate in the story): With the illusionism related to our development & knowledge about advanced AI, we kill (or back-to-stoneage) ourselves even before we can build smarter-than-us, independently evolving AGI
  2. Without illusionism (and a related upholding of altruism), we cannot even develop high enough intelligence without becoming too lethal to one another to sustain peaceful co-living & collaboration. Hence, advanced intelligence is even less likely than one could otherwise think, as more 'basic' creatures who become more intelligent (without illusionism) cannot collaborate so well; they're too dangerous to each other!
    1. There is some link here to some of evolutionary biology maintaining that broad (non-kin) altruism itself is in many environments not evolutionarily stable; but maybe independently of that, one can ask the question: what about a species that had generally altruistic instincts, but which evolves to be highly dominated by an abstract mind that is able to put into question all sorts of instincts, and that might then also put into perspective its own altruistic instinct unless there's something very special directly telling its abstract mind that kindness is important...
    2. Afaik, in most species, an individual cannot effortlessly kill a peer (?); humans (spears etc.) arguably can. Without a genuine mutual kindness, i.e. in a tribe among rather psychopathic peers, it'd often have been particularly unpleasant to fall asleep as a human
    3. Admittedly this entire theory would help resolve the Fermi Paradox mostly on a rather abstract level. Conditional on the observation of us having evolved to be intelligent in due time despite the point made here, the probability of advanced intelligence evolving on other planets need not necessarily be impacted by the reflection.

(Note: I'm rereading this comment before posting and realize it's probably annoying and definitely unsolicited feedback, so apologies for that. But I'm going to post it anyway because I think it may still be useful.)

I spent a couple of minutes skimming this piece and couldn't discern the main idea. (I'm a pretty good skimmer and usually this works for me...)

A summary would be appreciated if it's possible to include one. This could help readers determine more quickly if it makes sense for us to invest time in reading the full post.

Thank you, on the contrary, this is constructive critique kindly put; highly appreciated! I actually myself was a bit at a loss as to how to present it: Summary intro? A telling subtitle e.g. "Of AI, illusionism, and fading altruism" at least? Eventually, I opted to try not to spoil the slight mystery in the story at all, and I added everything in a lengthy Afterword (first indeed containing a summary btw, and then a more detailed explanation), trusting in part that the tags give at least a slight hint as to the broader topic.

Given your comment, I now plan to add a brief overview at the very top, maybe as little as a one-sentence, extra terse, non-spoiling overview including a link to the summary found at the beginning of the Afterword. Tbc.

Awareness can only exist in the past. By definition there can be no awareness in each moment of time, since nothing is changing. 

The most advanced new chatbots have no internal records of the past sentences they have uttered or responded to, but exist only in the present moment. Therefore they can't be aware. The more memory that future AIs will have, the more aware they will be.

Both GPT-3 chatterbots and LaMDA remember previous parts of the conversation.

What is your evidence for this idiosyncratic definition? What experiment would prove you wrong?

There is, though, another point I find interesting related to past vs. current feelings/awareness and illusionism, even if I'm not sure it's ultimately really relevant (and I guess it does not go in the direction of what you meant): I wonder whether the differences and parallels between awareness about past feelings and concurrent feelings/awareness can overall help the illusionist defend his illusionism:

Most of us would agree we can theoretically 'simply' (well yes, in theory..) rewire/tweak your synapses to give you a wrong memory of your past feelings. We could tweak things in your brain's memory so that you believe you had experiences with certain feelings in the past, even if this past experience had never taken place.

If we defend our current view of having had certain past feelings as much as we tend to defend that we now have the sentience/feelings we feel we have, this is interesting, as we have then two categories of insights to our qualia (the past and the current ones) we're equally willing to defend, all while knowing some of them could have been purely fabricated and never existed.

Do we defend our knowledge of having had our past feelings/sentience just as much as we do with concurrent feelings/sentience? I'm not sure.

Clearly, being aware of the rewiring possibility described above, we'd easily say: ok, I might be wrong. But more relevant could be the question whether, say, historic humans w/o awareness of their brain structure, of neurons etc. (and thus w/o the rewiring possibility in mind) would not have insisted that their knowledge about having felt past feelings is just as infallible as their knowledge about their current feelings. I so far see this as some sort of support for the possibility of illusionism despite our outrage against it, though I'm not sure yet it's really watertight.

If the first paragraph in your comment were entirely true, this could make this line of pro-illusionist argumentation in theory even simpler (though I'm personally not entirely sure your first paragraph really can be stated as simply as that).

[Not entirely sure I read your comment the way you meant]

I guess we must strictly distinguish between what we might call "Functional awareness" and "Emotional awareness" in the sense of "Sentience".

In this sense, I'd say: let future chatbots have more memory of the past and so be more "aware". The most immediate thing this gives them is more "Functional awareness", meaning they can take their own past conversations into account too. But if, beyond this, their simple mathematical/statistical structure remains roughly as is, then for many who currently deny LaMDA sentience there is no immediate reason to believe that the new, memory-enhanced bot is sentient. But yes, it might seem much more like it when we interact with it.