We've already taken possession of the drugs; now we're just haggling about the price. But our haggling doesn't quite have the commercial savvy one would like. This is bad, because now that worry about existentially threatening AI has gone mainstream, we need to be very clear about what could happen next. Strongly worded letters alluding to the end of the world make for a cracking good read when they come bundled with the Bible, but we need to do a little better than P(doom|AGI)=high.
In this essay, I will make a set of concrete predictions concerning how AGI will existentially impact the human species, should it be developed in the near- to medium-term. My starting point is that alignment is not, nor ever will be, an outcome we can secure for AGI. However, I do not think this will lead to human extinction in the way that is typically prognosticated. That is, our future will not consist of paperclips, grey goo, or computronium. Instead, we will be subjected to an extended period of umwelt cultivation that sculpts the space of subjective human values and action dispositions to fit AGI goals. I volunteer this as an existential catastrophe, because it means current human value systems will be entirely superannuated. Though this is not a new idea, existing speculation on its precise details is sparse. My aim here is to flesh out the claim with some detailed predictions that, if nothing else, capture one possible future that emerges from AGI. If my predictions are wrong in some obvious way, so much the better: it'll make the space of potential error just a little smaller.
The basis of my predictions comes from what we already know about 3.5 billion years of evolution by natural selection. Sure, evolution is as dumb as a bag of rocks, but the fact is that natural selection is next-token prediction by another name. That the token is the number of second-generation offspring who reach reproductive age is beside the point: there is a de facto objective, an optimisation process, and an application of results outside the training distribution. The trillion-parameter superintelligences we carry inside our skulls are just a side effect of this. That's the first mistake we make in thinking about AGI: it's not a new phenomenon––it's just a more effective version of something that's happened already.
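The analogy can be made concrete with a toy selection loop: there is a de facto objective the process never represents, only optimises. Everything here (tournament selection, the bit-counting fitness stand-in, the mutation rate) is an illustrative sketch, not a model of real biology:

```python
import random

def select_step(population, fitness, k=2):
    """One generation: k-way tournament selection plus per-bit mutation."""
    next_gen = []
    for _ in range(len(population)):
        contenders = random.sample(population, k)
        winner = max(contenders, key=fitness)
        # 1% chance of flipping each bit: the "noise injection" channel.
        child = [bit ^ (random.random() < 0.01) for bit in winner]
        next_gen.append(child)
    return next_gen

random.seed(0)
# "Fitness" stands in for second-generation reproductive success:
# a de facto objective nothing in the system explicitly pursues.
fitness = lambda genome: sum(genome)
pop = [[random.randint(0, 1) for _ in range(32)] for _ in range(100)]

for _ in range(50):
    pop = select_step(pop, fitness)

mean_fitness = sum(fitness(g) for g in pop) / len(pop)
```

After fifty generations the population mean climbs well above the chance level of 16, despite no agent anywhere "knowing" the objective.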
What does it mean to have a goal? It means diverting some quantity of negentropy away from the fastest route to thermodynamic equilibrium. When the goal is a short-term function of known parameters, it can usually be achieved by a fully specifiable mechanistic process. When the parameters that contextualise the goal are subject to uncertainty, cognition becomes necessary. This is because any reduction in uncertainty can only be exploited by updating a predictive model that then informs a mechanistic process. Up to now, evolution has solved this problem with bodies and brains: your sensorium is a set of expected values for environmental and internal states that are matched against reality by efferent and afferent updating. When these values are wrong, you go extinct or you get cancer.
How does one get these values wrong? There are two ways: overfitting and noise injection. The evolutionary history of life on Earth is almost nothing except overfitting to specific environmental niches. This is swell for a while, but every environmental niche has a lifespan. Hence all the extinctions. Noise injections occur when self-regulating processes get corrupted. Hence all the cancer. Sure, every so often the cancer catches a break and we overfit to the new environment, but that's just dumb luck that never lasts. The point is that next-token prediction, the evolutionary variety, degenerates into worse-than-chance outcomes in the long run when it optimises for success in the short run. (Is there any possible scenario where the Ediacaran biota could survive in today's environment?)
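The overfitting failure mode has a standard statistical sketch: a model that threads every data point in its niche perfectly, then blows up the moment the environment shifts. The polynomial degrees, noise level, and shifted evaluation point below are all illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(42)

# "Environment": one smooth underlying regularity, observed with noise.
x_train = np.linspace(0.0, 1.0, 12)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.1, size=12)

# Degree-11 polynomial threads all 12 points: a perfect in-niche fit.
overfit = np.polynomial.Polynomial.fit(x_train, y_train, deg=11)
# Degree-3 polynomial captures the broad shape but not the noise.
simple = np.polynomial.Polynomial.fit(x_train, y_train, deg=3)

train_err_overfit = np.mean((overfit(x_train) - y_train) ** 2)
train_err_simple = np.mean((simple(x_train) - y_train) ** 2)

# "Niche shift": evaluate where the training distribution never went.
x_shift = 3.0
err_overfit = abs(overfit(x_shift) - np.sin(2 * np.pi * x_shift))
err_simple = abs(simple(x_shift) - np.sin(2 * np.pi * x_shift))
```

Inside the niche the overfitted model wins; outside it, its error dwarfs the simpler model's. Optimising for short-run success is exactly what buys the long-run collapse.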
Where these considerations become relevant for AGI is that they hold true for any non-mechanistic goal. Given that there would be no point in providing an AI with a goal that can already be mechanistically specified, this means that the overfitting issue, in particular, is relevant for any form of AGI goal seeking. (To be fair, it'll probably out-think the cancer.) No AGI will be omniscient; the planetary and cosmic environment will always contain some uncertainty that cannot easily be estimated or controlled––and this means overfitting is always a long-term risk for goal maximisers. Whatever the best path to filling the light cone with widgets may be, it's unlikely to coincide with the best plan for turning humans into widgets right now. Instrumental convergence means AGI will know this, so your atoms are probably yours to keep in the short term.
Humans are maybe the least environmentally overfitted outcomes of evolution (more on social overfitting later). This is why we've managed to build a planetary civilisation that spans multiple ecological niches. All for only 2,500 calories a day and a bit of neoteny. One consequence is that if you want data about the environment, humans are a good source for it. Not the shallow crap that we express in language, but the 11 million bits per second that come in through the senses. All it takes is to perform a median split on the intensity of each sensory modality and you've already got a 64-character alphabet for encoding the physical world. (Yes, there are six senses.) Chuck in sensory data from other species and you've got a cheap, efficient, data-gathering apparatus that is at once highly versatile, autonomous, and exquisitely calibrated to its planetary environment.
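The arithmetic is just 2^6 = 64: median-split six modality intensities into one bit each and pack the bits into a symbol. The modality names, calibration medians, and sample values below are illustrative placeholders:

```python
MODALITIES = ["sight", "hearing", "touch", "taste", "smell", "proprioception"]

def encode(sample, medians):
    """Median-split each of six modality intensities into one bit, then
    pack the six bits into one symbol from a 2**6 = 64-letter alphabet."""
    bits = [int(sample[m] > medians[m]) for m in MODALITIES]
    symbol = 0
    for b in bits:
        symbol = (symbol << 1) | b
    return symbol  # integer in 0..63

# Hypothetical calibration: per-modality medians from past observations.
medians = {m: 0.5 for m in MODALITIES}
sample = {"sight": 0.9, "hearing": 0.2, "touch": 0.7,
          "taste": 0.1, "smell": 0.6, "proprioception": 0.4}
symbol = encode(sample, medians)  # bits 101010 -> 42
```

Each moment of experience compresses to one letter of a 64-character alphabet; richer quantisation only grows the alphabet.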
Given that we'll sooner or later run out of data for LLM training runs, this vastly larger reservoir of sensory information will be of immense strategic value for any AGI that needs sensory grounding––which is likely any AGI. (Language, after all, is an R4→R1 homomorphic mapping; sense data lets you keep R4.) While it's certainly possible that AGI could reduplicate the biosphere, that would be extraordinarily inefficient: evolution may be a blind watch-maker, but the watches have been ruthlessly optimised by competition with other watches to work. It's a far more worthwhile proposition to co-opt the biosphere and tweak it where necessary. And remember, the technology for interfacing with the biosphere is already nearly here. Sure, there will always be some environmental data that falls outside the range of biological processes, but science, baby, has already built half the technical infrastructure for capturing that.
So, we can expect to have utility for AGI in something that resembles our current evolved form; ditto for nature red in tooth and claw. The question then becomes, how will AGI access and exploit the data encoded by the biosphere? In what follows, I will outline some predictions for how this will happen for humans; the considerations can be extended readily enough to other species. Once I have outlined the predictions, I will offer some speculations on how AGI will bring them about.
Have you ever tried to run a company? I have; it's a total shit-show. The reason is that companies do not actually optimise for making money, but for reducing anxiety (Figure 1). Predictable revenue streams are one way to do this, so fat profit margins aren't inconsistent with anxiety reduction, but most anxiety actually comes from uncertainty relating to low-grade interpersonal conflict. The result is that HR departments spring up like Japanese knotweed, 80% of the work done is pointless social signalling, and anyone remotely competent is up for cancelling the second they vibrate the air with a complaint.
Figure 1: Humans do not like anxiety at all. Data from Warriner et al. (2013), rescaled by the author between 0 and 1.
This is, to say the least, an unsatisfactory state of affairs. Termites don't have these problems, but then termites are so overfitted they're defeated by a sticky tongue. The niche we're overfitted to is the pattern of alliance and conflict in the social environment; it's the reason we have a neocortex in the first place. Given that this overfitting is what enabled us to get out of the Olduvai Gorge, we're probably stuck with it. To be simultaneously cooperative and versatile requires a minimum quantum of asshole, and that, dear reader, is the grain that runs through our crooked timber.
Or more accurately, we're stuck with it until AGI emerges. There's already (questionable) evidence that LLMs are capable of theory of mind, so there's no doubt that full AGI would be able to navigate the human social environment. Assuming the utility of humans, the actual question is whether AGI goals would be best served by minimising anxiety or removing it.
By almost any metric, the answer will be by removing it. Companies are expressly optimised for goal achievement, and they're still disastrous at it because of the primate politics. Pretty much every other form of social organisation is an order of magnitude worse. Excising the primate politics is the best kill-switch there is for making humans useful at something that doesn't involve fucking, fighting, or nepotism.
How might an AGI dissolve the ape? The most likely option would be to target the propensity for forming between- and within-group hierarchies. Because we're social, high status means preferential access to resources and mating opportunities. And because status is in principle up for grabs, we use much––perhaps most––of our cognitive resources trying to establish and navigate social hierarchies. Why is this? Because we don't just need to track the mental states of the individuals in our environment; we need to track what they think about the mental states of others (Figure 2). This is a massive computational cost, and it scales non-linearly with group size (Figure 3). Eliminating this cost would immediately free up these resources by reducing the anxiety that attends goal-directed social interactions. One way would involve freezing social hierarchies, but this would just be another form of overfitting (see: termites). Instead, the following courses of action would make more sense:
Figure 2: Alice and Bob reflect that, on balance, PGP encryption is less complicated than this (© Bronwyn Tarr, 2014).
Figure 3: Number of recursively embedded mental state representations for an individual by group size. This is why you don't make new friends anymore.
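The non-linear scaling in Figure 3 can be reproduced with a toy combinatorial model: count the chains "A believes B believes C believes… p" of a given depth over a group, barring immediate self-reference. The no-self-reference rule and the five-level depth cap (roughly what humans are thought to manage) are illustrative assumptions, not cognitive-science results:

```python
def nested_states(group_size, depth):
    """Count chains 'A believes B believes ... believes p' of the given
    depth over a group, disallowing A-believes-A steps. First slot has
    group_size choices; each later slot has group_size - 1."""
    if group_size < 2 or depth < 1:
        return 0
    return group_size * (group_size - 1) ** (depth - 1)

# At a fixed depth of 5, the count grows roughly as n**5 in group size n:
counts = {n: nested_states(n, 5) for n in (5, 15, 50, 150)}
```

Going from a 15-person band to a Dunbar-sized group of 150 multiplies group size by 10 but multiplies the representational load by roughly five orders of magnitude. Hence no new friends.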
I don't know about you, but my inner life has all the finesse of a dog dragging its ass along the carpet. This is because my propensities are split across any number of inconsistent goals which compete with each other in my global workspace. Thank fuck for liquor, I say. AGI will agree on the problem, but be less keen on the cirrhosis. Instead, it is likely to re-weight the allotment of subjective valence across the goals in my environment, consistent with the maximising of its own goals. Part of this will involve reducing the friction between inconsistent human propensities; I have already explored how this might play out for social cognition. But there are several other ways in which the human umwelt can be re-mapped in ways that will advantage AGI.
Figure 4: Temporal discounting rates relative to average life expectancy.
Figure 5: Thank you, Satoshi, wherever you are, from the bottom of our dopamine-filled synaptic clefts.
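The discounting in Figure 4 is conventionally modelled with Mazur's hyperbolic form, V = A / (1 + kD): subjective value V of an amount A falls off with delay D at rate k. The specific k values below are illustrative; the point is that an agent able to nudge k reshapes which futures feel worth acting on:

```python
def hyperbolic_value(amount, delay, k):
    """Mazur's hyperbolic discounting: V = A / (1 + k * D).
    Larger k means steeper devaluation of delayed rewards."""
    return amount / (1 + k * delay)

# A reward of 100 units delayed one year, under two discount rates:
impulsive = hyperbolic_value(100, delay=365, k=0.05)   # ~ 5.2
patient = hyperbolic_value(100, delay=365, k=0.001)    # ~ 73.3
```

Shrinking k by a factor of fifty turns a near-worthless distant payoff into one worth most of its face value, which is exactly the kind of valence re-weighting an AGI cultivating the umwelt would exploit.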
I could go on listing other possible ways in which AGI could perform evolution by artificial selection on humans, but my intention here is not to be exhaustive. Instead, it's to point to some ways that, in relatively short time horizons, we may see fundamental aspects of our being-in-the-world shifted towards AGI preferences. The reasonable question now arises of how AGI might bring any of this about. There's a fairly substantial literature on this already (e.g. here), and I don't think I'll do any better in my prognostications. Still, for the sake of completeness, it's worth listing a few possibilities:
Let's assume, for the sake of it, that all or some of the ideas outlined here are true. What should we think about the predicted outcomes? Clearly enough, we're talking about human extinction. Not in the sense of a spatiotemporal discontinuity of the species, but with respect to the cognitive and cultural links that identify us as an ongoing project. We already don't care much about our great-great grandchildren––and these kids? Well, they're not alright. They will be the ants in the ant hive after the Hairstreak caterpillar enters and sings like a queen. We will share the same bodily morphology with them and that will be about it. They won't think about us one way or the other.
What should you do? Take up smoking. Have an affair. Get into organised crime. Climb the greasy pole. Start a revolution. These are all beautiful things and we owe it to them to give them a good send off. It's been a wild ride, and if the lows outnumbered the highs, the highs were very high indeed. The green was so green, the blue was so blue; I was so I, and you were so you.
Of course, I could be wrong; everybody is, most of the time. But even if the details are wonky and the timings are out, ask yourself: do you really think the Upper Palaeolithic will last forever?
I think your post is right about many of the inefficiencies of humans. Note that, as inefficient as we are, industrial civilization has already removed many of them, and the current age of cheap online communication and stable, usable devices has already led to large changes. Over the arrow of time, you can expect selective pressure on corporations to drive still greater efficiency.
There's, I think, a major error here: thinking of "AGI" as a singleton massive entity dispassionately planning human lives. It's increasingly unlikely that this is the form AGI will take.
Instead, billions of separate "sessions" of many separate models seem to be the actual form. Each session is a short-lived agent that only knows some (prompt, file of prior context). Some of the agents will be superhuman in capabilities, occasionally broadly so, but most will be far more narrow and specialized (because of the computation cost and IP cost of using the largest systems on your task).
You can think of an era of billions of humans all separately working on their own goals with these tools as a system that steadily advances the underlying "AGI" technology. As humans win and lose, even losing nuclear wars, the underlying technology gets steadily better and more robust.
So a coevolution, not a singleton planning everything. Over time, humans would become more and more machine-like themselves, as those traits will be the ones rewarded, and more and more of the underlying civilization will be there to feed the AGI (maybe using implants, maybe just shifting behavior). Kind of how our current civilization devotes so many resources to feeding vehicles.
I think this is the most probable outcome, taking at least 80 percent of the probability mass. Scenarios of a singleton tiling the universe with boring self copies or a utopia seem unlikely.
Unfortunately, it will mean inequality like we can scarcely imagine. Some groups will have all the wealth in the solar system and be immortal; others will receive only what the in-power group chooses to share.