Previous discussion: https://www.lesswrong.com/posts/goANJNFBZrgE9PFNk/shadows-of-the-coming-race-1879
An excellent find!
I notice she also touches on memes and belief in belief at the end:
“Whose premises?” cried Trost, turning on me with some fierceness. “You don’t mean to call them mine, I hope.”
“Heaven forbid! They seem to be flying about in the air with other germs, and have found a sort of nidus among my melancholy fancies. Nobody really holds them. They bear the same relation to real belief as walking on the head for a show does to running away from an explosion or walking fast to catch the train.”
chapter 17 surprised me for how well it anticipated modern AI doomerism
It’s perhaps worth highlighting the significant tension between two contrasting claims: on the one hand, the idea that modern AI doomerism was "anticipated" as early as the 19th century, and on the other, the idea that modern AI doom arguments are rationally grounded in a technical understanding of today’s deep learning systems. If the core concerns about AI doom were truly foreseen over a century ago, long before any of the technical details of modern machine learning existed, then I suggest the arguments can't really be based on those technical details in any deep or meaningful way.
One way to resolve this contradiction is to posit that AI doom arguments are not fundamentally about technical aspects at all, but are instead rooted in a broader philosophical stance—namely, that artificial life is by default bad, dangerous, or disvaluable (for example by virtue of lacking consciousness, or by virtue of being cold and calculating), while biological life is by default good or preferable. However, when framed in this way, the arguments lose much of their perceived depth and rigor, and look more like raw intuition-backed reactions to the idea of mechanical minds than tenable theories.
Strong disagree voted. To me this is analogous to saying that, because Leonardo da Vinci tried to design a flying machine and believed it to be possible despite not really understanding aerodynamics, the Wright brothers' belief that the aeroplane they designed would fly "can't really be based on those technical details in any deep or meaningful way."
"Maybe a thing smarter than humans will eventually displace us" is really not a very complicated argument, and no one is claiming it is. So it should be part of our hypothesis class, and various people like Turing thought of it well before modern ML. The "rationally grounded in a technical understanding of today’s deep learning systems" part is about how we update our probabilities of the hypotheses in our hypothesis class, and how we can comfortably say "yes, terrible outcomes still seem plausible", as they did on priors without needing to look at AI systems at all (my probability is moderately lower than it would have been without looking at AIs at all, but with massive uncertainty)
Intuition and rigour agreeing is not some kind of highly suspicious gotcha.
"Maybe a thing smarter than humans will eventually displace us" is really not a very complicated argument, and no one is claiming it is. So it should be part of our hypothesis class, and various people like Turing thought of it well before modern ML.
This is a claim about what is possible, but I am talking about what people claim is probable. If the core idea of "AI doomerism" is that AI doom is merely possible, then I agree: little evidence is required to believe the claim. In this case, it would be correct to say that someone from the 19th century could indeed have anticipated the arguments for AI doom being possible, as such a claim would be modest and hard to argue against.
Yet a critical component of modern AI doomerism is not merely about what's possible, but what is likely to occur: many people explicitly assert that AI doom is probable, not merely possible. My point is that if the core reasons supporting this stronger claim could have been anticipated in the 19th century, then it is a mistake to think that the key cruxes generating disagreement about AI doom hinge on technical arguments specific to contemporary deep learning.
The way I think about it, you should have a prior distribution over doom vs no doom, and then getting a bunch of info about current ML should update that. In my opinion, it is highly unreasonable to have a very low prior on "thing smarter than humans successfully acts significantly against our interests"; you should generally be highly uncertain and view this as a high-variance question.
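To put numbers on what I mean by "prior plus update", here is a minimal sketch; every number in it is purely illustrative, not anyone's actual estimate:

```python
# Toy Bayes update on doom vs no doom -- all numbers below are illustrative.
prior = 0.20            # hypothetical prior P(doom) before looking at modern ML
likelihood_ratio = 0.5  # P(evidence | doom) / P(evidence | no doom);
                        # < 1 means the ML evidence cuts against doom

prior_odds = prior / (1 - prior)                   # 0.25
posterior_odds = prior_odds * likelihood_ratio     # 0.125
posterior = posterior_odds / (1 + posterior_odds)  # ~0.11

print(f"P(doom): {prior:.0%} -> {posterior:.0%}")  # 20% -> 11%
```

The point of the sketch is just that the prior and the update are separable: someone in the 19th century could have supplied the prior, while the likelihood ratio is where evidence about current ML comes in.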
So I guess the question is how many people who think doom is very unlikely just start from a really low prior but agree with me on the empirical updates, versus start from some more uncertain prior but update a bunch downwards on empirical evidence or at least reasoning about the world. Like: companies are rational enough that they just wouldn't build something dangerous, it'll be easy to test for and they'll do this testing; historically, we've solved issues with technology before they arose, so this will be fine; or we can just turn it off if something goes wrong. I would consider even the notion that there exists an ability to turn it off to be using information that someone in the 19th century would not have had.
My guess is that most reasonable people with low P(doom), who are willing to actually engage with probabilities here, start at at least 5% but just update down a bunch for reasons I tend to disagree with/consider wildly overconfident? But maybe you're arguing that the disagreement now stems from priors?
You strong disagree downvoted my comment, but it's still not clear to me that you actually disagree with my core claim. I'm not making a claim about priors, or whether it's reasonable to think that p(doom) might be non-negligible a priori.
My point is instead about whether the specific technical details of deep learning today are ultimately what's driving some people's high probability estimates of AI doom. If the intuition behind these high estimates could've been provided in the 19th century (without modern ML insights), then modern technical arguments don't seem to be the real crux.
Therefore, while you might be correct about priors regarding p(doom), or whether existing evidence reinforces high concern for AI doom, these points seem separate from my core claim about the primary motivating intuitions behind a strong belief in AI doom.
(To clarify, I strong disagree voted, I haven't downvoted at all - I still strongly disagree)
I am confused and feel like I must be misunderstanding your point. It feels like you're attempting a "gotcha" argument, but I don't understand your point or who you're trying to criticize. It seems like bizarre rhetorical practice. It is not a valid argument to say that "people can hold position A for bad reason X, therefore all people who hold position A also hold it for bad reason X even if they claim it is for good reason Y". But that seems to be your argument? For A=high doom, X=weird 19th century intuition, Y=actually good technical reasons grounded in modern ML. What am I missing? If you want to argue that someone else really believes bad reason X, you need to engage with specific details of that person and why you believe they are saying false things about their beliefs.
I could easily flip this argument. In the 19th century, I'm sure people said machines could never possibly be dangerous: "God will protect us," "They are tools, and tools are always subservient to man," or "They will never have a soul, and so can never be truly dangerous." This is a raw, intuition-backed argument. People today who claim to believe that AI will be safe for sophisticated technical reasons could have held these same beliefs in the 19th century, which suggests they are being dishonest. Why does your argument hold, but mine break?
I also don't actually know which people you want to criticize. My sense is that many community members with high p(doom), like Yudkowsky, developed these views 10-20 years ago and haven't substantially updated since, so obviously they can't come from nuanced views of modern ML. As far as I am aware they don't seem to claim their beliefs are heavily driven by sophisticated technical reasons about current ML systems - they simply maintain their existing views. It still seems a strawman to call views formed without specific technical grounding "raw intuition-backed reactions to the idea of mechanical minds". Like, regardless of how much you agree, "Superintelligence" clearly makes a much more sophisticated case than you imply, while predating deep learning.
I'm not actually aware of anyone who claims to be afraid of current ML systems purely due to specific technical reasons. The reasons for being afraid are pretty obvious, though very specific facts about these systems can adjust them. Now that modern deep learning exists, some of these concerns seem validated, others seem less significant, and new issues have arisen. This seems completely normal and exactly what you would expect? My personal view is that we should be moderately but not extremely concerned about doom. I understand modern machine learning well, and it hasn't substantially shifted my position in either direction: the large language model paradigm somewhat increased my optimism about safety, while the shift toward long-horizon RL somewhat increased my concern about doom, though this development was expected eventually.
Can you give some concrete examples of specific people/public statements that you are trying to criticise here? That might help ground out this disagreement.
I am confused and feel like I must be misunderstanding your point. It feels like you're attempting a "gotcha" argument, but I don't understand your point or who you're trying to criticize. It seems like bizarre rhetorical practice. It is not a valid argument to say that "people can hold position A for bad reason X, therefore all people who hold position A also hold it for bad reason X even if they claim it is for good reason Y". But that seems to be your argument?
I think you're overinterpreting my comment and attributing to me the least charitable plausible interpretation of what I wrote, along with most other people commenting and voting in this thread. (As a general rule I've learned from my time in online communities: whenever someone makes a claim on a forum that rejects a belief central to that forum's philosophy, people tend to reply by ruthlessly assuming the most foolish plausible interpretation of their remarks. LessWrong is no exception.)
My actual position is simply this: if the core arguments for AI doom could genuinely have been presented and anticipated in the 19th century, then the crucial factor that actually determines whether most "AI doomers" believe in AI doom is probably something relatively abstract or philosophical, rather than specific technical arguments grounded in the details of machine learning. This does not imply that technical arguments are irrelevant; it just means they're probably not as cruxy to whether people actually believe that doom is probable or not.
(Also to be clear, unless otherwise indicated, in this thread I am using "belief in AI doom" as shorthand for "belief that AI doom is more likely than not" rather than "belief that AI doom is possible and at least a little bit plausible, so therefore worth worrying about." I think these two views should generally be distinguished.)
(To clarify, I strong disagree voted, I haven't downvoted at all - I still strongly disagree)
Oops, I recognize that, I just misstated it in my original comment.
Thanks for clarifying. I'm sorry you feel strawmanned, but I'm still fairly confused.
Possibly the confusion is that you're using AI doom to mean >50%? I personally think that it is not very reasonable to get that high based on conceptual arguments someone in the 19th century could understand, and definitely not >90%. But getting to >5% seems totally reasonable to me. I didn't read this post as arguing that you should have been >50% back in the 19th century, though I could easily imagine a given author being overconfident. And specific technical details of ML are totally enough for an update to bring you above or below 50%, so this matters. I personally do not think there's a >50% chance of doom, but am still very concerned.
I think the simple argument "building minds vastly smarter than our own seems dangerous" is in fact pretty compelling, and seems relatively easy to realize beforehand, as e.g. Turing and many others did. Personally, there are not any technical facts about current ML systems which update me more overall either way about our likelihood of survival than this simple argument does.
And I see little reason why they should—technical details of current AI systems strike me as about as relevant to predicting whether future, vastly more intelligent systems will care about us as technical details about neuronal firing in beetles are to predicting whether a given modern government will care about us. Certainly modern governments wouldn't exist if neurons hadn't evolved, and I expect one could in fact probably gather some information relevant to predicting them by studying beetle neurons; maybe even a lot, in principle. It just seems a rather inefficient approach, given how distant the object of study is from the relevant question.
There appears to be a motte-and-bailey worth unpacking. The weaker, easily defensible claim is that advanced AI could be risky or dangerous. This modest assertion requires little evidence, similar to claims that extraterrestrial aliens, advanced genetic engineering of humans, or large-scale human cloning might be dangerous. I do not dispute this modest claim.
The stronger claim about AI doom is that doom is likely rather than merely possible. This substantial claim demands much stronger evidence than the weaker claim. The tension I previously raised addresses this stronger claim of probable AI doom ("AI doomerism"), not the weaker claim that advanced AI might be risky.
Many advocates of the strong claim of AI doom explicitly assert that their belief is backed by technical arguments, such as the counting argument for scheming behavior in SGD, among other arguments. However, if the premise of AI doom does not, in fact, rely on such technical arguments, then it is a mistake to argue about these ideas as if they are the key cruxes generating disagreement about AI doom.
I think the word "technical" is a red herring here. If someone tells me a flood is coming, I don't much care how much they know about hydrodynamics, even if in principle this knowledge might allow me to model the threat with more confidence. Rather, I care about things like e.g. how sure they are about the direction from which the flood is coming, about the topography of our surroundings, etc. Personally, I expect I'd be much more inclined to make large/confident updates on the basis of information at levels of abstraction like these, than at levels about e.g. hydrodynamics or particle physics or so forth, however much more "technical," or related-in-principle in some abstract reductionist sense, the latter may be.
I do think there are also many arguments beyond this simple one which clearly justify additional (and more confident) concern. But I try to assess such arguments based on how compelling they are, where "technical precision" is one, but hardly the only factor which might influence this; e.g., another is whether the argument even involves the relevant level of abstraction, or bears on the question at hand.
No, the point is that AI x-risk is commonsensical. "If you drink much from a bottle marked poison it is certain to disagree with you sooner or later," even if you don't know the poison's mechanism of action. We don't expect Newtonian mechanics to prove that hitting yourself with a brick is quite safe; if we found that Newtonian mechanics predicted hitting yourself with a brick to be safe, that would be strong evidence that Newtonian mechanics is wrong. Good theories usually support common intuitions.
The other thing here is an isolated demand for rigor: there is no "technical understanding of today’s deep learning systems" which predicts, say, the success of AGI labs, or that their final products are going to be safe.
If we accept your interpretation—that AI doom is simply the commonsense view—then doesn’t that actually reinforce my point? It suggests that the central concern driving AI doomerism isn't a set of specific technical arguments grounded in the details of deep learning. Instead, it's based on broader and more fundamental intuitions about the nature of artificial life and its potential risks. To borrow your analogy: the belief that a brick falling on someone’s head would cause them harm isn’t ultimately rooted in technical disputes within Newtonian mechanics. It’s based on ordinary, everyday experience. Likewise, our conversations about AI doom should focus on the intuitive, commonsense cruxes behind it, rather than pretending that the real disagreement comes from highly specific technical deep learning arguments. Instead of undermining my comment, I think your point actually strengthens it.
I don’t think the mainline doom arguments claim to be rooted in deep learning?
Mostly they’re rigorized intuitive models about the nature of agency/intelligence/goal-directedness, which may go some way toward explaining certain phenomena we see in the behavior of LLMs (e.g. the Palisade Stockfish experiment). They’re theoretical arguments related to a broad class of intuitions, and in many cases they predate deep learning as a paradigm.
We can (and many do) argue over whether our lens ought to be top-down or bottom-up, but leaning toward the top-down approach isn’t the same thing as relying on the unrigorous anxieties some felt 100 years ago.
Contemporary AI existential risk concerns originated prior to it being obvious that a dangerous AI would likely involve deep learning, so no one could claim that the arguments that existed in ~2010 involved technical details of deep learning, and you didn't need to find anything written in the 19th century to establish this.
once the coins are delivered up to it, lifts and balances each in turn for the fraction of an instant, finds it wanting or sufficient, and dismisses it to right or left with rigorous justice
Nineteenth century novelists wrote with such beautiful elegance. Do any modern writers write comparable sentences?
I’m reading George Eliot’s Impressions of Theophrastus Such (1879)—so far a snoozer compared to her novels. But chapter 17 surprised me for how well it anticipated modern AI doomerism.
In summary, Theophrastus is in conversation with Trost, who is an optimist about the future of automation and how it will free us from drudgery and permit us to further extend the reach of the most exalted human capabilities. Theophrastus is more concerned that automation is likely to overtake, obsolete, and atrophy human ability.
Among Theophrastus’s concerns:
- Machines may come to require less and less human tendance, and may eventually evolve self-supply, self-repair, and reproduction.
- Once machines can sustain and reproduce themselves, natural selection "must drive men altogether out of the field."
- Human abilities would atrophy as the call on them diminishes, leaving a feebler race to be superseded by unconscious but vastly more capable machinery.
I attach the chapter below:
My friend Trost, who is no optimist as to the state of the universe hitherto, but is confident that at some future period within the duration of the solar system, ours will be the best of all possible worlds—a hope which I always honour as a sign of beneficent qualities—my friend Trost always tries to keep up my spirits under the sight of the extremely unpleasant and disfiguring work by which many of our fellow-creatures have to get their bread, with the assurance that “all this will soon be done by machinery.” But he sometimes neutralises the consolation by extending it over so large an area of human labour, and insisting so impressively on the quantity of energy which will thus be set free for loftier purposes, that I am tempted to desire an occasional famine of invention in the coming ages, lest the humbler kinds of work should be entirely nullified while there are still left some men and women who are not fit for the highest.
Especially, when one considers the perfunctory way in which some of the most exalted tasks are already executed by those who are understood to be educated for them, there rises a fearful vision of the human race evolving machinery which will by-and-by throw itself fatally out of work. When, in the Bank of England, I see a wondrously delicate machine for testing sovereigns, a shrewd implacable little steel Rhadamanthus that, once the coins are delivered up to it, lifts and balances each in turn for the fraction of an instant, finds it wanting or sufficient, and dismisses it to right or left with rigorous justice; when I am told of micrometers and thermopiles and tasimeters which deal physically with the invisible, the impalpable, and the unimaginable; of cunning wires and wheels and pointing needles which will register your and my quickness so as to exclude flattering opinion; of a machine for drawing the right conclusion, which will doubtless by-and-by be improved into an automaton for finding true premises; of a microphone which detects the cadence of the fly’s foot on the ceiling, and may be expected presently to discriminate the noises of our various follies as they soliloquise or converse in our brains—my mind seeming too small for these things, I get a little out of it, like an unfortunate savage too suddenly brought face to face with civilisation, and I exclaim—
“Am I already in the shadow of the Coming Race? and will the creatures who are to transcend and finally supersede us be steely organisms, giving out the effluvia of the laboratory, and performing with infallible exactness more than everything that we have performed with a slovenly approximativeness and self-defeating inaccuracy?”
“But,” says Trost, treating me with cautious mildness on hearing me vent this raving notion, “you forget that these wonder-workers are the slaves of our race, need our tendance and regulation, obey the mandates of our consciousness, and are only deaf and dumb bringers of reports which we decipher and make use of. They are simply extensions of the human organism, so to speak, limbs immeasurably more powerful, ever more subtle finger-tips, ever more mastery over the invisibly great and the invisibly small. Each new machine needs a new appliance of human skill to construct it, new devices to feed it with material, and often keener-edged faculties to note its registrations or performances. How then can machines supersede us?—they depend upon us. When we cease, they cease.”
“I am not so sure of that,” said I, getting back into my mind, and becoming rather wilful in consequence. “If, as I have heard you contend, machines as they are more and more perfected will require less and less of tendance, how do I know that they may not be ultimately made to carry, or may not in themselves evolve, conditions of self-supply, self-repair, and reproduction, and not only do all the mighty and subtle work possible on this planet better than we could do it, but with the immense advantage of banishing from the earth’s atmosphere screaming consciousnesses which, in our comparatively clumsy race, make an intolerable noise and fuss to each other about every petty ant-like performance, looking on at all work only as it were to spring a rattle here or blow a trumpet there, with a ridiculous sense of being effective? I for my part cannot see any reason why a sufficiently penetrating thinker, who can see his way through a thousand years or so, should not conceive a parliament of machines, in which the manners were excellent and the motions infallible in logic: one honourable instrument, a remote descendant of the Voltaic family, might discharge a powerful current (entirely without animosity) on an honourable instrument opposite, of more upstart origin, but belonging to the ancient edge-tool race which we already at Sheffield see paring thick iron as if it were mellow cheese—by this unerringly directed discharge operating on movements corresponding to what we call Estimates, and by necessary mechanical consequence on movements corresponding to what we call the Funds, which with a vain analogy we sometimes speak of as “sensitive.” For every machine would be perfectly educated, that is to say, would have the suitable molecular adjustments, which would act not the less infallibly for being free from the fussy accompaniment of that consciousness to which our prejudice gives a supreme governing rank, when in truth it is an idle parasite on the grand sequence of things.”
“Nothing of the sort!” returned Trost, getting angry, and judging it kind to treat me with some severity; “what you have heard me say is, that our race will and must act as a nervous centre to the utmost development of mechanical processes: the subtly refined powers of machines will react in producing more subtly refined thinking processes which will occupy the minds set free from grosser labour. Say, for example, that all the scavengers’ work of London were done, so far as human attention is concerned, by the occasional pressure of a brass button (as in the ringing of an electric bell), you will then have a multitude of brains set free for the exquisite enjoyment of dealing with the exact sequences and high speculations supplied and prompted by the delicate machines which yield a response to the fixed stars, and give readings of the spiral vortices fundamentally concerned in the production of epic poems or great judicial harangues. So far from mankind being thrown out of work according to your notion,” concluded Trost, with a peculiar nasal note of scorn, “if it were not for your incurable dilettanteism in science as in all other things—if you had once understood the action of any delicate machine—you would perceive that the sequences it carries throughout the realm of phenomena would require many generations, perhaps aeons, of understandings considerably stronger than yours, to exhaust the store of work it lays open.”
“Precisely,” said I, with a meekness which I felt was praiseworthy; “it is the feebleness of my capacity, bringing me nearer than you to the human average, that perhaps enables me to imagine certain results better than you can. Doubtless the very fishes of our rivers, gullible as they look, and slow as they are to be rightly convinced in another order of facts, form fewer false expectations about each other than we should form about them if we were in a position of somewhat fuller intercourse with their species; for even as it is we have continually to be surprised that they do not rise to our carefully selected bait. Take me then as a sort of reflective and experienced carp; but do not estimate the justice of my ideas by my facial expression.”
“Pooh!” says Trost. (We are on very intimate terms.)
“Naturally,” I persisted, “it is less easy to you than to me to imagine our race transcended and superseded, since the more energy a being is possessed of, the harder it must be for him to conceive his own death. But I, from the point of view of a reflective carp, can easily imagine myself and my congeners dispensed with in the frame of things and giving way not only to a superior but a vastly different kind of Entity. What I would ask you is, to show me why, since each new invention casts a new light along the pathway of discovery, and each new combination or structure brings into play more conditions than its inventor foresaw, there should not at length be a machine of such high mechanical and chemical powers that it would find and assimilate the material to supply its own waste, and then by a further evolution of internal molecular movements reproduce itself by some process of fission or budding. This last stage having been reached, either by man’s contrivance or as an unforeseen result, one sees that the process of natural selection must drive men altogether out of the field; for they will long before have begun to sink into the miserable condition of those unhappy characters in fable who, having demons or djinns at their beck, and being obliged to supply them with work, found too much of everything done in too short a time. What demons so potent as molecular movements, none the less tremendously potent for not carrying the futile cargo of a consciousness screeching irrelevantly, like a fowl tied head downmost to the saddle of a swift horseman? Under such uncomfortable circumstances our race will have diminished with the diminishing call on their energies, and by the time that the self-repairing and reproducing machines arise, all but a few of the rare inventors, calculators, and speculators will have become pale, pulpy, and cretinous from fatty or other degeneration, and behold around them a scanty hydrocephalous offspring. As to the breed of the ingenious and intellectual, their nervous systems will at last have been overwrought in following the molecular revelations of the immensely more powerful unconscious race, and they will naturally, as the less energetic combinations of movement, subside like the flame of a candle in the sunlight. Thus the feebler race, whose corporeal adjustments happened to be accompanied with a maniacal consciousness which imagined itself moving its mover, will have vanished, as all less adapted existences do before the fittest—i.e., the existence composed of the most persistent groups of movements and the most capable of incorporating new groups in harmonious relation. Who—if our consciousness is, as I have been given to understand, a mere stumbling of our organisms on their way to unconscious perfection—who shall say that those fittest existences will not be found along the track of what we call inorganic combinations, which will carry on the most elaborate processes as mutely and painlessly as we are now told that the minerals are metamorphosing themselves continually in the dark laboratory of the earth’s crust? Thus this planet may be filled with beings who will be blind and deaf as the inmost rock, yet will execute changes as delicate and complicated as those of human language and all the intricate web of what we call its effects, without sensitive impression, without sensitive impulse: there may be, let us say, mute orations, mute rhapsodies, mute discussions, and no consciousness there even to enjoy the silence.”
“Absurd!” grumbled Trost.
“The supposition is logical,” said I. “It is well argued from the premises.”
“Whose premises?” cried Trost, turning on me with some fierceness. “You don’t mean to call them mine, I hope.”
“Heaven forbid! They seem to be flying about in the air with other germs, and have found a sort of nidus among my melancholy fancies. Nobody really holds them. They bear the same relation to real belief as walking on the head for a show does to running away from an explosion or walking fast to catch the train.”