Overwhelming Superintelligence

Raemon

LESSWRONG
LW

Overwhelming Superintelligence — LessWrong

79 Overwhelming Superintelligence

by Raemon

1st Jan 2026

1 min read

79

There's many debates about "what counts as AGI" or "what counts as superintelligence?".

Some people might consider those arguments "goalpost moving." Some people were using "superintelligence" to mean "overwhelmingly smarter than humanity". So, it may feel to them like it's watering it down if you use it to mean "spikily good at some coding tasks while still not really successfully generalizing or maintaining focus."

I think there's just actually a wide range of concepts that need to get talked about. And, right now, most of the AIs that people will wanna talk about are kinda general and kinda superintelligent and kinda aligned.

If you have an specific concept you wanna protect, I think it's better to just give it a clunky name that people don't want to use in casual conversation,^[1] rather than pumping against entropy to defend a simple term that could be defined to mean other things.

Previously OpenPhil had used "Transformative AI" to mean "AI that is, you know, powerful enough to radically transform society, somehow." I think that's a useful term. But, it's not exactly what If Anyone Builds It is cautioning about.

The type of AI I'm most directly worried about is "overwhelmingly superhuman compared to humanity." (And, AIs that might quickly bootstrap to become overwhelmingly superhuman).

I've been lately calling that Overwhelming Superintelligence.

Overwhelming Superintelligence is scary both because it's capable of strategically outthinking humanity, and, because any subtle flaws or incompatibilities between what it wants, and what humans want, will get driven to extreme levels.

I think if anyone builds Overwhelming Superintelligence without hitting a pretty narrow alignment target, everyone probably dies. (And, if not, the future is probably quite bad).

Appendix: Lots of "Careful Moderate Superintelligence"

I am separately worried about "Carefully Controlled Moderate Superintelligences that we're running at scale, each instance of which is not threatening, but, we're running a lot of them, giving them lots of room to maneuver."

This is threatening partly because at some point that they may give rise to Overwhelming Superintelligence, but, also because sharing the planet with a slightly smarter species still doesn't seem like it bodes well. (See humans, neanderthals, chimpanzees). They don't have to do anything directly threatening, just keep being very useful while subtly steering things such that they get more power in the future.

^{^}
I actually think AIdon'tkilleveryoneism is pretty good.

Frontpage

79

New Comment

30 comments, sorted by

top scoring

Click to highlight new comments since: Today at 12:36 PM

[-]Vanessa Kosoy2mo155

Why are we giving up on plain "superintelligence" so quickly? According to Wikipedia:

A superintelligence is a hypothetical agent that possesses intelligence surpassing that of the most gifted human minds. Philosopher Nick Bostrom defines superintelligence as "any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest".

According to Google AI Overview:

Superintelligence (or Artificial Superintelligence - ASI) is a hypothetical AI that vastly surpasses human intellect in virtually all cognitive domains, possessing superior scientific creativity, general wisdom, and social skills, operating at speeds and capacities far beyond human capability, and potentially leading to profound societal transformation or existential risks if not safely aligned with human goals.

I don't think I saw anyone use "superintelligence" as "better than a majority of humans on some specific tasks" before very recently. (Was DeepBlue a superintelligence? Is a calculator superintelligence?)

[-]Davidmanheim2mo80

I think the distinction is between "smarter and more capable than any human" versus "smarter and more capable than humanity as a whole"

The former is what you refer to, which could still be "Careful Moderate Superintelligence" in the view of the post.

[-]Raemon2mo5-1

Partly because I don't think a Superintelligence by that definition is actually, intrinsically, that threatening. I think it is totally possible to build That without everyone dying.

The "It" that is not possible to build without everyone dying is an intelligence that is either overwhelmingly smarter than all humanity, or, a moderate non-superintelligence that is situationally aware with the element of surprise such that it can maneuver to become overwhelmingly smarter than humanity.

I think meanwhile there are good reasons for people to want to talk about various flavors of weak superintelligence, and trying to force them to use some other word for that seems doomed.

[-]Vanessa Kosoy1mo138

There seem to be two underlying motivations here, which are best kept separate.

One motivation is having a good vocabulary to talk about fine-grained distinctions. I'm on board with this one. We might want to distinguish e.g.:

Smarter than a median human along all AI-risk-relevant axes
Smarter than the smartest human along all AI-risk-relevant axes
Smarter than all of humanity put together along all AI-risk-relevant axes
Smart enough to have a 50% success probability to kill all humans if it chooses to, given current level of countermeasures
Smart enough to have a 50% success probability to kill all humans if it chooses to, even if best-case countermeasures are in place (this particular distinction inspired by Buck's comments on this thread)

But then, first, it is clear that existing AI is not superintelligence according to any of the above interpretations. Second, I see no reason not to use catchy words like "hyperintelligence", per One's suggestion. (Although I agree that there is an advantage to choosing more descriptive terms.)

Another motivation is staying ahead of the hype cycles and epistemic warfare on twitter or whatnot. This one I take issue with.

I don't have an account on twitter, and I hope that I never will have. Twisting ourselves into pretzels with ridiculous words like "AIdon'tkilleveryoneism" is incompatible with creating a vocabulary optimized for actually thinking and having productive discussions among people who are trying to be the adults in the room. Let the twitterites use whatever anti-language they want. The people trying to do beneficial politics there: I sincerely wish you luck, but I'm laboring in a different trench, and let's use the proper tool for each task separately.

I understand that there can be practical difficulties such as, what if LW ends up using a language so different from the outside world that it will become inaccessible to outsiders, even when those outsiders would otherwise make valuable contributions. There are probably some tradeoffs that are reasonable to make with such considerations in mind. But let's at least not abandon any linguistic position at the slightest threatening gesture of the enemy.

[-]Raemon1mo20

Two categories that don't quite match the ones you laid out here.

I think there is something like "being a good citizen when trying to create jargon." Don't pick a word that everyone will predictably misunderstand, or will predictably really want to use for some other more common thing, if you want to also be able to have conversations with that "everyone."

This isn't (primarily) about fighting political/hype cycles, it's just... like, well one negative example I updated on: Eliezer defines meta-honesty to be "be at least as honest as a highly honest person AND ALSO always be honest about under what circumstances you will be honest." He tacked on the first part for a reason (to avoid accidentally encouraging people to use "metahonesty" for clever self-serving arguments). But, frankly, "metahonesty" is a pretty self-explanatory word if it just means the second thing, and most people will probably interpret it to mean just the "be honest about being honest" part.

I think the bundle-of-concepts Eliezer wanted to point to should be called something more like "Eliezer's Code of (Meta)-honesty" or something catchier but more oddly specific. And let "metahonesty" just be a technical term that isn't also trying to be a code of honor, that means what it sounds like it should mean.

...

Also, re: staying ahead of a political race. It's kinda reasonable to just Not Wanna Play That Game, but, note that a lot of the stakes here is not "doing politics Out There somewhere", it's having terminology that keeps making sense in the intellectual circles. If most of the people studying AI, even from perspective of AI safety, end up studing "weak superintelligences", trying to preserve a definition that uses it to always mean "overwhelmingly strong" is setting yourself up for a lot of annoying conversations just while trying to discuss concepts intellectually.

[-]Seth Herd2mo31

I think the explicit suggestion is to retreat to a more specific term rather than fight against the co-option of superintelligence to hype spikily human-level AI.

I agree that superintelligence has the right usage historically, and the right intuitive connotation.

Superman isn't slightly stronger than the strongest human, let alone the average. He's in a different category. That's what's evoked. But technically super just means better, so slightly better than human technically qualifies. So I see the term getting steadily co-opted for marketing, and agree we should have a separate term.

[-]Nisan2mo102

Consider the subtle difference between:

An AI smart enough to defeat all of humanity, if all of humanity coordinated to stop the AI.
An AI smart enough to defeat an institution as powerful as the United States, if the US coordinated to stop the AI.
An AI smart enough to defeat an institution as powerful as Google + OpenAI, if they coordinated to stop the AI.
Etc.

If humanity manages to muster only a fraction of our power to oppose a particular AI system, the AI system might win that contest without being as smart as the first item in the list.

[-]Buck2mo102

I agree that it's useful to have the concept of "Superintelligence that is so qualitatively intelligent that it's very hard for us to be confident about what it will or won't be able to accomplish, even given lots of constraints and limited resources." I usually use "galaxy-brained superintelligence" for this in conversation, but obviously that's a kind of dumb term. Maybe "massively qualitatively superintelligent" works? Bostrom uses "quality superintelligence".

OpenPhil talked about the concept of "transformative AI" specifically because they were trying to talk about a broader class of AIs (though galaxy-brained superintelligence was a core part of their concern).

I don't love "overwhelmingly superintelligent" because AIs don't necessarily have to be qualitatively smarter than humanity to overwhelm it—whether we are overwhelmed by AIs that are "superintelligent" (in the weak sense that they're qualitatively more intelligent than any human) IMO is affected by the quality of takeover countermeasures in place.

The type of AI I'm most directly worried about is "overwhelmingly superhuman compared to humanity." (And, AIs that might quickly bootstrap to become overwhelmingly superhuman).

I think it's a mistake to just mention that second thing as a parenthetical. There's a huge difference between AIs that are already galaxy-brained superintelligences and AIs that could quickly build galaxy-brained superintelligences or modify themselves into galaxy-brained superintelligences—we should try to prevent the former category of AIs from building galaxy-brained superintelligences in ways we don't approve of.

[-]Raemon1mo50

I don't love "overwhelmingly superintelligent" because AIs don't necessarily have to be qualitatively smarter than humanity to overwhelm it

I think this more feature-than-bug – the problem is that it's overwhelming. There are multiple ways to be overwhelming, what we want to avoid is a situation where an overwhelming, unfriendly AI exists. One way is not build AI of a given power level. The other is to increase the robustness of civilization. (I agree the term is fuzzy, but I think realistically the territory is fuzzy).

I think it's a mistake to just mention that second thing as a parenthetical. There's a huge difference between AIs that are already galaxy-brained superintelligences and AIs that could quickly build galaxy-brained superintelligences or modify themselves into galaxy-brained superintelligences—we should try to prevent the former category of AIs from building galaxy-brained superintelligences in ways we don't approve of.

(did you mean "latter category?")

Were you suggesting something other than "remove the parentheses?" Or did it seem like I was thinking about it in a confused way? Not sure which direction you thought the mistake was in.

(I think "already overwhelmingly strong" and "a short hop away from being overwhelming strong" are both real worrisome. The latter somewhat less worrisome, although I'd really prefer not building either until we are much more confident about alignment/intepretability)

[-]Buck1mo62

I think this more feature-than-bug – the problem is that it's overwhelming. There are multiple ways to be overwhelming, what we want to avoid is a situation where an overwhelming, unfriendly AI exists. One way is not build AI of a given power level. The other is to increase the robustness of civilization. (I agree the term is fuzzy, but I think realistically the territory is fuzzy).

When you're thinking about how to mitigate the risks, it really matters which of these we're talking about. I think there is some level of AI capability at which it's basically hopeless to control the AIs; this is what I use "galaxy-brained superintelligence" to refer to. If you just want to talk about AIs that pose substantial risk of takeover, you probably shouldn't use the word superintelligence in there, because they don't obviously have to be superintelligences to pose takeover risk. (And it's weird to use "overwhelmingly" as an adverb that modifies "superintelligent", because the overwhelmingness isn't about the level of intelligence, it's about that and also the world. You could say "overwhelming, superintelligent AI" if you want to talk specifically about AIs that are overwhelming and also superintelligent, but that's normally not what we want to talk about.)

[-]Raemon1mo20

I might retract the exact phrasing of my reply comment.

I think I was originally using overwhelmingly basically the way you're using "galaxy brained", and I feel like I have quibbles about the exact semantics of that phrase that feel about as substantial as your concern about overwhelming. (i.e. there is also a substantive difference between a very powerful brain hosted in a datacenter on Earth, and an AI that with a galaxy of resources)

What I mean by "overwhelmingly superintelligent" is "so fucking smart that humanity would have to have qualitatively changed in a similar orders-of-magnitude degree", which probably in practice means humans also have to have augmented their own intelligence, or have escalated their AI control schemes pretty far, carefully wielding significantly-[but-not-overwhelming/galaxy-brained]-AI that oversees all of Earth's security and is either aligned or the humans are really at threading the needle on control for quite powerful systems.

[-]Buck1mo42

Were you suggesting something other than "remove the parentheses?" Or did it seem like I was thinking about it in a confused way? Not sure which direction you thought the mistake was in.

I think that it is worth conceptually distinguishing AIs that are uncontrollable from AIs that are able to build uncontrollable AIs, because the way you should handle those two kinds of AI are importantly different.

[-]Raemon1mo20

I think I agree with that and didn't think of my post as claiming otherwise?

[-]One2mo61

perhaps "hyperintelligence" could be used for this

[-]Raemon2mo31

I think it's better to say words that mean particular things than trying to fight a treadmill of super/superduper/hyper/etc

[-]plex1mo20

I think of Decisive Strategic Advantage as the key differentiator, but not sure how best to make that into a short handle.

[-]Vanessa Kosoy1mo20

I am separately worried about "Carefully Controlled Moderate Superintelligences that we're running at scale, each instance of which is not threatening, but, we're running a lot of them...

I think that this particular distinction is not the critical one. What constitutes an "instance" is somewhat fuzzy. (A single reasoning thread? A system with a particular human/corporate owner? A particular source code? A particular utility function?) I think it's more useful to think in terms of machine intelligence suprasystems with strong internal coordination capabilities. That is, if we're somehow confident that the "instances" can't or won't coordinate either causally or acausally, then they are arguably truly "instances", but the more they can coordinate the more we should be thinking of them in the aggregate. (Hence, the most cautious risk estimate comes from comparing the sum total of all machine intelligence against the sum total of all human intelligence^[1].)

^{^}
More precisely, not even the sum total of all human intelligence, but the fraction of human intelligence that humans can effectively coordinate. See also comment by Nisan.

[-]Raemon2mo20

(I think at least part of what's going on is that there is a separate common belief that Superintelligent (compared to the single best humans) is enough to bootstrap to Overwhelming Superintelligence, and some of the MIRI vs Redwood debates are about how necessarily true that is)

[-]Buck2mo*40

I don't really understand what you're saying. I think it's very likely that [ETA: non-galaxy-brained] superintelligent AIs will be able to build galaxy-brained superintelligences within months to years if they are given (or can steal) the resources needed to produce them. I don't think it's obvious that they can do this with extremely limited resources.

[-]Raemon1mo20

I think (unconfidently guessing) that Eliezer is more bullish than you on "they can do this with pretty limited resources", and this leads to him caring less about the distinction between "weakly superhuman" and "overwhelmingly superhuman".

[-]Bronson Schoen1mo10

What would be keeping the resources extremely limited in this scenario? My understanding was control was always careful to specify that it was targeting the “near human level” regime.

[-]Buck1mo42

Yeah, I think control is unlikely to work for galaxy brained superintelligences. It's unclear how superintelligent they have to be before control is totally unworkable.

[-]Raemon1mo21

I think that's consistent with what Buck just said. (I interpreted him to be using superintelligent AI here to mean "near human level", and that those AIs would be able to develop successor galaxy-brain AI if they had enough resources, but, if you have sufficiently controlled them, they hopefully won't)

[-]Seth Herd2mo20

How about takeover-capable AI?

I've been thinking about this issue a fair amount, and that's my nomination. It points directly at what we care about. And it doesn't have the implication that an AI would need to be a whole different category of intelligence to take over. Your neanderthal example and the correction is relevant here: they're gone because sapiens had varied advantages, not because they were cleanly outclassed in intelligence.

Individual humans have taken over most of the world many times while being smarter than those around them only in pretty limited ways. It's important to consider scifi takeover scenarios, but old-fashioned social dominance ("hey it's better for you if you listen to me," applied iteratively) is also great, and would suffice.

[-]Dacyn2mo24

also because sharing the planet with a slightly smarter species still doesn’t seem like it bodes well. (See humans, neanderthals, chimpanzees).

From what I can tell from a quick Google search, current evidence doesn't show that neanderthals were any less smart than humans.

[-]Raemon2mo20

Yeah I don't super stand by the Neanderthal comment, was just grabbing an illustrative example.

I just did a heavy-thinking GPT-5 search, which said "we don't know for sure, there's some evidence that, on an individually they may have been comparably smart as us, but, we seem to have had the ability to acquire and share innovations." This might not be a direct intelligence thing, but, "having some infrastructure that makes you collectively smarter as a group" still counts for my purposes.

[-]Random Developer1mo10

I think if anyone builds Overwhelmed Superintelligence without hitting a pretty narrow alignment target, everyone probably dies.

I fear that even in most of the narrow cases where the superintelligence is controlled, we're probably still pretty thoroughly screwed. Because then you need to ask, "Precisely who controls it?" Given a choice between Anthropic totally losing control of a future Claude, and Sam Altman having tight personal control over GPT Omega ("The last GPT you'll ever build, humans"), which scenario is actually the most scary? (If you have a lot of personal trust in Sam Altman, substitute your least favorite AI lab CEO or a small committee of powerful politicians from a party you dislike.)

also because sharing the planet with a slightly smarter species still doesn't seem like it bodes well. (See humans, neanderthals, chimpanzees).

Yeah, unless you believe in ridiculously strong forms of alignment, and unprecedentedly good political systems to control the AIs, the whole situation seems horribly unstable. I'm slightly more optimistic about early AGI alignment than Yudkowsky, but I actually might be more pessimistic about the long term.

[-]jrincayc2mo10

Overwhelming superintelligence sounds like a useful term. A term I started using is independence gaining artificial general intelligence as the threshold for when we need to start being concerned about the AGI's alignment. An AI program that is sufficiently intelligent to be able to gain independence, such as by creating a self-replicating computer capable of obtaining energy and other things needed to achieve goals without any further assistance from humans.

For example, an independence gaining AGI connected to today's internet might complete intellectual tasks for money and then use the money to mail order printed circuit boards and other hardware. An independence gaining AGI with access to 1800s level technology might mine coal and build a steam engine to power a Babbage-like computer and then bootstrap to faster computing elements. An independence gaining AGI on Earth's moon might be able to produce solar panels and CPUs from the elements in the moon's crust, and produce an electromagnetic rail to launch probes off the moon. Of course, how smart the AGI has to be to gain independence is a function of what kind of hardware the AGI can get access to. An overwhelming superintelligence might be able to take over the planet with just access to a hardware random number generator and a high precision timer, but a computer controlling a factory could probably be less intelligent and still be able to gain independence.

One of the reasons I started using the term is because human level AGI is vague, and we don't know if we should be concerned by a human level AGI. Also, to determine if something is human level, we need to specify human level in what? 1950s computers were superhuman at arithmetic, but not chess, so is a 1950s computer human level or not? It may be hard to determine of a given computer + software is capable of gaining independence, but it is a more exact definition than just human level AGI.

[-]bfinn2mo21

How about ‘out-of-control superintelligence’? (Either because it’s uncontrollable or at least not controlled.) Which carries the appropriately alarming connotations that it’s doing its own thing and that we can’t stop it (or aren’t doing so anyway)

[-]jrincayc2mo30

I think this may be proving Raemon's point that there are a wide range of concepts. I consider the lower amount of alignment connotation of independence gaining a feature, not a bug, since we can say things like ethical independence gaining AGI or aligned independence gaining AGI without it sounding like an oxymoron. Also, I am not sure superintelligence is required to gain independence, since it may be possible to just think longer than a human to gain independence without thinking faster. That said, if out-of-control superintelligence is the right concept you are trying to get across, then use that.

Moderation Log