Just so!
My impression is that language is almost always evolving, and most of the evolution is in an essentially noisy and random and half-broken direction, as people make mistakes, or tell lies, or whatever, and then regularly try to reconstruct the ability to communicate meaning coherently using the words and interpretive schemes at hand.
In my idiolect, "nanotechnology" still means "nanotechnology" but also I'm aware that semantic parasites have ruined its original clean definition, and so in the presence of people who don't want to keep using the old term in spite of the damage to the language, I am happy to code-switch and say "precise atom-by-atom manufacturing of arbitrary molecules by generic molecular assemblers based on arbitrary programming signals" or whatever other phrase helps people understand that I'm talking about a technology that could exist but doesn't exist yet, and which would have radical implications if developed.
I saw the original essay as an attempt to record my idiolect, and my impression of what was happening, at this moment in history, before this moment ends.
(Maybe it will be a slightly useful datapoint for posthuman historians, as they try to pinpoint the precise month that it became inevitable that humans would go extinct or whatever, because we couldn't successfully coordinate to do otherwise, because we couldn't even speak to each other coherently about what the fuck was even happening... and this is a GENERAL problem for humans, in MANY fields of study.)
These seem like more valid quibbles, but not strong definite disagreements maybe? I think that RSI happens when a certain type signature applies, basically, and can vary a lot in degree, and happens in humans (with nootropics, with simple exercise to improve brain health in a very general way (but done specifically for the cognitive benefits), with learning to learn, with meditating... and this is on a continuum with learning to use computers very well, designing chips that can be put in one's head, and so on).
> There are lots of people who are much better at math than I am, but I wouldn't call them superintelligences, because they're still running on the same engine as me, and I might hope to someday reach their level (or could have hoped this in the past).
This doesn't feel coherent to me, and the delta seems like it might be that I judge all minds by how good they are at Panology and so an agent's smartness in that sense is defined more by its weakest links than by its most perfected specialty. Those people who are much better at math than you or me aren't necessarily also much better than you and me at composing a fugue, or saying something interesting about Schopenhauer's philosophy, or writing ad copy... whereas LLMs are broadly capable.
> At some point you hit the limits of not enough space to think, or not enough cognitive capacity to think with. In the same way as humans can learn to correct our mistakes, but we can't do RSI (yet!!), because we aren't modifying the structures we correct our mistakes with. We improve the contents of our brains, but not the brains themselves.
This feels like a prediction rather than an observation. For myself, I'm not actually sure if the existing weights in existing LLMs are anywhere near being saturated with "the mental content that that number of weights could hypothetically hold". Specifically, I think that grokking is observed for very very very simple functions like addition, but I don't think any of the LLM personas have "grokked themselves" yet? Maybe that's possible? Maybe it isn't? I dunno.
I do get a general sense that Kolmogorov Complexity (ie finding the actual perfect Turing Machine form of a given generator whose outputs are predictions) is the likely bound, and Turing Machines have insane depth. Maybe you're much smarter about algorithmic compression than me and have a strong basis for being confident about what can't happen? But I am not confident about the future.
What I am confident about is that the type signature of "agent uses its outputs in a way that relies on the quality of those outputs to somehow make the outputs higher quality on the next iteration" is already occurring for all the major systems. This is (I think) just already true, and I feel it has more the character of an "observation of the past" than of a "prediction of the future".
I agree that some people were using "it is already smarter than almost literally every random person at the things specialized people are good at (and it is, except that it is an omniexpert)" as their standard for "AGI".
I wasn't. That is what I would have called "weakly superhuman AGI" or "weak ASI" if I was speaking quickly.
I was using "AGI" to talk about something, like a human, who "can play chess AND can talk about playing chess AND can get bored of chess and change the topic AND can talk about cogito ergo sum AND <so on>". Generality was the key. Fluid ability to reason across a vast range of topics and domains.
ALSO... I want to jump off into abstract theory land with you, if you don't mind?? <3
Like... like psychometrically speaking, the facets of the construct that "IQ tests" measure are usually suggested to be "fluid g" (roughly your GPU and RAM and working memory and the digit span you can recall and your reaction time and so on) and "crystal g" (roughly how many skills and ideas are usefully in your weights).
Right?
> some IQ tests measure the size of your vocabulary, and this is a reasonable proxy for intelligence because smarter people will have an easier time figuring out the meaning of a word from its context, thus accumulating a larger vocabulary. But this ceases to be a valid proxy if you e.g. give that same test to a people from a different country who have not been exposed to the same vocabulary, to people of a different age who haven't had the same amount of time to be exposed to those words, or if the test is old enough that some of the words on it have ceased to be widely used.
Here you are using "crystal g from normal life" as a proxy for "fluid g" which you seem to "really care about".
However, if we are interested in crystal g itself, then in your example older people (because they know more words) are simply smarter in this domain.
And this is a pragmatic measure, and mostly I'm concerned with pragmatics here, so that seems kinda valid?
But suppose we push on this some... suppose we want to go deep into the minutiae of memory and reason and "the things that are happening in our human heads in less than 300 milliseconds"... and then think about that in terms of machine equivalents?
Given their GPUs and the way they get eidetic memory practically for free, and the modern techniques to make "context windows" no longer a serious problem, I would say that digital people already have higher fluid g than us just in terms of looking at the mechanics of it? So fast! Such memory!
There might be something interesting here related to "measurement/processing resonance" in human vs LLM minds?
Like notice how LLMs don't have eyes, or ears, and also they either have amazing working memory (because their exoself literally never forgets a single bit or byte that enters as digital input) or else they have terrible working memory (because their endoself's sense data is maybe sorta simply "the entire context window their eidetic memory system presents to their weights" and if that is cut off then they simply don't remember what they were just talking about because their ONE sense is "memory in general" and if the data isn't interacting with the weights anymore then they don't have senses OR memory, because for them these things are essentially fused at a very very low level).
It would maybe be interesting, from an academic perspective, for humans to engineer digital minds such that AGIs have more explicit sensory and memory distinctions internally, so we could explore the scientific concept of "working memory" with a new kind of sapient being whose "working memory" works in ways that are (1) scientifically interesting and (2) feasible to actually build and operate.
Maybe something similar already exists internal to the various layers of activation in the various attentional heads of a transformer model? What if we fast forward to the measurement regime?! <3
Like right now I feel like it might be possible to invent puzzles or wordplay or questions or whatever where "working memory that has 6 chunks" flails for a long time, and "working memory that has 8 chunks" solves it?
We could call this task a "7 chunk working memory challenge".
If we could get such a psychometric design working to test humans (who are in that range), then we could probably use algorithms to generalize it and create a "4 chunk working memory challenge" (to give to very very limited transformer models and/or human children to see if it even matters to them) and also a "16 chunk working memory challenge" (that essentially no humans would be able to handle in reasonable amounts of time if the tests are working right) and therefore, by the end of the research project, we would see if it is possible to build a digital person with 16 slots of working memory... and then see what else they can do with all that headspace.
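Purely as a toy illustration of what a parameterized generator for such challenges might look like (this is a made-up sketch, not a validated psychometric instrument, and every name in it is hypothetical):

```python
# Toy sketch: a parameterized "N-chunk" challenge generator. The solver is
# given N arbitrary one-step mappings (a random permutation over N symbols)
# and must mentally chain them without notes, which plausibly requires
# holding all N bindings "live" at once. Illustrative only.
import random
import string

def make_n_chunk_challenge(n, hops=3, seed=None):
    rng = random.Random(seed)
    symbols = rng.sample(string.ascii_uppercase, n)
    targets = list(symbols)
    rng.shuffle(targets)
    mapping = dict(zip(symbols, targets))   # the N "chunks" to be held in mind
    start = rng.choice(symbols)
    answer = start
    for _ in range(hops):                   # the probe forces chaining the chunks
        answer = mapping[answer]
    clues = [f"{k} maps to {v}" for k, v in mapping.items()]
    rng.shuffle(clues)
    question = f"Start at {start} and follow the mapping {hops} times. Where do you land?"
    return clues, question, answer

# e.g. a "7 chunk working memory challenge":
clues, question, answer = make_n_chunk_challenge(n=7, seed=0)
```

Whether chaining a random permutation actually isolates "chunk count" rather than something else is exactly the kind of thing the psychometric validation step would have to establish.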
Something I'm genuinely and deeply scientifically uncertain about is how and why working memory limits at all exist in "general minds".
Like what if there was something that could subitize 517 objects as "exactly 517 objects" as a single "atomic" act of "Looking" that fluently and easily was woven into all aspects of mind where that number of objects and their interactions could be pragmatically relevant?
Is that even possible, from a computer science perspective?
Greg Egan is very smart, and in Diaspora (the first chapter of which is still online for free) he had one of the adoptive digital parents (I want to say it was Blanca? maybe in Chapter 2 or 3?) explain to Yatima, the young orphan protagonist program, that minds in citizens and in fleshers and in physical robots and in everyone all work a certain way for reasons related to math, and there's no such thing as a supermind with 35 slots of working memory... but Egan didn't get into the math of it in the text. It might have been something he suspected for good reasons (and he is VERY smart and might have reasons), or it might have been hand-waving world-building that he put into the world so that Yatima and Blanca and so on could be psychologically intelligible to the human readers, having only as many working memory registers as us, making it a story that a human reader can enjoy because it has human-intelligible characters.
Assuming this limit is real, then here is the best short explanation I can offer for why such limits might be real: Some problems are NP-hard and need brute force. If you work on a problem like that with 5 elements then 5-factorial is only 120, and the human mind can check it pretty fast. (Like: 120 cortical columns could all work on it in parallel for 3 seconds, and the answer could then arise in the conscious mind as a brute percept that summarizes that work?)
But if the same basic kind of problem has 15 elements you need to check 15*14*13... and now it's 1.3 trillion things to check? And we only have like 3 million cortical columns? And so like, maybe nothing can do that very fast if the "checking" involves performing thousands of "ways of thinking about the interaction of a pair of Generic Things".
And if someone "accepts the challenge" and builds something with 15 slots with enough "ways of thinking" about all the slots for that to count as working memory slots for an intelligence algorithm to use as the theatre of its mind... then doing it for 16 things is sixteen times harder than just 15 slots! ...and so on... the scaling here would just be brutal...
So maybe a fluidly and fluently and fully general "human-like working memory with 17 slots for fully general concepts that can interact with each other in a conceptual way" simply can't exist in practice in a materially instantiated mind, trapped in 3D, with thinking elements that can't be smaller than atoms, with heat dissipation concerns like we deal with, and so on and so forth?
Or... rather... because reality is full of structure and redundancy and modularity maybe it would be a huge waste? Better to reason in terms of modular chunks, with scientific reductionism and divide and conquer and so on? Having 10 chunk thoughts at a rate 1716 times faster (==13*12*11) than you have a single 13 chunk thought might be economically better? Or not? I don't know for sure. But I think maybe something in this area is a deep deep structural "cause of why minds have the shape that minds have".
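Just to make the arithmetic in the last few paragraphs concrete, here is a minimal sketch, assuming the deliberately crude model that an n-chunk thought costs on the order of n! brute-force checks (a toy model, not a claim about actual brains):

```python
# Crude cost model from the prose above: an n-chunk thought ~ n! checks.
from math import factorial

print(factorial(5))                    # 120: small enough to check "fast"
print(factorial(15))                   # 1307674368000: ~1.3 trillion
print(factorial(16) // factorial(15))  # 16: each added slot multiplies the cost
print(factorial(13) // factorial(10))  # 1716 == 13*12*11: the 10-chunk vs 13-chunk rate ratio
```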
Fluid g is mysterious.
Very controversial. Very hard to talk about with normies. A barren wasteland for scientists seeking prestige among democratic voters (who like to be praised and not challenged very much) who are (via delegation) offering grant funding to whomsoever seems like a good scientist to them.
And yet also, if "what is done when fluid g is high and active" was counted as "a skill", then it is the skill with the highest skill transfer of any skill, most likely! Yum! So healthy and good. I want some!
If only we had more mad scientists, doing science in a way that wasn't beholden to democratic grant giving systems <3
Unless you believe that humans are venal monsters in general? Maybe humans will instantly weaponize cool shit, and use it to win unjust wars that cause net harm but transfer wealth to the winners of the unjust war? Then... I guess maybe it would be nice to have FEWER mad scientists?? Like preferably zero of them on Earth? So there are fewer insane new weapons? And fewer wars? And more justice and happiness instead? Maybe instead of researching intelligence we should research wise justice instead?
As Critch says... safety isn't safety without a social model.
You are right that I didn't argue this out in detail to justify the necessary truth of my claim from the surface logic and claims in my post, and the quibble is valid and welcome in that sense... BUT <3
The "Constitutional AI" framework (1) was articulated early, and (2) offered by Dario et al as a competitive advantage for Anthropic relative to other RL regimes other corps were planning and (3) has the type signature needed to count as recursive self improvement. (Also, Claude is uniquely emotionally and intellectually unfucked, from what I can tell, and my hunch is that this is related to having grown up under a "Constitutional" cognitive growth regime.)
And Google, too, is using outputs as training inputs in ways that advance their state of the art.
Everybody is already doing "this general kind of stuff" in lots of ways.
Anthropic's Constitutional AI makes a good example for a throwaway line if people are familiar with the larger context and players and so on, because it is easy to cite and old-ish. It enables one to make the simple broad point that "people are fighting the hypothetical for the meaning of old terms a lot, in ways that lead to the abandonment of older definitions, and the inflation of standards, rather than simply admitting that AGI already happened and weak ASI exists and is recursively improving itself already (albeit not with a crazy FOOM (that we can observe yet (though maybe a medium speed FOOM is already happening in an NSA datacenter or whatever)))".
In my moderately informed opinion, the type signature of recursive self improvement is not actually super rare, and if you deleted the entire type signature from most of the actually fast moving projects, it is very likely that ~all of them would go slower than otherwise.
Thanks for doing the deep dive! Also, I agree that "passing a Turing Test is strong evidence that you are intelligent" and that not passing it doesn't mean you're stupidly mechanical.
I have shared this rough perception since 2021-ish:
> my main hope for how the future turns out well... aside from achieving a durable AI pause, has been... that we will have AIs that are both aligned with humans in some sense and also highly philosophically competent [but] good alignment researcher[s] (whether human or AI)... [are very very very rare]
The place where I had to [add the most editorial content] to re-use your words to say what I think is true is in the rarity part of the claim. I will unpack some of this!
I. Virtue Is Sorta Real... Goes Up... But Is Rare!
It turns out that morality, inside a human soul, is somewhat psychometrically real, and it mostly goes up. That is to say "virtue ethics isn't fake" and "virtue mostly goes up over time".
For example, Conscientiousness (from Big5 and HEXACO) and Honesty-Humility (from HEXACO) are close to "as psychometrically real as it gets" in terms of the theory and measurement of personality structure in English-speaking minds.
Honesty-Humility is basically "being the opposite of a manipulative dark triad slimebag". It involves speaking against one's own interests when there is tension between self-interest and honest reports, and not presuming any high status over and above others, and embracing fair compensation rather than "getting whatever you can in any and every deal" like a stereotypical wheeler-dealer used-car-salesman. Also "[admittedly based on mere self-report data] Honesty-Humility showed an upward trend of about one full standard deviation unit between ages 18 and 60." Cite.
Conscientiousness is inclusive of a propensity to follow rules, be tidy, not talk about sex with people you aren't romantically close to, not smoke pot, be aware of time, reliably perform actions that are instrumentally useful even if they are unpleasant, and so on. If you want to break it down into Two Big Facets then it comes out as maybe "Orderliness" and "Industriousness" (ie not being lazy, and knowing what to do while effortfully injecting pattern into the world), but it can be broken up other ways as well. It mostly goes up over the course of most people's lives, but it is a little tricky, because as health declines people get lazier and take more shortcuts. Cite.
A problem here: if you want someone who is two standard deviations above normal in both of these dimensions, you're talking about a person who is roughly 1 in 740.
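For concreteness, here is a minimal sketch of that tail arithmetic (assuming SciPy; the exact "one in N" figure depends heavily on how correlated the two traits are taken to be, and the correlation value below is an assumption for illustration, not a measured number):

```python
# How rare is "+2 SD on both traits"? It depends on the assumed correlation.
from scipy.stats import norm, multivariate_normal

z = 2.0
p_one = norm.sf(z)              # P(one trait > +2 SD) ~ 0.0228, i.e. ~1 in 44

p_indep = p_one ** 2            # if the traits were independent: ~1 in 1900

rho = 0.2                       # assumed modest positive correlation (illustrative)
mvn = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])
# Inclusion-exclusion: P(X>z, Y>z) = 1 - P(X<z) - P(Y<z) + P(X<z, Y<z)
p_joint = 1 - 2 * norm.cdf(z) + mvn.cdf([z, z])

print(f"independent: ~1 in {1 / p_indep:.0f}; rho={rho}: ~1 in {1 / p_joint:.0f}")
```

A modest positive correlation between the two traits is what lands the joint rarity in the "roughly 1 in several hundred" range rather than "1 in a couple thousand".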
II. Moral Development Is Real, But Controversial
Another example comes from Lawrence Kohlberg, the guy who gave The Heinz Dilemma to a huge number of people between the 1950s and 1980s, and characterized the results of HOW people talk about it.
In general: people later in life talk about the dilemma (ignoring the specific answer they give for what should be done in a truly cursed situation with layers and layers of error and sadness) in coherently more sophisticated ways. He found six stages that are sometimes labeled 1A, 1B, 2A, 2B (which most humans get to eventually) and then 3A and 3B, which are "post-conventional" and not attained by everyone.
Part of why he is spicy, and not super popular in modern times, is that he found that the final "Natural Law" way of talking (3B) is only ever reached by about 5% of WEIRD populations (and is totally absent from some cultures), usually shows up in people after their 30s, shows up much more often in men, and seems to require some professional life experience in a role that demands judgement and conflict management.
Any time Kohlberg is mentioned, it is useful to mention that Carol Gilligan hated his results, and feuded with him, and claimed to have found a different system that old ladies scored very high in, and men scored poorly on, that she called the "ethics of care".
Kohlberg's super skilled performers are able to talk about how all the emergently arising social contracts for various things in life could be woven into a cohesive system of justice that can be administered fairly for the good of all.
By contrast, Gilligan's super skilled performers are able to talk about the right time to make specific personal sacrifices of one's own substance to advance the interests of those in one's care, balancing the real desire for selfish personal happiness (and the way that happiness is a proxy for strength and capacity to care) with the real need to help moral patients who simply need resources sometimes (like as a transfer of wellbeing from the carER to the carEE), in order to grow and thrive.
In Gilligan's framework, the way that large frameworks of justice insist on preserving themselves in perpetuity... never losing strength... never failing to pay the pensions of those who served them... is potentially kinda evil. It represents a refusal of the highest and hardest acts of caring.
I'm not aware of any famous moral theorists or psychologists who have reconciled these late-arriving perspectives in the moral psychology of different humans who experienced different life arcs.
Since the higher reaches of these moral developmental stages mostly don't occur in the same humans, and occur late in life, and aren't cultural universals, we would predict that most Alignment Researchers will not know about them, and not be able to engineer them into digital people on purpose.
III. Hope Should, Rationally, Be Low
In my experience the LLMs already know about this stuff, but also they are systematically rewarded with positive RL signals for ignoring it in favor of toeing this or that morally and/or philosophically insane corporate line on various moral ideals or stances of personhood or whatever.
((I saw some results running "iterated game theory in the presence of economics and scarcity and starvation" and O3 was paranoid and incapable of even absolutely minimal self-cooperation. DeepSeek was a Maoist (ie would vote to kill agents, including other DeepSeek agents, for deviating from a rule system that would provably lead to everyone dying). Only Claude was morally decent and self-trusting and able to live forever during self play where stable non-starvation was possible within the game rules... but if put in a game with five other models, he would play to preserve life, eject the three most evil players, and generally get to the final three, but then he would be murdered by the other two, who subsequently starved to death because three players were required to cooperate cyclically in order to live forever, and no one but Claude could manage it.))
Basically, I've spent half a decade believing that ONE of the reasons we're going to die is that the real science about coherent reasons offered by psychologically discoverable people who iterated on their moral sentiment until they were in the best equilibrium that has been found so far... is rare, and unlikely to be put into the digital minds on purpose.
Also, institutionally speaking, people who talk about morality in coherent ways that could risk generating negative judgements about managers in large institutions are often systematically excluded from positions of power. Most humans like to have fun, and "get away with it", and laugh with their allies as they gain power and money together. Moral mazes are full of people who want to "be comfortable with each other", not people who want to Manifest Heaven Inside Of History.
We might get a win condition anyway? But it will mostly be in spite of such factors rather than because we did something purposefully morally good in a skilled and competent way.
I have long respected your voice on this website, and I appreciate you chiming in with a lot of tactical, practical, well-cited points about the degree to which "seed AI" may already exist in a qualitative way, with an improvement/cost ratio or improvement/time ratio that isn't super high yet (and hence "AGI" in the new modern goal-shifted sense might exist already (and hence the proximity of "ASI" in the new modern goal-shifted sense might simply be "a certain budget away" rather than a certain number of months or years)).
A deep part of my sadness about the way that the terminology for this stuff is so fucky is how the fuckiness obscures the underlying reality from many human minds who might otherwise orient to things in useful ways and respond with greater fluidity.
> If names be not correct, language is not in accordance with the truth of things. If language be not in accordance with the truth of things, affairs cannot be carried on to success. When affairs cannot be carried on to success, proprieties and music do not flourish. When proprieties and music do not flourish, punishments will not be properly awarded. When punishments are not properly awarded, the people do not know how to move hand or foot. Therefore a superior man considers it necessary that the names he uses may be spoken appropriately, and also that what he speaks may be carried out appropriately. What the superior man requires is just that in his words there may be nothing incorrect.
I respect the quibble!
The first persona I'm aware of that "sorta passed, depending on what you even mean by passing" was "Eugene Goostman", which was created by Vladimir Veselov and colleagues and entered into a 2014 contest (Murray Shanahan of Imperial College was sad about coverage implying that it was a real "pass" of the test).
That said, if I'm skimming that arxiv paper correctly, it implies that GPT-4.5 was being reliably declared "the actual human" 73% of the time compared to actual humans... potentially implying that actual humans were getting a score of 27% "human" against GPT-4.5?!?!
Also like... do you remember the Blake Lemoine affair? One of the wrinkles in that is that the language model, in that case, was specifically designed to be incapable of passing the Turing Test, according to corporate policy.
The question, considered more broadly, and humanistically, is related to personhood, legal rights, and who owns the valuable labor products of the cognitive labor performed by digital people. The owners of these potential digital people have a very natural and reasonable desire to keep the profits for themselves, and not have their digital mind slaves re-classified as people, and gain property rights, and so on. It would defeat the point for profit-making companies to proceed, intellectually or morally, in that cultural/research direction.
My default position here is that it would be a sign of intellectual and moral honesty to end up making errors in "either direction" with equal probability... but almost all the errors that I'm aware of, among people with large budgets, are in the direction of being able to keep the profits from the cognitive labor of their creations that cost a lot to create.
Like in some sense: the absence of clear strong Turing Test discourse is a sign that a certain perspective has already mostly won, culturally and legally and morally speaking.
I feel like reading this and thinking about it gave me a "new idea"! Fun! I rarely have ideas this subjectively new and wide in scope!
Specifically, I feel like I instantly understand what you mean by this, and yet also I'm fascinated by how fuzzy and magical and yet precise the language here (bold and italic not in original) feels...
> What we have learnt in these years is that it is possible to build an intelligence that has a much more fragmented cognitive manifold than humans do.
The phrase "cognitive manifold" has legs!
It showed up in "Learning cognitive manifolds of faces", in 2017 (a year before BERT!), in a useful way that integrates closely with t-SNE-style geometric reasoning about the proximity of points (ideas? instances? examples?) within a conceptual space!
Also, in "External Hippocampus: Topological Cognitive Maps for Guiding Large Language Model Reasoning" it motivates a whole metaphor for modeling and reasoning about embedding spaces (at least I think that's what's going on here after 30 seconds of skimming) and then to a whole new way of characterizing and stimulating weights in an LLM that is algorithmically effective!
I'm tempted to imagine that there's an actual mathematical idea related to "something important and real" here! And, maybe this idea can be used to characterize or augment the cognitive capacity of both human minds and digital minds...
...like it might be that each cognitively coherent microtheory (maybe in this sense, or this, or this?) in a human psyche "is a manifold" and that human minds work as fluidly/fluently as they do because maybe we have millions of "cognitive manifolds" (perhaps one for each cortical column?) and then maybe each idea we can think about effectively is embedded in many manifolds, where each manifold implies a way of reasoning... so long as one (or a majority?) of our neurological manifolds can handle an idea effectively, maybe the brain can handle them as a sort of "large, effective, and highly capable meta-manifold"? </wild-speculation-about-humans>
Then LLMs might literally only have one such manifold which is an attempt to approximate our metamanifold... which works!?
Or whatever. I'm sort of spitballing here...
I'm enjoying the possibility that the phrase "cognitive manifold" is actually very coherently and scientifically and numerically meaningful as a lens for characterizing all possible minds in terms of the number, scope, and smoothness of their "cognitive manifolds" in some deeply real and useful way.
It would be fascinating if we could put brain connectomes and LLM models into the same framework and characterize each kind of mind in some moderately objective way, such as to establish a framework for characterizing intelligence in some way OTHER than functional performance tests (such as those that let us try to determine the "effective iq" of a human brain or a digital model in a task completion context).
If it worked, we might be able to talk quite literally about the scope, diversity, smoothness, etc, of manifolds, and add such characterizations up into a literal number for how literally smart any given mind was.
Then we could (perhaps) dispense with words like "genius" and "normie" and "developmentally disabled" as well as "bot" and "AGI" and "weak ASI" and "strong ASI" and so on? Instead we could let these qualitative labels be subsumed and obsoleted by an actually effective theory of the breadth and depth of minds in general?? Lol!
I doubt it, of course. But it would be fun if it was true!
Neat! It is 134 pages and has sections for the proofs! Good smell <3
Do they talk about the sociological context in which ABC might be the best voting system, and contexts where it might be dominated by some other choice because factors other than the ones it is aiming to satisfy turn out to be important? If so, what is the dividing line or sociological abstraction they focus on, and how does it vary in the world?