TLDR:

  1. Around Einstein-level, relatively small changes in intelligence can lead to large changes in what one is capable of accomplishing.
    1. E.g. Einstein was a bit better than the other best physicists at seeing deep connections and reasoning, but was able to accomplish much more in terms of impressive scientific output.
  2. There are architectures where small changes can have significant effects on intelligence.
    1. E.g. small changes in human-brain hyperparameters: Einstein’s brain didn’t need to be trained on 3x the compute of normal physics professors for him to become much better at forming deep understanding, even without intelligence improving intelligence.

Einstein and the heavytail of human intelligence

1905 is often described as the "annus mirabilis" of Albert Einstein. He founded quantum physics by postulating the existence of (light) quanta, explained Brownian motion, introduced the theory of special relativity, and derived E=mc² from it. All of this. In one year. While holding a full-time job at the Swiss patent office.

With the exception of John von Neumann, we’d say those discoveries alone seem like more than any other scientist of the 20th century achieved in their lifetime (though it's debatable).

Though perhaps even more impressive is that Einstein was able to derive general relativity.

Einstein was often so far ahead of his time that even years after he published his theories, the majority of physicists rejected them because they couldn’t understand them, sometimes even though there was experimental evidence favoring Einstein's theories. After solving the greatest open physics problems of the time in 1905, he continued working in the patent office until 1908, since the universities were too slow on the uptake to hire him earlier.

Example of how far ahead of his time Einstein was: Deriving the theory of light quanta

The following section is based on parts of the 8th chapter of “Surfaces and Essences” by Douglas Hofstadter. For an analysis of some of Einstein's discoveries that shows how far ahead of his time he was, I can recommend reading it.

At the time, one of the biggest problems in physics was the “Blackbody spectrum”, which describes the spectrum of electromagnetic wavelengths emitted by a blackbody. The problem was that the emitted spectrum was not explainable by known physics. Einstein achieved a breakthrough by considering light not just as a wave, but also as light quanta. Although this idea sufficiently explained the Blackbody spectrum, physicists (at least almost) unanimously rejected it. The fight between the “light is corpuscles” and “light is a wave” factions had been decided a century earlier, with a clear victory for the “wave” faction.

Aware of these possible doubts, Einstein proposed three experiments to prove his idea, one of which was the photoelectric effect. In the following years, Robert Millikan carried out various experiments on the photoelectric effect, all of which confirmed Einstein’s predictions. Still, Millikan insisted that the light-quanta theory had no theoretical basis and even falsely claimed that Einstein himself no longer believed in his idea.

From Surfaces and Essences (p.611): 

To add insult to injury, although the 1921 Nobel Prize in Physics was awarded to Albert Einstein, it was not for his theory of light quanta but “for his discovery of the law of the photoelectric effect”. Weirdly, in the citation there was no mention of the ideas behind that law, since no one on the Nobel Committee (or in all of physics) believed in them! [1][...] And thus Albert Einstein’s revolutionary ideas on the nature of light, that most fundamental and all-pervading of natural phenomena, were not what won him the only Nobel Prize that he would ever receive; instead, it was just his little equation concerning the infinitely less significant photoelectric effect. It’s as if the highly discriminating Guide Michelin, in awarding its tiptop rank of three stars to Albert’s Auberge, had systematically ignored its chef’s consistently marvelous five-course meals and had cited merely the fact that the Auberge serves very fine coffee afterwards.

Concluding thoughts on Einstein

Einstein was able to reason through very complex arguments he constructed via thought experiments without making a mistake. He was able to generalize extremely well from other physics discoveries, to get a sense of the underlying nature of physical law. I believe that what enabled Einstein to make key discoveries much faster than the whole remaining field of theoretical physics combined (which itself contained many of the smartest people at the time) was that he was smarter in some dimensions of intelligence than all other 20th century scientists (rather than him just being born with good physics-particular intuitions).[2][3]

Takeaways

  1. Capabilities are likely to cascade once you get to Einstein-level intelligence, not just because an AI will likely be able to form a good understanding of how it works and use this to optimize itself to become smarter[4][5], but also because it empirically seems to be the case that when you’re slightly better than all other humans at stuff like seeing deep connections between phenomena, this can enable you to solve hard tasks like particular research problems much much faster (as the example of Einstein suggests).
    1. Aka: Around Einstein-level, relatively small changes in intelligence can lead to large changes in what one is capable of accomplishing.
  2. For human brains, small changes in hyperparameters can lead to very significant increases in intelligence.[6] Intuitively, one would suspect that scaling up training compute by 2x is a significantly larger change than having a +6.4std hyperparameter sample instead of a +5.4std one (see the sketch after this list for how rare such samples are), even though it is not obvious to me that 2x training compute would get you from "great physics professor" to "Einstein" if we had transformer architectures. So either there is some grokking cascade around genius-level intelligence where capabilities can quickly be learned and improved, or it's just that (human) brains scale significantly faster in performance than transformers currently seem to.
    1. Aka: For at least some architectures, around genius-level, small changes in hyperparameters (or perhaps also compute) can lead to relatively large changes in intelligence.
  3. Compute-based AI capability forecasting is unlikely to work well, since it entirely neglects that large capability gaps, like the one between Einstein and average humans, can arise without correspondingly large differences in compute.
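To make the “relatively small changes” in takeaway 2 concrete, here is a rough back-of-the-envelope sketch (my addition, using assumed round numbers: a single scalar intelligence dimension, one i.i.d. standard-normal draw per person, and a world population of 8 billion). It asks where the very top of such a distribution sits and how rare +5.4std and +6.4std samples are.

```python
# Toy order-of-magnitude sketch; the normality assumption, the single scalar
# "intelligence" dimension, and the population size are all simplifications.
from scipy.stats import norm

N = 8_000_000_000  # assumed world population

# Approximate location of the k-th highest of N draws: the quantile whose
# upper-tail probability is k/N.
for k in (1, 10, 100, 1000):
    z = norm.isf(k / N)  # inverse survival function
    print(f"~{k}-th highest draw sits near {z:+.2f} std")

# Expected number of people above a given threshold.
for z in (5.4, 6.4):
    print(f"expected count above {z:+.1f} std: {N * norm.sf(z):.1f}")
```

Under this toy model the single highest draw lands around +6.3std and the hundredth-highest around +5.6std, so the very top person is expected to be well under one standard deviation above the next hundred or so; roughly a couple hundred people exceed +5.4std, while in expectation less than one person exceeds +6.4std. That is the sense in which a +6.4std rather than a +5.4std hyperparameter sample is a “small” change in the input.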

Requests to AI researchers

Nobody currently knows how to align strongly superhumanly smart AIs to human interests, and we need way more time to solve this problem. Making incremental progress on AI capabilities is shortening the timeline we have left to figure out how to align AI and is thus making human extinction more likely. Thus by far the best action is to stop advancing AI capabilities.

Absent this, please be aware that capabilities might rapidly cascade around genius or supergenius level intelligence and take measures accordingly. In particular:

  1. Monitor how quickly performance of an AI is improving in training.
  2. When capability is performing unusually quickly: stop and audit.
    1. Do not ignore warning signs. If warning signs show up, stop training and coordinate with governments and other AI labs to get more time to solve the alignment problem.
    2. If the audit is fine, scale up slowly and continue to carefully audit unusual training dynamics.
  3. Generally perform regular and precise safety audits while scaling up.
  4. Be especially careful when scaling up new architectures or training setups. There likely exist architectures which scale much faster than transformers and might reach superhuman intelligence without needing nearly as much compute as the current best models.
  1. ^

    I am not confident that the doubts about Einstein's light-quanta theory in 1921 were as big as portrayed here. Still: Millikan's work, in which he wrote the above-mentioned false claims, was published in 1917, so it's reasonable that 4 years later there was still some confusion. Though the doubts (at least mostly) ended in 1923 with the discovery of the Compton effect.

  2. ^

    The fact that human intelligence is very heavy-tailed can also be observed in other examples like e.g. John von Neumann.

  3. ^

    One natural hypothesis that could explain large changes in capability from small changes in hyperparameters is that the small changes enabled the agent to make itself smarter (and then smarter again, though with the improvements getting smaller so it's below the threshold where it fooms). But this does NOT seem to be the driving factor which made Einstein able to accomplish so much more. Thus this post is warning about other kinds of capability cascades which seem to exist.

  4. ^

    We think intelligence improving intelligence is an important part of why we at some point expect a fast takeoff (though until then capabilities might continue to improve continuously for quite a while). This post shows that there is empirical evidence suggesting that rapid capability gain might happen even without intelligence improving intelligence. Though it is plausible to us that intelligence improving intelligence is the more important factor, and at least for AIs significantly smarter than Einstein this seems likely.

  5. ^

    It seems plausible that AIs will be able to significantly improve themselves or speed up AI research before they are fully as smart as Einstein in all dimensions.

  6. ^

    The following seems plausible (but by no means close to certain): “The base architecture of the human brain is very capable, as capable as Einstein was or even more, but evolution didn’t figure out how to align humans, who are very smart in some dimensions, to optimize well for genetic fitness. Thus, people who were e.g. extraordinarily reflective had fewer kids in the ancestral environment, so most people today have some alignment-patches, which evolution designed into them, which nerf their intelligence (in particular dimensions). Part of the explanation for why Einstein was so smart was that he had unusually few alignment-patches that nerfed his brain. So the existence of Einstein isn’t strong evidence that some hyperparameter changes can lead to very rapid capability increases if the base architecture isn’t nerfed and is actually already more capable.” This might be true, but I still find it very surprising under this hypothesis that Einstein (and John von Neumann) was so much smarter than many of the next runners-up, who also had few alignment-patches. The point that seemingly small increases in some dimensions of intelligence at Einstein level can have huge effects on capability still carries.


That's an interesting argument. However, something similar to your hypothetical explanation in footnote 6 suggests the following hypothesis: Most humans aren't optimized by evolution to be good at abstract physics reasoning, while they easily could have been, with evolutionarily small changes in hyperparameters. After all, Einstein wasn't too dissimilar in training/inference compute and architecture from the rest of us. This explanation seems somewhat plausible, since highly abstract reasoning ability perhaps wasn't very useful for most of human history.

(An argument in a similar direction is the existence of savant syndrome, which implies that quite small differences in brain hyperparameters can lead to strongly increased narrow capabilities of some form that likely weren't useful in the ancestral environment, which explains why humans generally don't have them. The Einstein case suggests a similar phenomenon may also exist for more general abstract reasoning.)

If this is right, humans would be analogous to very strong base LLMs with poor instruction tuning, where the instruction tuning (for example) only involved narrow instruction-execution pairs that are more or less directly related to finding food in the wilderness, survival and reproduction. Which would lead to bad performance at many tasks not closely related to fitness, e.g. on Math benchmarks. The point is that a lot of the "raw intelligence" of the base LLM couldn't be accessed just because the model wasn't tuned to be good at diverse abstract tasks, even though it easily could have been, without a big change in architecture or training/inference compute.

But then it seems unlikely that artificial ML models (like LLMs) are or will be unoptimized for highly abstract reasoning in the same way evolution apparently didn't "care" to make us all great at abstract physics and math style thinking. Since AI models are indeed actively optimized in diverse abstract directions. Which would make it unlikely to get a large capability jump (analogous to Einstein or von Neumann) just from tweaking the hyperparameters a bit, since those are probably pretty optimized already.

If this explanation is assumed to be true, it would mean we shouldn't expect sudden large (Einstein-like) capability gains once AI models reach Einstein-like ability.

The (your) alternative explanation is that there is indeed at some point a phase transition at a certain intelligence level, which leads to big gains just from small tweaks in hyperparameters. Perhaps because of something like the "grokking cascade" you mentioned. That would mean Einstein wasn't so good at physics because he happened to be, unlike most humans, "optimized for abstract reasoning", but because he reached an intelligence level where some grokking cascade, or something like that, occurs naturally. Then indeed a similar thing could easily happen for AI at some point.

I'm not sure which explanation is better.

Some evidence in favor of your explanation (being at least a correct partial explanation):

  1. von Neumann apparently envied Einstein's physics intuitions, while Einstein lacked von Neumann's math skills. This seems to suggest that they were "tuned" in slightly different directions.
  2. Neither of the two seems superhumanly accomplished in other areas (that a smart person/agent might have goals for), such as making money, moral/philosophical progress, or changing culture/politics in their preferred direction.

(An alternative explanation for 2 is that they could have been superhuman in other areas but their terminal goals did not chain through instrumental goals in those areas, which in turn raises the question of what those terminal goals must have been for this explanation to be true and what that says about human values.)

I note that under your explanation, someone could surprise the world by tuning a not-particularly-advanced AI for a task nobody previously thought to tune AI for, or by inventing a better tuning method (either general or specialized), thus achieving a large capability jump in one or more domains. Not sure how worrisome this is though.

As it turns out, von Neumann was good at lots of things.

https://qualiacomputing.com/2018/06/21/john-von-neumann/

Von Neumann himself was perpetually interested in many fields unrelated to science. Several years ago his wife gave him a 21-volume Cambridge History set, and she is sure he memorized every name and fact in the books. “He is a major expert on all the royal family trees in Europe,” a friend said once. “He can tell you who fell in love with whom, and why, what obscure cousin this or that czar married, how many illegitimate children he had and so on.” One night during the Princeton days a world-famous expert on Byzantine history came to the Von Neumann house for a party. “Johnny and the professor got into a corner and began discussing some obscure facet,” recalls a friend who was there. “Then an argument arose over a date. Johnny insisted it was this, the professor that. So Johnny said, ‘Let’s get the book.’ They looked it up and Johnny was right. A few weeks later the professor was invited to the Von Neumann house again. He called Mrs. von Neumann and said jokingly, ‘I’ll come if Johnny promises not to discuss Byzantine history. Everybody thinks I am the world’s greatest expert in it and I want them to keep on thinking that.'”

 


According to the same article, he was not such a great driver.

Now, comparing him to another famous figure of his age, Menachem Mendel Schneerson.  Schneerson was legendary for his ability to recall obscure sections of Torah verbatim, and his insightful reasoning (I am speaking lightly here, his impact was incredible). Using the hypothetical that von Neumann and Schneerson had a similar gift (their ability with the written word as a reflection of their general ability), depending on your worldview, Schneerson's talents were not properly put to use in the service of science, or von Neumann's talents were wasted in not becoming a gaon. 

Perhaps, if von Neumann had engaged in Torah instead of science, we could have been spared nuclear weapons and maybe even AI for some time.  Sure, maybe someone else would have done what he did...but who?

Capabilities are likely to cascade once you get to Einstein-level intelligence, not just because an AI will likely be able to form a good understanding of how it works and use this to optimize itself to become smarter[4][5], but also because it empirically seems to be the case that when you’re slightly better than all other humans at stuff like seeing deep connections between phenomena, this can enable you to solve hard tasks like particular research problems much much faster (as the example of Einstein suggests).

  1. Aka: Around Einstein-level, relatively small changes in intelligence can lead to large changes in what one is capable of accomplishing.

OK, but if that were true then there would have been many more Einstein-like breakthroughs since then. More likely is that such low hanging fruit has been plucked and a similar intellect is now well into diminishing returns. That is, given our current technological society and a >50-year history of smart people trying to work on everything, if there are such breakthroughs to be made, then the IQ required is now higher than in Einstein's day.

I think you are misjudging the mental attributes that are conducive to scientific breakthroughs. 

My (not very well informed) understanding is that Einstein was not especially brilliant in terms of raw brainpower (better at math and such than the average person, of course, but not much better than the average physicist). His advantage was instead being able to envision theories that did not occur to other people. What might be described as high creativity rather than high intelligence.

Other attributes conducive to breakthroughs are a willingness to work on high-risk, high-reward problems (much celebrated by granting agencies today, but not actually favoured), a willingness to pursue unfashionable research directions, skepticism of the correctness of established doctrine, and a certain arrogance of thinking they can make a breakthrough, combined with a humility allowing them to discard ideas of theirs that aren't working out. 

So I think the fact that there are more high-IQ researchers today than ever before does not necessarily imply that there is little "low hanging fruit".

Not following - where could the 'low hanging fruit' possibly be hiding? We have plenty of people with the "other attributes conducive to breakthroughs" in our world of 8 billion. The data strongly suggests we are in diminishing returns. What qualities could an AI of Einstein-level intelligence realistically have that would let it make such progress where no person has? It would seem you would need to appeal to other, less well-defined qualities such as 'creativity' and argue that for some reason the AI would have much more of that. But that seems similar to just arguing that it in fact has >Einstein intelligence.

I'm not attempting to speculate on what might be possible for an AI.  I'm saying that there may be much low-hanging fruit potentially accessible to humans, despite there now being many high-IQ researchers. Note that the other attributes I mention are more culturally-influenced than IQ, so it's possible that they are uncommon now despite there being 8 billion people.


I hadn't considered this argument, thanks for sharing it.

It seems to rest on this implicit piece of reasoning: 
(premise 1) If modelling human intelligence as a normal distribution, it's statistically more probable that the most intelligent human will only be so by a small amount.
(premise 2) One of the plausibly most intelligent humans was capable of doing much better than other highly intelligent humans in their field. 
(conclusion) It's probable that past some threshold, small increases in intelligence lead to great increases in output quality.

It's ambiguous what 'intelligence' refers to here if we decouple that word from the quality of insight one is capable of. Here's a way of re-framing this conclusion to make it more quantifiable/discussable: "Past some threshold, as a system's quality of insight increases, the optimization required (for evolution or a training process) to select for a system capable of greater insight decreases".

The level this becomes true at would need to be higher than any AI's so far, otherwise we would observe training processes easily optimizing these systems into superintelligences instead of loss curves stabilizing at some point above 0. 

I feel uncertain whether there are conceptual reasons (priors) for this conclusion being true or untrue.

I'm also not confident that human intelligence is normally distributed in the upper limits, because I don't expect there are known strong theoretical reasons to believe this.

Overall it seems to have at least a two-digit probability, given the plausibility of the premises.

At the time, one of the biggest problems in physics was the “Blackbody spectrum”, which describes the spectrum of electromagnetic wavelengths emitted by a blackbody. The problem was that the emitted spectrum was not explainable by known physics. Einstein achieved a breakthrough by considering light not just as a wave, but also as light quanta. Although this idea sufficiently explained the Blackbody spectrum, physicists (at least almost) unanimously rejected it. The fight between the “light is corpuscles” and “light is a wave” factions had been decided a century earlier, with a clear victory for the “wave” faction.

 

I thought blackbody radiation was Planck, not Einstein.

(I think) Planck found the formula that matched the empirically observed distribution, but had no explanation for why it should hold. Einstein found the justification for this formula.

I lean towards agreeing with the takeaway; I made a similar argument here and would still bet on the slope being very steep inside the human intelligence level. 

Einstein achieved a breakthrough by considering light not just as a wave, but also as light quanta. Although this idea sufficiently explained the Blackbody spectrum, physicists (at least almost) unanimously rejected it.

IIRC Planck had introduced quantized energy levels of light before Einstein. However, unlike Einstein he didn't take his method seriously enough to recognize that he had discovered a new paradigm of physics.


When capability is performing unusually quickly

Assuming you meant "capability is improving." I expect capability will always feel like it's improving slowly in an AI researcher's own work, though... :-/ I'm sure you're aware that many commenters have suggested this as an explanation for why AI researchers seem less concerned than outsiders.

Nobody currently knows how to align strongly superhumanly smart AIs to human interests, and we need way more time to solve this problem. Making incremental progress on AI capabilities is shortening the timeline we have left to figure out how to align AI and is thus making human extinction more likely. Thus by far the best action is to stop advancing AI capabilities.

It seems that not much research is done into studying invariant properties of rapidly self-modifying ecosystems. At least, when I did some search and also asked here a few months ago, not much came up: https://www.lesswrong.com/posts/sDapsTwvcDvoHe7ga/what-is-known-about-invariants-in-self-modifying-systems.

It's not possible to have a handle on the dynamics of rapidly self-modifying ecosystems without better understanding how to think about properties conserved during self-modification. And ecosystems with rapidly increasing capabilities will be strongly self-modifying.

However, any progress in this direction is likely to be dual-use. Knowing how to think about self-modification invariants is very important for AI existential safety and is also likely to be a strong capability booster.

This is a very typical conundrum for AI existential safety. We can try to push harder to make sure that the research into invariant properties of self-modifying (eco)systems is an active research area again, but the likely side-effect of better understanding properties of potentially fooming systems is making it easier to bring these systems into existence. And we don't have good understanding of proper ways to handle this kind of situations (although the topic of dual-use is discussed here from time to time).

I think research on what you propose should definitely not be public and I'd recommend against publicly trying to push this alignment agenda.

I think this is a good description of the problem. The fact that Einstein's brain had a similar amount of compute and data, a similar overall architecture, and a similar fundamental learning algorithm to other human brains means that a brain-like algorithm can substantially improve in capability without big changes to these things. How similar to the brain's learning algorithm does an ML algorithm have to be before we should expect similar effects? That seems unclear to me. I think a lot of people who try to make forecasts about AI progress are greatly underestimating the potential impact of algorithm development, and how the rate of algorithmic progress could be accelerated by large-scale automated searches by sub-AGI models like GPT-5.

Related markets I have on Manifold:

https://manifold.markets/NathanHelmBurger/gpt5-plus-scaffolding-and-inference

https://manifold.markets/NathanHelmBurger/1hour-agi-a-system-capable-of-any-c

A related comment I made on a different post:

https://www.lesswrong.com/posts/sfWPjmfZY4Q5qFC5o/why-i-m-doing-pauseai?commentId=p2avaaRpyqXnMrvWE