We should have been trying hard to retrospectively construct new explanations that would have predicted the observations. Instead we went with the best PREEXISTING explanation that we already had
Yeah, that seems to be a real problem. E.g., it was previously fairly reasonable to believe that "knowing what to do" and "knowing how to do it" were on a spectrum, with any given "what to do" being a "how to do" at a big enough scale; or that there's no fundamental difference between a competent remixing of existing ideas and genuine innovation. I believed that.
LLMs seem to be significant evidence that those models are flawed. There apparently is some qualitative difference between autonomy and instruction-following; between innovation and remixing. But tons of people still seem to think that "LLMs are stochastic parrots" and "LLMs' ability to do any kind of reasoning means they are on a continuous path to human-level AGI" are the only two positions, based on their pre-2022 models of agency.
... based on their pre-2022 models of agency.
This seems like an overly good-faith / mistake-theoretic explanation of the false dichotomy (which is not to say it's never applicable). This is a dialectical social dynamic; each side gains credit with its supporters by using the other side's bad arguments as a foil, conspicuously ignoring the possibility of positions outside the binary.
I’d be interested in reading a good treatment of this conjecture
This is Lenin's theory of revolution. You need to have a party governed by "democratic centralism" (each member must obey every decision of the party as a whole). It needs to wait for a "revolutionary situation" (weakness of elites and discontent of masses), then act decisively to take and keep power. Every successful revolutionary in the 20th century has studied this theory.
It’s not AGI (even in a narrow sense of a human-equivalent software engineer, mathematician, and AI researcher).
However, the question is: how long till it is accelerating AI research more and more noticeably and more and more autonomously?
This mode might start way before AGI (even in the narrow sense of the word I mention above) is achieved…
I agree that you can speed up [the research that is ratcheting up to AGI] like this. I don't agree that it is LIKELY that you speed [the research that is ratcheting up to AGI] up by A LOT (>2x, IDK). I'm not saying almost anything about "how fast OpenAI and other similar guys can do whatever they're currently doing". I don't think that's very relevant, because I think that stuff is pretty unlikely (<4%) to lead to AGI by 2030.
Why not likely? Because:
The actual ideas aren't things that you can just crank out at 10x the rate if you have 10x the spare time. They're the sort of thing that you usually get 0 of in your life, 1 if you're lucky, several if you're an epochal genius. They involve deep context and long struggle. (Probably. Just going off of history of ideas. Some random prompt or o5-big-thinky-gippity-tweakity-150128 could kill everyone. But <4%.)
OpenAI et al. aren't AFAIK working on that stuff. If you work on not-the-thing 3x faster, so what?
I think OpenAI, DeepMind and other “big labs” work on many things. They release some of them.
But many other people work on non-standard things as well, e.g. Sakana AI, Liquid AI, and others. With non-standard approaches, compute might be less crucial, and open reasoning models like the new R1 are only a few months behind the leading released closed-source models, so those smaller orgs can have their research accelerated as well, even if they decide they don't want to use OpenAI/DeepMind/Anthropic models.
I'm not talking about a slightly different model, I'm talking about having ideas. I don't see clearly how you use LLMs, including reasoning models, to have good AGI ideas much faster. I also don't think anyone else does; they're just vaguely thinking there might be a way.
I like talking to LLMs and discussing things (including various ideas some of which might be related to advanced AI or to AI existential safety). And that helps somewhat.
But if I could ask them to implement my ideas end-to-end in prototype code and to conduct computational experiments for me end-to-end, my personal velocity would grow by an order of magnitude, if not more...
And if I could ask them to explore variations and recombinations of my ideas, then...
It's not that people don't have good AGI ideas, it's more that the road from a promising idea to a viable implementation is difficult, and a lot of good ideas remain "hanging in the air" (that is, insufficiently tested for us to know if those promising ideas are actually good).
But if I could ask them to implement my ideas end-to-end in prototype code and to conduct computational experiments for me end-to-end, my personal velocity would grow by an order of magnitude, if not more...
If your ideas are things that are commonly implemented, then sure--but then they are much much less likely to be novel ideas progressing towards AGI. If they are novel ideas, it's much much harder for current LLMs to implement them correctly.
It's not that people don't have good AGI ideas, it's more that the road from a promising idea to a viable implementation is difficult,
No, I think it's the first one.
If they are novel ideas, it's much much harder for current LLMs to implement them correctly.
That's not my experience.
Even with the original GPT-4, asking for some rather non-standard things which I was unable to find online (e.g. immutable transformations of nested dictionaries in Python, so that I could compute JAX gradients with respect to variables located in the nodes of the trees in question) resulted in nice implementations (I had no idea how to even start doing that in Python, I only knew that trying to literally port my Clojure code doing the same thing would be a really bad idea).
Not one-shot, of course, but conversations with several turns have been quite productive in this sense.
And that's where we have decent progress lately, and I expect further rapid progress soon (this year, I think), so I would expect a much better one-shot experience in this sense before long.
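For concreteness, here is a minimal sketch of the kind of setup in question (an illustration with toy parameters, not the actual code from that conversation): JAX treats nested dictionaries as "pytrees", so one can take gradients with respect to values located in the nodes and apply updates immutably with jax.tree_util.tree_map.

```python
# Minimal sketch: gradients with respect to values stored in a nested dict.
# (Illustrative only; the actual code discussed above was more involved.)
import jax
import jax.numpy as jnp

params = {
    "layer1": {"w": jnp.array([[0.1, 0.2], [0.3, 0.4]]), "b": jnp.zeros(2)},
    "layer2": {"w": jnp.array([[0.5], [0.6]]), "b": jnp.zeros(1)},
}

def forward(params, x):
    h = jnp.tanh(x @ params["layer1"]["w"] + params["layer1"]["b"])
    return (h @ params["layer2"]["w"] + params["layer2"]["b"]).sum()

def loss(params, x, y):
    return (forward(params, x) - y) ** 2

x, y = jnp.array([1.0, -1.0]), 0.5

# Gradients come back as a nested dict mirroring the structure of `params`.
grads = jax.grad(loss)(params, x, y)

# An immutable update: build a new tree instead of mutating the old one.
new_params = jax.tree_util.tree_map(lambda p, g: p - 0.01 * g, params, grads)
```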
No, I think it's the first one.
It depends on whether one expects "the solution" to be purist or hybrid.
If the expectation is "purist", then the barrier is high (not insurmountable, I think, but I am a bit reluctant to discuss the technical details in public).
If the expectation is that the solution will be "hybrid" (e.g. powerful LLMs can be used as components, and one can have read-write access to the internals of those LLMs), then it should be very doable (we have not even started to really "unhobble" those LLMs, reasoning models notwithstanding). Quite a bit of non-triviality still remains even under a "hybrid" approach, but the remaining gap is much smaller than for "purist" attempts. It's obvious that LLMs are more than "half-way there", even if it might be true that "something fundamental is still missing".
Even with the original GPT-4, asking for some rather non-standard things which I was unable to find online (e.g. immutable transformations of nested dictionaries in Python, so that I could compute JAX gradients with respect to variables located in the nodes of the trees in question) resulted in nice implementations (I had no idea how to even start doing that in Python, I only knew that trying to literally port my Clojure code doing the same thing would be a really bad idea).
Ok cool, but this is just not the sort of thing we're supposed to be talking about. We're supposed to be talking about novel ideas about minds / intelligence, and then implementing them in a way that really tests the hypothesis.
It's obvious that LLMs are more than "half-way there",
Go ahead and make an argument then.
We're supposed to be talking about novel ideas about minds / intelligence, and then implementing them in a way that really tests the hypothesis.
No, just about how to actually make non-saturating recursive self-improvement ("intelligence explosion").
Well, with the added constraint of not killing everyone or almost everyone, and not making almost everyone miserable either...
(Speaking of which, I noticed recently that some people's attempts at recursive self-improvement now take longer to saturate than before. And, in fact, they are taking long enough that people are sometimes publishing before pushing them to saturation, so we don't even know what would happen if they were to simply continue pushing a bit harder.)
Now, implementing those things to test those ideas can actually be quite unsafe (that's basically "mini-foom" experiments, and people are not talking enough about safety of those). So before pushing harder in this direction, it would be better to do some preliminary work to reduce risks of such experiments...
Go ahead and make an argument then.
Yes, LLMs mostly have a reliability/steerability problem. I am seeing plenty of "strokes of genius" in LLMs, so the potential is there. They are not "dumb", they have good creativity (in my experience).
We just can't get them to reliably compose, verify, backtrack, and so on to produce overall high-quality work. Their "fuzzy error correction" still works less well than human "fuzzy error correction", at least on some of the relevant scales. So they eventually accumulate too many errors on long-horizon tasks and don't self-correct enough.
This sounds to me more like a "character upbringing problem" than a purely technical problem...
That's especially obvious when one reads reasoning traces of reasoning models...
What I see there sounds to me as if their "orientation" is still wrong (those models seem to be thinking about how to satisfy their user or their maker, and not about how to "do the right thing", whereas a good human solving a math problem ignores the aspect of satisfying their teacher or their parent, and just tries to do "an objectively good job", and that's where LLMs are still falling short)...
I am seeing plenty of "strokes of genius" in LLMs, so the potential is there. They are not "dumb", they have good creativity (in my experience).
Examples? What makes you think they are strokes of genius (as opposed to the thing already being in the training data, or being actually easy)?
I don't know. My experience talking with GPT-4 and such started with asking it to analyze 200 lines of non-standard code with comments stripped out. It correctly figured out that I was using nested dictionaries to represent vector-like objects, and that this was an implementation of a non-standard, unusually flexible (but slow) neural machine.
This was obviously a case of "true understanding" (and it was quite difficult to reproduce: as the models evolved, the ability to analyze this code well was lost, then eventually regained in better models; those better models eventually figured out even more non-trivial things about that non-standard implementation, e.g. at some point newer models started to notice on their own that that particular neural machine was inherently self-modifying; anyway, a very obvious evolution from inept pattern matching to good understanding, with some setbacks during the evolution of models, but eventually with good progress towards better and better performance).
Then I asked it to creatively modify and remix some Shadertoy shaders, and it did a very good job (even more so if one considers that that model was visually blind and unable to see the animations produced by its shaders). Nothing too difficult, but things like taking a function from one of the shaders and adding a call to this function from another shader, with impressive visual effects... For all its simplicity, it was more than would have occurred to me if I had been trying to do this manually...
But when I tried to manually iterate these steps to obtain "evolution of interesting shaders", I got a rapid saturation, not an unlimited interesting evolution...
So, not bad at all (I occasionally do rather creative things, but it is always an effort, so on the occasions when I am unable to successfully make this kind of effort, I start to feel that the model might be more creative than "me in my usual mode", although I don't know if these models are already competitive with "me in my peak mode").
When they first introduced Code Interpreter, I asked it to solve a math competition problem, and it did a nice job. Then I asked it what I would do if I wanted to solve the problem using only pen and paper with limited precision (it was a problem with a huge answer), and it told me to take logarithms and demonstrated how.
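(To give a sense of the logarithm trick, since the original problem isn't reproduced here, here is a made-up stand-in: when the answer is astronomically large, you work with its base-10 logarithm, which stays within pen-and-paper precision.)

```python
# Made-up stand-in for the "huge answer" situation (not the original problem):
# estimate the size of 100! without ever computing it directly.
# log10(100!) = log10(2) + log10(3) + ... + log10(100) is a manageable sum;
# its integer part (+1) gives the digit count, and the fractional part
# recovers the leading digits.
import math

log10_fact = sum(math.log10(k) for k in range(2, 101))
digits = math.floor(log10_fact) + 1                    # 158 digits
leading = 10 ** (log10_fact - math.floor(log10_fact))  # ~9.33, so 100! ≈ 9.33e157
print(digits, round(leading, 2))
```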
That immutable tree processing I mentioned was good in this sense: very elegant, and it taught me some Python tricks I had not known.
Then, when reasoning models were first introduced, I asked about a linear algebra problem (which I could not solve myself, but people doing math competitions could); weaker models could not do it, but o1-preview could one-shot it.
(All these are conversations I am publishing on github, so if one wants to take a closer look, one can.)
Anyway, my impression is that it's not just training data, it's more than that. This is not doable without reasonable understanding, without good intuition.
At the same time, they can be lazy, can be sloppy, and make embarrassing mistakes (which sometimes don't prevent them from proceeding to the right result). But it's reliability that is the problem, not creative capability, which seems to be quite robust (at least on the "medium creativity" setting).
Conjecture: when there is regime change, the default outcome is for a faction to take over—whichever faction is best prepared to seize power by force.
One example: the Iranian Revolution of 1978-1979. In the years leading up to the revolution, there was turmoil and broad hostility towards the Shah across many sectors of the population. These hostilities ultimately combined into an escalating cycle of protest, crackdown, and more protest from more sectors (demonstrations, worker strikes). Finally, popular support for Khomeini as the flag-bearer of the broad-based revolution was enough to get the armed forces to defect, ending the Shah's rule.
From the Britannica article on the aftermath:
On April 1, following overwhelming support in a national referendum, Khomeini declared Iran an Islamic republic. Elements within the clergy promptly moved to exclude their former left-wing, nationalist, and intellectual allies from any positions of power in the new regime, and a return to conservative social values was enforced. The Family Protection Act (1967; significantly amended in 1975), which provided further guarantees and rights to women in marriage, was declared void, and mosque-based revolutionary bands known as komītehs (Persian: “committees”) patrolled the streets enforcing Islamic codes of dress and behaviour and dispatching impromptu justice to perceived enemies of the revolution. Throughout most of 1979 the Revolutionary Guards—then an informal religious militia formed by Khomeini to forestall another CIA-backed coup as in the days of Mosaddegh—engaged in similar activity, aimed at intimidating and repressing political groups not under the control of the ruling Revolutionary Council and its sister Islamic Republican Party, both clerical organizations loyal to Khomeini. The violence and brutality often exceeded that which had taken place under the shah.
(What resulted in the following decades was a brutally repressive theocratic regime, violently corrosive to its region.)
So we have a trajectory that goes like this: broad-based discontent and turmoil; a coalition revolution that topples the old regime; a power vacuum; and then the faction best prepared to seize power by force takes over and represses its former allies.
I'm probably inaccurately oversimplifying the Iranian revolution, because I don't know the history. So this is only a conjecture. Other possible examples:
(I'd be interested in reading a good treatment of this conjecture.)
Large language models were a shock to almost everyone's anticipations. We didn't expect to have AI systems that can talk, do math, program, read, etc. (Or at least, do versions of those activities that are only distinguishable from the real versions if you pay close attention.)
There are two common reactions to this shock:
The first reaction is to deny that there's something that demands a large update. The second reaction is to make a specific update: We see generally intelligent output, so we update that we have AGI. I have argued that there should have been, inter alia, another update:
There is a missing update. We see impressive behavior by LLMs. We rightly update that we've invented a surprisingly generally intelligent thing. But we should also update that this behavior surprisingly turns out to not require as much general intelligence as we thought.
It's pretty weird that LLMs can do what they can do, but so far haven't done anything that's interesting and superhuman and general. We didn't expect that beforehand. Our previous hypotheses are not good.
We should have been trying hard to retrospectively construct new explanations that would have predicted the observations. Instead we went with the best PREEXISTING explanation that we already had. Since "nothing to see here" is, comparatively, a shittier explanation than "AGI ACHIEVED", we go with the latter. Since all our previous hypotheses were not good, we become confident in not-good hypotheses.
Finally, we have the seizing of power. Due to deference and a desire to live in a shared world, the hypothesis that survived the culling takes over.
Some readers will be thinking of Kuhn. But in Kuhn's story, the new paradigm is supposed to better explain things. It's supposed to explain both the old phenomena and also the anomalies that busted the old paradigm.
Here, instead, we have a power vacuum. There are no good explanations, no good alternative paradigms. We have a violent revolution, not a scientific one, in which the hypotheses that get promoted are those whose adherents were best prepared to seize mindshare.