My AI Predictions for 2027

by talelore
31st Aug 2025
19 min read
Comments (15, sorted by top scoring)
elifland:

Thanks for writing this up, glad to see the engagement! I've only skimmed and have not run this by any other AI 2027 authors, but a few thoughts on particular sections:

My predictions for AI by the end of 2027

I agree with most but not all of these in the median case, AI 2027 was roughly my 80th percentile aggressiveness prediction at the time.

Edited to add, I feel like I should list the ones that I have <50% on explicitly:

AI still can't tell novel funny jokes, write clever prose, generate great business ideas, invent new in-demand products, or generate important scientific breakthroughs, except by accident.

I disagree re: novel funny jokes, seems plausible that this bar has already been passed. I agree with the rest except maybe clever prose, depending on the operationalization.

LLMs are broadly acknowledged to be plateauing, and there is a broader discussion about what kind of AI will have to replace them.

Disagree but not super confident.

Most breakthroughs in AI are not a result of directly increasing the general intelligence/"IQ" of the model, but instead come from e.g. advances in memory, reasoning, or agency. AI can stay on task much longer than before without supervision, especially for well-specified, simple tasks. This is especially true since AI coding platforms will have gotten better at tool use and at letting AI manually test the thing it's working on. By the end of 2027, AI can beat a wide variety of video games it hasn't played before.

I disagree with the first clause, but I'm not sure what you mean because advances in reasoning and agency seem to me like examples of increases in general intelligence. Especially staying on task for longer without supervision. Are you saying that these reasoning and agency advances will mostly come from scaffolding rather than the underlying model getting smarter? That I disagree with.

There is more public discussion on e.g. Hacker News about AI code rot and the downsides of using AI. People have been burned by relying too much on AI. But I think non-coders running businesses will still be hyped about AI in 2027.

Disagree on the first two sentences.

AI still can't drive a damned car well enough that if I bought a car I wouldn't have to.

I don't follow self-driving stuff much, but this might depend on location? Seems like good self-driving cars are getting rolled out in limited areas at the moment.


As you touch on later in your post, it's plausible that we made a mistake by focusing on 2027 in particular:

But I do worry about what happens in 2028, when everyone realizes none of the doomsday stuff predicted in 2025 actually came true, or even came close. Then the AI alignment project as a whole may risk being taken as seriously as the 2012 apocalypse theory was in 2013. The last thing you want is to be seen as crackpots.

I think this is a very reasonable concern and we probably should have done better in our initial release making our uncertainty about timelines clear (and/or taking the time to rewrite and push back to a later time frame, e.g. once Daniel's median changed to 2028). We are hoping to do better on this in future releases, including via just having scenarios be further out, and perhaps better communicating our timelines distributions.

Also:

Listening to several of the authors discuss the AI 2027 predictions after they were published leads me to believe they don't intuitively believe their own estimates.

What do you mean by this? My guess is that it's related to the communication issues on timelines?

The Takeoff Forecast is Based on Guesswork

Agree.

The Presentation was Misleading

Nothing wrong with guesswork, of course, if it's all you've got! But I would have felt a lot better if the front page of the document had said "AI 2027: Our best guess about what AI progress might look like, formulated by using math to combine our arbitrary intuitions about what might happen."

But instead it claims to be based on "trend extrapolations, wargames, expert feedback, experience at OpenAI, and previous forecasting successes", and links to 193 pages of data/theory/evidence.

They never outright stated it wasn't based on vibes, of course, and if you dig into the document, that's what you find out.

I very much understand this take and understand where you're coming from because it's a complaint I've had regarding some previous timelines/takeoff forecasts. 

Probably some of our disagreement is very tied-in to the object-level disagreements about the usefulness of doing this sort of forecasting; I personally think that although the timelines and takeoff forecasts clearly involved a ton of guesswork, they are still some of the best forecasts out there, and we need to base our timelines and takeoff forecasts on something in the absence of good data.

But still, since we both agree that the forecasts rely on lots of guesswork, even if we disagree on their usefulness, we might be able to have some common ground when discussing whether the presentation was misleading in this respect. I'll share a few thoughts from my perspective below:

  1. I think it's a very tricky problem to communicate that we think that AI 2027 and its associated background research is some of the best stuff out there, but is still relying on tons of guesswork because there's simply not enough empirical data to forecast when AGI will arrive, how fast takeoff will be, and what effects it will have precisely. It's very plausible that we messed up in some ways, including in the direction that you posit.
  2. Keep in mind that we have to optimize for a bunch of different audiences. I'd guess that for each direction (i.e. taking the forecast too seriously vs. not seriously enough), many people came away with conclusions too far in that direction, from my perspective. This also means that some others have advertised our work in a way that seems like overselling to me, though others have IMO undersold it.
  3. As you say, we tried to take care to not overclaim regarding the forecast, in terms of the level of vibes it was based on. We also explicitly disclaimed our uncertainty in several places, e.g. in the expandables "Why our uncertainty increases substantially beyond 2026" and "Our uncertainty continues to increase." as well as "Why is it valuable?" right below the foreword.
  4. Should we have had something stronger in the foreword or otherwise more prominent on the frontpage? Yeah, perhaps; we iterated on the language a bunch to try to make it convey all of (a) that we put quite a lot of work into it, (b) that we think it's state-of-the-art or close on most dimensions and represents substantial intellectual progress, but also (c) giving the right impression about our uncertainty level and (d) not overclaiming regarding the methodology. But we might have messed up these tradeoffs.
    1. You proposed "AI 2027: Our best guess about what AI progress might look like, formulated by using math to combine our arbitrary intuitions about what might happen." This seems pretty reasonable to me except, as you might guess, I take issue with the connotation of "arbitrary". In particular, I think there's reason to trust our intuitions regarding guesswork given that we've put more thinking time into this sort of thing than all but a few people in the world, our guesswork was also sometimes informed by surveys (which were still very non-robust, to be clear, but I think improving upon previous work in terms of connecting surveys to takeoff estimates), and we have a track record to at least some extent. So I agree with "arbitrary" in some sense, in that we can't ground out our intuitions into solid data, but my guess is that it gives the wrong connotation in terms of what weight the guesswork should be given relative to other forms of evidence.
      1. I'd also not emphasize math if we're discussing the scenario as opposed to timelines or takeoff speeds in particular.
  5. My best guess is for the timelines and takeoff forecast, we should have had a stronger disclaimer or otherwise made more clear in the summary that they are based on lots of guesswork. I also agree that the summaries at the top had pretty substantial room for improvement.
    1. I'm curious what you would think of something like this disclaimer in the timelines forecast summary (and a corresponding one in takeoff): Disclaimer: This forecast relies substantially on intuitive judgment, and involves high levels of uncertainty. Unfortunately, we believe that incorporating intuitive judgment is necessary to forecast timelines to highly advanced AIs, since there simply isn’t enough evidence to extrapolate conclusively.
      1. I've been considering adding something like this but haven't quite gotten to it due to various reasons, but potentially I should prioritize it more highly.
    2. We're also working on updates to these models and will aim to do better at communicating in the future! And will take into account suggestions.
    3. I think this might have happened because it's clear to us that we can't make these sorts of forecasts without tons of guesswork, and we didn't have much slack in terms of the time spent thinking about how these supplements would read to others; I perhaps made a similar mistake to one that I have previously criticized others for.
  6. (I had edited to add this paragraph in, but I'm going to actually strike it out for now because I'm not sure I'm doing a good job accurately representing what happened and it seems important to do so precisely, but I'll still leave it up because I don't want to feel like I'm censoring something that I already had in a version of the comment.) Potentially important context is that our median expectation was that AI 2027 would do much worse than it did, so we were mostly spending time trying to increase the expected readership (while of course following other constraints like properly disclaiming uncertainty). I think we potentially should have spent a larger fraction of our time thinking "if this got a ton of readership, then what would happen?" To be clear, we did spend time thinking about this, but I think it might be important context to note that we did not expect AI 2027 to get so many readers, so a lot of our headspace was around increasing readership.

Linking to some other comments I've written that are relevant to this: here, here

talelore:

Thank you for taking the time to write such a detailed response.

My main critique of AI 2027 is not about communication, but the estimates themselves (2027 is an insane median estimate for AI doom) and that I feel you're overconfident about the quality/reliability of the forecasts. (And I am glad that you and Daniel have both backed off a bit from the original 2027 estimate.)

What do you mean by this? My guess is that it's related to the communication issues on timelines?

Probably this is related to communication issues on timelines, yes. Also, I think if I genuinely believed everyone I knew and loved was going to die in ~2 years, I would probably be acting a certain way that I don't sense from the authors of the AI 2027 document. But I don't want to get too much into mind reading.

With respect to the communication issue, I think the AI 2027 document did include enough disclaimers about the authors' uncertainty, and more disclaimers wouldn't help. I think the problem is that the document structurally contradicts those disclaimers, by seeming really academic and precise. Adding disclaimers to the research sections would also not be valuable simply because most people won't get that far.

Including a written scenario is something I can understand why you chose to do, but it also seems like a mistake for the reasons I mentioned in my post. It makes you sound way more confident than we both agree you actually are. And a specific scenario is also more likely to be wrong than a general forecast.

You have said things like:

  • "One reason I'm hesitant to add [disclaimers] is that I think it might update non-rationalists too much toward thinking it's useless, when in fact I think it's pretty informative."
  • "The graphs are the result of an actual model that I think is reasonable to give substantial weight to in one's timelines estimates."
  • "In our initial tweet, Daniel said it was a 'deeply researched' scenario forecast. This still seems accurate to me."
  • "we put quite a lot of work into it"
  • "it's state-of-the-art or close on most dimensions and represents subtantial intellectual progress"
  • "In particular, I think there's reason to trust our intuitions"

As I said in my post, "The whole AI 2027 document just seems so fancy and robust. That's what I don't like. It gives a much more robust appearance than this blog post, does it not? But is it any better? I claim no."

I don't think your guesses are better than mine because of the number of man hours you put into justifying them, nor because the people who worked on the estimates are important, well-regarded people who worked at OpenAI or have a better track record, nor because the estimates involved surveys, wargames, and mathematics.

I do not believe your guesses are particularly informative, nor do I think that about my own guesses. We're all just guessing. Nor do I agree with calling them forecasts at all. I don't think they're reliable enough that anybody should be trusting them over their own intuition. In the end, neither of us can prove what we believe to a high degree of confidence. The only thing that will matter is who's right, and none of the accoutrements of fancy statistics, hours spent researching, past forecasting successes, and so on will matter.

Putting too much work into what are essentially guesses is also in itself a kind of communication that this is Serious Academic Work -- a kind of evidence or proof that people should take very seriously. Which it can't be, since you and I agree that "there's simply not enough empirical data to forecast when AGI will arrive". If that's true, then why all the forecasting?

(All my criticism is about the Timelines/Takeoff Forecasting, since these are things you can't really forecast at this time. I am glad the Compute Forecast exists, and I didn't read the AI Goals and Security Forecasts.)

elifland:

Okay, it sounds like our disagreement basically boils down to the value of the forecasts as well as the value of the scenario format (does that seem right?), which I don't think is something we'll come to agreement on.

Thanks again for writing this up! I hope you're right about timelines being much longer and 2027 being insane (as I mentioned, it's faster than my median has ever been, but I think it's plausible enough to take seriously).

edit: I'd also be curious for you to specify what you mean by academic? The scenario itself seems like a very unusual format for academia. I think it would have seemed more serious academic-y if we had ditched the scenario format.

talelore:

Perhaps we will find some agreement come Christmastime 2027. Until then, thanks for your time!

edit: Responding to your edit, by seeming academic, I meant things like seeming "detailed and evidence-based", "involving citations and footnotes", "involving robust statistics", "resulting in high-confidence conclusions", and stuff like that. Even the typography and multiple authors make it seem Very Serious. I agree that the scenario part seemed less academic than the research pages.

testingthewaters:

I want to register that I'm happy people are putting alternative, less rapid forecasts out there publicly, especially when they go against prevailing forum sentiments. I think this is a good thing :)

talelore:

Thank you!

kavya:

But it's not the kind of thinking that leads to clever jokes, good business ideas, scientific breakthroughs, etc.

 

Could an LLM write a show like Seinfeld? This might actually be my test for whether I accept that LLMs are truly clever. Anyone who's watched it knows that Seinfeld was great for two reasons: (1) Seinfeld broke all the rules of a sitcom. (2) The show follows very relatable interactions between people over the most trivial issues and runs with it for seasons. There is no persistent plot or character arc. You have humans playing themselves. And yet it works.

kavya:

Benchmarks sound like a way to see how well LLMs do when up against reality, but I don't think they really are. Solving SAT problems or Math Olympiad problems only involves deep thinking if you haven't seen millions of math problems of a broadly similar nature.

 

Benchmarks are a best-case analysis of model capabilities. A lot of companies optimize their models to max out benchmarks, but is this inherently bad? If the process is economically valuable and repetitive, I don't care how the LLM gets it done, even if it is memorizing the steps.

kavya:

Nobody has identified a step-by-step process to generate funny jokes, because such a process would probably be exponential in nature and take forever.

 

I tweeted about why I think AI isn't creative a few days ago. It seems like we have similar thoughts. A good idea comes from noticing a connection between ideas and recursively filling in the gaps through verbalizing/interacting with others. The compute for that right now is unjustified. 

StanislavKrym:

wargames, expert feedback, experience at OpenAI, and previous forecasting successes

I strongly suspect that wargames were involved in a different part of the forecast: the part that tries to figure out what would happen once the superhuman coders were invented and stolen. Then both[1] sides of the race would integrate the coders in order to make AI research as fast as possible. Next the sides would race hard, ignoring the need to ensure that the AIs are actually aligned. This neglect would lead to the AIs becoming adversarially misaligned. While the scenario assumes that adversarial misalignment gets discovered, it might also fail to get discovered, in which case the leading company would race all the way to a fiasco.

Expert feedback is the only potential source of estimates related to takeoff speeds after superhuman coders. Trend extrapolation was the method for creating the timelines forecast, which is a little better founded in reality than the takeoff forecast, but it contained mistakes, and the actual time-horizon trend is likely to experience a slowdown instead of being superexponential.

My understanding of timeline-related details

The superexponential trend was likely an illusion caused by the accelerated trend between GPT-4o and o1 (see METR's paper, page 17, figure 11). While o3, released on April 16, continued the trend, the same cannot be said about Grok 4 or GPT-5, released in July and August of 2025. In addition, the failure of Grok 4, unlike GPT-5, COULD have been explained away by xAI's incompetence.

Kokotajlo already claims to have begun working on an AI-2032 branch where the timelines are pushed back, or that "we should have some credence on new breakthroughs e.g. neuralese, online learning, whatever. Maybe like 8%/yr?[2] Of a breakthrough that would lead to superhuman coders within a year or two, after being appropriately scaled up and tinkered with."

 

 

  1. ^

    Here I talk about two sides because OpenBrain and other American companies have the option to unite their efforts or the USG can unite the companies by force. 

  2. ^

    I don't understand why Kokotajlo chose 8%/yr as the estimate. We don't know how easy it will be to integrate neuralese into LLMs. In addition, there is Knight Lee's proposal, and my proposal to augment the models with a neuralese page selector, which keeps the model semi-interpretable since the neuralese part guides the attention of the CoT-generating part to important places. Oh, and stuff like diffusion models, which Google has actually tried.

talelore:

I agree with where you believe the wargames were used.

I think trend extrapolation from previous progress is a very unreliable way to predict progress. I would put more stock into a compelling argument for why progress will be fast/slow, like the one I hope I have provided. But even this is pretty low-confidence compared to actual proof, which nobody has.

In this case, I don't buy extrapolating from past LLM advances because my model is compatible with fast progress up to a point followed by a slowdown, and the competing model isn't right just because it looks like a straight line when you plot it on a graph.

O O:

It seems like basically everything in this is already true today. Not sure what you’re predicting here.

talelore:

Most of my predictions are simply contradictions of the AI 2027 predictions, which are a well-regarded series of predictions for AI progress by the end of 2027. I am stating that I disagree and why.

StanislavKrym:

AI still can't tell novel funny jokes, write clever prose, generate great business ideas, invent new in-demand products, or generate important scientific breakthroughs, except by accident.

Except that JustisMills on June 17 made a post titled "Ok, AI Can Write Pretty Good Fiction Now". 

talelore:

I remember when ChatGPT came out, people were very impressed with how well it could write poetry. Except the poetry was garbage. They just couldn't tell, because they lacked the taste in poetry to know any better. I think the same thing still applies to fiction/prose generated by ChatGPT. It's still not good, but some people can't tell.

To be clear about my predictions, I think "okay"/"acceptable" writing (fiction and nonfiction) will become easier for AI to generate in the next 2 years, but "brilliant"/"clever" will not really.

My AI Predictions for 2027

(Crossposted from my Substack: https://taylorgordonlunt.substack.com/p/my-ai-predictions-for-2027)

I think a lot of blogging is reactive. You read other people's blogs and you're like, no, that's totally wrong. A part of what we want to do with this scenario is say something concrete and detailed enough that people will say no, that's totally wrong, and write their own thing.

--- Scott Alexander

I recently read the AI 2027 predictions.[1] I think they're way off. I was visualizing myself at Christmastime 2027, sipping eggnog and gloating about how right I was, but then I realized it doesn't count if I don't register my prediction publicly, so here it is.

This blog post is more about me trying to register my predictions than trying to convince anyone, but I've also included my justifications below, as well as what I think went wrong with the AI 2027 predictions (assuming anything did go wrong).

My predictions for AI by the end of 2027

  • Nothing particularly scary happens (beyond the kind of hype-driven scariness already present in 2025).
  • AI is still not meaningfully self-improving.
  • People still use the term "superintelligence" to describe something that will happen in the future, not something that is already happening.
  • AI research is not fully automated by AI, and AIs certainly won't be so advanced at AI research that humans can't even follow along.
  • AI will not have meaningful control over the day-to-day operation of companies, AI companies or otherwise.
  • AI does not start out-performing a majority of AI researchers or coders.
  • AI will not substantially speed up software development projects. For example, the AI 2027 prediction that 2025-quality games will be made in a single month by July 2027 is false.
    • I still believe my use of AI is less than a 25% improvement to my own productivity as a programmer, whether using full agentic AI or just chatbots. I still believe people who think AI is better than this are basically just mistaken.
  • AI has not taken a large number of jobs except in a few specific fields. (I am open to more hype-driven job difficulties faced by programmers, but not actual-capabilities-driven job loss for programmers.)
  • Large language models are still very stupid and make basic mistakes a 5-year-old would never make, as is true in 2025. Yet they are increasingly praised in the media for doing well on the SAT, math olympiad, etc., as in 2025.
    • LLMs are broadly acknowledged to be plateauing, and there is a broader discussion about what kind of AI will have to replace them.
    • LLMs still use text-based chain-of-thought, not neuralese.
    • Most breakthroughs in AI are not a result of directly increasing the general intelligence/"IQ" of the model, but instead come from e.g. advances in memory, reasoning, or agency. AI can stay on task much longer than before without supervision, especially for well-specified, simple tasks. This is especially true since AI coding platforms will have gotten better at tool use and at letting AI manually test the thing it's working on. By the end of 2027, AI can beat a wide variety of video games it hasn't played before.
  • AI still can't tell novel funny jokes, write clever prose, generate great business ideas, invent new in-demand products, or generate important scientific breakthroughs, except by accident.
  • There is more public discussion on e.g. Hacker News about AI code rot and the downsides of using AI. People have been burned by relying too much on AI. But I think non-coders running businesses will still be hyped about AI in 2027.
  • Useful humanoid robots do not emerge (other than as demos and so on).
  • AI still can't drive a damned car well enough that if I bought a car I wouldn't have to.
  • AI is not writing good books or well-respected articles, but it has gotten better at mediocrity, and slop articles and comments are becoming a real problem. It's really hard to figure out what percentage of e.g. Reddit comments are currently AI generated, so I can't put a number on this other than to say it becomes noticeably more of a problem.
  • AI girlfriends/boyfriends/companions do explode in popularity (the market at least doubles compared to now). It becomes common for particularly lonely young people to spend time interacting with them. This is driven as much by the loneliness epidemic as by AI progress, and the meaningful improvements that lead to this are in AI voice naturalness and memory, not intelligence.
    • Similarly, LLM-driven NPCs in some AAA games start to show up, but are still not common.
  • None of the sci-fi Hollywood stuff in the AI 2027 predictions come true. AI safety teams do not bring back an older model for safety reasons. There is no government oversight committee that discusses pausing/slowing AI that actually has the power to do that. The president seriously discussing nationalizing AI, CCP spies stealing model weights, plans for attacks on Chinese data centers, humans on the verge of losing control of superhuman AI --- none of this happens.
  • We will still have no idea how to safely align AI.

Justification

Why am I so pessimistic?

The primary reason is that I think LLMs will plateau.[2] I think there are two "complexity classes" of human thinking, and LLMs are only good at one of them.[3] Let's call them deep and shallow thinking.

Shallow vs. Deep Thinking

Shallow thinking is the kind of automatic, easy thinking you do when you have a regular conversation, and you're just blurting out whatever comes to mind[4]. I'm also including effortfully manipulating information within consciousness as shallow thinking. For example, doing arithmetic, however complex, is shallow thinking. Shallow thinking is any thinking that requires simple recall and basic linear/logical information processing.

Deep thinking is what happens when your unconscious mind makes new, useful connections.[5] Humans cannot do it consciously. Doing it unconsciously feels like listening to a "muse". Clever jokes come from deep thinking, as do brilliant ideas. Nobody has identified a step-by-step process to generate funny jokes, because such a process would probably be exponential in nature and take forever. Instead, the unconscious mind does this in some kind of parallel way.

LLMs are terrible at deep thinking. All of the recent gains in AI capabilities have come from making them better and better at shallow thinking.

Noticing Where LLMs Fail

A seasoned mechanic might know exactly what's wrong with your engine based on the clicking noise it makes. If you ask him how he knows, he'll shrug. But he's right. Probably. My own intuition about LLMs comes from working with them a lot on real-world coding problems. Not toy problems. Not benchmarks. Not English essays. You can BS your English teacher, but you can't BS reality. I get to see the situations in which LLMs trip and slam their faces against reality. One minute, they seem smart --- even brilliant --- then the next, they're like a shrimp that knows Python. From working with them, I've gotten a sense for what kinds of problems LLMs fail at. (Yep, my predictions are based on subjective intuition. I can't prove anything. I'm just guessing.)

LLMs obviously do well at problems which closely match the training data. It's definitely easier to solve a problem if you've already memorized the answer! It's not always easy to tell if a problem is novel or not, but for novel problems, it seems to me they struggle most with deep thinking problems. As they have scaled up, they've gotten much better at shallow thinking (including arithmetic), but not much better at deep thinking.

Benchmarks sound like a way to see how well LLMs do when up against reality, but I don't think they really are. Solving SAT problems or Math Olympiad problems only involves deep thinking if you haven't seen millions of math problems of a broadly similar nature. Given that the SATs and the Math Olympiad are designed to be solvable by at least some high school students in a few hours, they probably don't include problems that would require an LLM to do any deep thinking. Recall + shallow thinking would be enough. This is why LLMs have improved more on benchmarks than in reality.

My position is essentially the "LLMs are glorified autocomplete" hypothesis, except asserting that 90% of human cognition is also glorified autocomplete, and you can get pretty far with that kind of thinking actually. But it's not the kind of thinking that leads to clever jokes, good business ideas, scientific breakthroughs, etc.

Why the Architecture of LLMs Makes Them Bad at Deep Thinking: They're Too Wide

GPT-3 is 96 layers deep (where each layer is only a few "operations"), but 49,152 "neurons" wide at the widest. This is an insanely wide, very shallow network. This is for good reasons: wide networks are easier to run efficiently on GPUs, and apparently deep networks are hard to train.

An obvious problem with this architecture is you can't really solve problems that require more than 96 steps (per token). Completing the next token of "23 + 15 =" might be solvable, but completing the next token of "The stopping time for the number 27 in the Collatz conjecture is: " will not be (unless the answer is memorized).
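
To make this concrete, here's a minimal sketch counting how many sequential Collatz steps the number 27 takes to reach 1; the count comes out well above the 96 layers available in a single forward pass:

```python
def collatz_steps(n: int) -> int:
    """Count iterations of the Collatz map until n reaches 1."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

print(collatz_steps(27))  # 111 sequential steps, versus ~96 layers per forward pass
```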

A less obvious problem with this architecture is that the model is so big that instead of forming a coherent, hierarchical world model, it can just cram a bunch of training information into its weights, outputting right answers by regurgitating memorized patterns. LLMs do form generalizations and abstractions, but not to the extent humans do. We humans can't just memorize the entire internet, or all of our sense data. Instead, we have to form a deep understanding of the world by boiling reality down to its most important pieces, and throwing the rest away. A very wide LLM doesn't have to do that as much.

The rules of logic are implemented once in the human brain. LLMs, by contrast, might have hundreds of half-representations of the rules of logic throughout their gargantuan minds. An LLM might have the implicit notion that correlation is not equal to causation inside whatever corner of its mind handles psychology research, then have a different representation of this same concept in the part of its mind responsible for A/B testing software. It doesn't have to learn the fully general concept of correlation vs. causation. This lack of centralization makes forming unique connections between disparate concepts tough, because the links between them that should be there aren't. With LLMs, we were hoping to get one super-genius, but we ended up with 10,000 morons stapled together.

LLMs Are Also Too Linear

A further problem with LLMs is the lack of recurrence. It's a linear network. Information goes through one layer, then the next, then the next. This makes exploring deep combinatorial spaces hard. When humans do deep thinking, our unconscious minds have to explore many paths through idea space, perhaps many in parallel. We have to be able to realize a path isn't bearing fruit and backtrack, trying different paths.

LLMs, on the other hand, are feed-forward networks. Once an LLM decides on a path, it's committed. It can't go back to the previous layer. We run the entire model once to generate a token. Then, when it outputs a token, that token is locked in, and the whole model runs again to generate the subsequent token, with its intermediate states ("working memory") completely wiped. This is not a good architecture for deep thinking.
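
A rough sketch of that generation loop (illustrative only; `forward` and `sample_last_token` are stand-in names, not any real library's API):

```python
# Illustrative only: the autoregressive loop described above, with stand-in
# function names. Each new token requires a fresh pass through every layer,
# and only the growing token sequence carries information forward to the next pass.
def generate(model, prompt_tokens, n_new_tokens):
    tokens = list(prompt_tokens)
    for _ in range(n_new_tokens):
        activations = model.forward(tokens)          # one full pass through all layers
        next_token = sample_last_token(activations)  # a single token is kept...
        tokens.append(next_token)
        # ...and the rest of the intermediate state is not carried over as "memory";
        # the next pass starts again from the token sequence alone.
    return tokens
```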

Chain of thought is a hack that helps to add some artificial reflexivity to an otherwise non-recurrent model. Instead of predicting the next token of the answer, models predict the next token of a sequence of reasoning steps to get the answer. This makes it easier for the model to say, "actually no, that's not right, let me try a different approach," if its current committed path is a bad one. (I have a pet theory that this is what leads LLMs to overuse em dashes. They're a good way for one run of an LLM that disagrees with the previous run to pivot away from what the previous run was trying to say --- or not!)

I'm sure we've all seen LLMs repeatedly change approaches as they flounder on a difficult problem. Chain of thought is better than nothing, but it's a far cry from how humans reason.

Imagine you had to solve a deep problem, but you were forced to pause your thinking every ten seconds. After every ten seconds of thinking, you had to write down one word and then have your memory of the problem wiped, conscious and unconscious. At the beginning of the next ten seconds, all you'd have access to is the words you'd written so far. Every deep exploration into combinatorial space would be cut short and have to begin anew. All your implicit memory about which paths seemed promising and which went nowhere would be wiped, as would any mental abstractions you'd formed along the way. If your mind operated like this, solving deep problems would be a nightmare. Encoding all your unconscious, parallel thinking into a single word is a hopeless endeavor. But this is how LLMs work!

Some people have suggested allowing LLMs to store a larger amount of non-linguistic information between runs ("neuralese recurrence"). The authors of AI 2027 predict this will happen in March 2027, by the way. This would help, but unless your "neuralese" memory is approximately the size of the entire LLM's intermediate state, information is still being lost each run. Better to simply allow the model to have a persistent state across runs, though I'm not sure it's right to call it an LLM anymore at that point. Improving the ability of LLMs to do deep thinking is not a simple matter of scaling up, tweaking the architecture, or adding more out-of-model hacks.

If I'm right about the architecture of LLMs being ill-suited to deep thinking, then it won't be 2 years before we have superintelligent AI. It won't happen at all until we switch architectures. And that won't be simple. Recurrent models are hard to train and more expensive to run because they're hard to parallelize. (And from what I understand, recurrent models themselves currently have similar information-bottleneck issues that would need to be addressed, perhaps along with new hardware that doesn't separate compute from memory.)
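
For intuition on the parallelization point, here's a toy comparison (made-up shapes, purely illustrative): a recurrent update has to walk through time steps one at a time because each hidden state depends on the previous one, while a feed-forward layer can process every position of a sequence in one batched operation.

```python
import numpy as np

rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((8, 8))
xs = rng.standard_normal((16, 8))   # 16 time steps / sequence positions

# Recurrent: inherently sequential; step t cannot start until step t-1 is done
h = np.zeros(8)
for x in xs:
    h = np.tanh(W @ h + x)

# Feed-forward: every position handled at once with a single matrix multiply
out = np.tanh(xs @ W.T)
```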

Maybe I'm wrong. I hope not! Either way, it doesn't change what we should do about AI safety at all. Finding out that the bad guy is going to shoot you in the head in 15 minutes instead of 5 doesn't change your behaviour much. AI is eventually going to be a major problem, even if it won't be in 2027.

What's wrong with AI 2027

In this blog post I mostly wanted to share my own predictions and my reasons behind them, not talk about the AI 2027 predictions. But I wanted to include a section discussing some of the problems I have with them, because I think the problems with AI 2027 speak to common problems people have when reasoning about and predicting the future --- problems that are actually related to the two kinds of thinking I outlined above.

The AI 2027 predictions[6] are based on a few different key forecasts. I think their forecasting of compute increases for AI companies is basically reasonable. Where I think the forecasting goes wrong is in the Timelines Forecast (how long until we get superhuman coders?) and the Takeoff Forecast (how long from superhuman coders until subsequent milestones like artificial superintelligence?).

I will confess I didn't read the entire AI 2027 document, since it's 193 pages of data/theory/evidence, and I don't have time for that. More than that, I am immediately skeptical of a document that requires 193 pages to support its main conclusions. Frankly, 193 pages is something to apologize for, not brag about. It is very easy to bury sloppy methodology inside a giant document, and I think that basically happened here. If you can't explain your argument in a few paragraphs or maybe a few pages, you're probably relying on a giant chain of reasoning, each link of which introduces further uncertainty until the whole thing is worthless. For brevity I'll only go into detail about the Takeoff Forecast.

The Takeoff Forecast is Based on Guesswork

It was all very fancy and involved some Serious Mathematics and Statistics, so it took me a while to even parse the arguments. But it seems to boil down to this: the authors each guesstimate how likely they think something is, then do some math to combine their guesstimates. They do this repeatedly for each step of progress, forming a chain of predictions.

Let's take one of the links in the chain: going from superhuman coders (SC) to superhuman AI researchers (SAR). They operationalize these terms in the document in a specific and robust way.[7]

They estimate it will take 0.3 years to go from SC to SAR. First, they estimate it would take a human coder 3-4 years to do so, then they predict that using SC would speed up AI research and development by 5x. Thereby, they arrive at an estimate of 0.3 years.

Similarly, they say going from SAR to the next step (superintelligent AI researcher, SIAR) would take human coders 19 years, but with the 25x speedup SARs bring, it will only take a few months.

This reasoning is fine. The critical part is the underlying assumptions: Why would it take a human 19 years to get from SAR to SIAR, as opposed to, say, 1 month or 1000 years? And why would SAR give a 25x speedup vs. a 2.5x speedup or a 2500x speedup? These numbers are critical to the whole forecast, so let's find out where they came from.

In the case of SC -> SAR, they break it down into four possibilities:

  1. "The first SC is already an SAR or very close to one." 15%. In this case, they guess SC -> SAR takes 0 years.
  2. "Training the missing SAR skills isn’t more compute intensive than the SC." 25%. In this case, they guess SC -> SAR takes 2 years.
  3. "Training the missing SAR skills would be more compute-intensive than the SC absent further algorithmic progress." 30%. In this case, they guess SC -> SAR takes 5 years.
  4. "Crossing the gap from SC to SAR is a scientific rather than an engineering problem." 30%. In this case, they guess SC -> SAR takes 5 years.

They do math on these percentages and timelines to get an overall estimate: a 15% chance of 0 years, otherwise about 4 years. As long as the percentages and years are correct in the above list of possibilities, then this overall estimate will be correct too. The crucial parts are those percentage and year estimates. Where do they come from?
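
For what it's worth, the arithmetic behind that summary is just a weighted average of the four guesses above (a point-estimate sketch; as I understand it the full forecast combines distributions rather than single numbers, but it lands in roughly the same place):

```python
# Probabilities and human-effort year estimates for the four SC -> SAR cases above
probs = [0.15, 0.25, 0.30, 0.30]
years = [0.0,  2.0,  5.0,  5.0]

expected = sum(p * y for p, y in zip(probs, years))  # ~3.5 years overall
conditional = expected / (1 - probs[0])              # ~4.1 years, conditional on case 1 not holding
print(expected, conditional)
```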

As an example, for possibility #3, this is their reasoning:

We’d guess that this is the sort of limitation that would take years to overcome — but not decades; just look at the past decade of progress e.g. from AlphaGo to EfficientZero. Remember, we are assuming SC is reached in Mar 2027. We think that most possible barriers that would block SAR from being feasible in 2027 would also block SC from being feasible in 2027.

Umm, what happened to all the precise mathematics we had so far? This is just vibes-based guesswork of the same kind I did when I made my own predictions! I thought we were "estimating" and "forecasting", not just going with gut feelings!

They say there is a 30% chance that going from SC to SAR "would be more compute-intensive than the SC". Why? Why not 0.001% or 99.999%?

I suspect it's because when you know practically nothing about a subject that would let you form a high-quality prediction, it would feel weird to make such a specific, confident guess. So instead you pick percentages that amount to an even split across the options you're choosing from, in the same way someone who doesn't know how lotteries and probability work might think there's a 50% chance they'll win the lottery. The fact that the percentages here are all so close to 25% (four options = 25% each) should reveal just how low-confidence and arbitrary they are.

There are different kinds of 50% guesses. There are high-confidence 50% guesses (flipping a coin) and low-confidence 50% guesses (is God real?). I suspect the guesses they are giving here are in the latter category.

In the case where "crossing the gap from SC to SAR is a scientific rather than an engineering problem", why should it take 5 years, and not one month or 1000 years? They justify it only with guesswork.

It took us a while to dig into this one link in the chain, but it turns out that under all that fancy math, it's just intuition-based guesswork.

I Don't Take These Predictions Seriously

The authors do admit to "wide error margins", but their distributions are still centered around ~2027, which I think is an absurd estimate. Saying they're not sure, they're highly uncertain, giving a probability distribution instead of just a median estimate; none of that changes the fact that they're predicting doomsday in ~2027.

I am sure that combining the estimates of several forecasters is more likely to be correct than simply taking one of their guesses at random (duh). But averaging guesses also seems like terribly inaccurate methodology, quite possibly off by many orders of magnitude, especially since the guesses are likely correlated and possibly wrong in the same direction. Basically, this methodology does not convert the "guesswork" into "forecasting". It's still guesswork.
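
To illustrate the correlation point with a toy simulation (made-up numbers, purely illustrative): when forecasters' errors are independent, averaging four of them cuts the typical error roughly in half; when they share a common bias, the improvement is much smaller, because the shared error never averages out.

```python
# Illustrative only: toy Monte Carlo of averaging several forecasters' errors
# (think of these as log-scale errors, i.e. orders of magnitude off).
import random

def mean_abs_error(shared_sd, individual_sd, n_forecasters=4, trials=50_000):
    total = 0.0
    for _ in range(trials):
        shared = random.gauss(0, shared_sd)  # bias common to all forecasters
        avg = sum(shared + random.gauss(0, individual_sd)
                  for _ in range(n_forecasters)) / n_forecasters
        total += abs(avg)
    return total / trials

random.seed(0)
print(mean_abs_error(shared_sd=0.0, individual_sd=1.0))  # independent errors: averaging helps a lot
print(mean_abs_error(shared_sd=1.0, individual_sd=1.0))  # shared bias: averaging helps much less
```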

I felt similarly about all the other links in the chain when I looked into them. The entire chain of reasoning in AI 2027 is a shining tower of statistics and logic based on a foundation of sand. Maybe SC -> SAR takes zero minutes. Maybe it takes centuries or millennia. I do not believe the authors know any better than I do, and I don't think all of the fancy analysis and methodology adds anything when it's built on an extremely low-quality, low-confidence guess.

I especially don't like that it isn't clear to people reading the AI 2027 predictions that they're just guesses, not the detailed forecasts they appear to be. (Except perhaps to the bold few like me who decide to crack open the 193 pages of justification and see what it's based on!)

The Presentation was Misleading

Nothing wrong with guesswork, of course, if it's all you've got! But I would have felt a lot better if the front page of the document had said "AI 2027: Our best guess about what AI progress might look like, formulated by using math to combine our arbitrary intuitions about what might happen."

But instead it claims to be based on "trend extrapolations, wargames, expert feedback, experience at OpenAI, and previous forecasting successes", and links to 193 pages of data/theory/evidence. That makes you think when you open up the 193 pages of research, you'll see specific predictions based on data like, "AI will likely proceed from SC to SAR in 0.3 years because SAR's architecture requires us to build a neuralese memory adapter for agents, and we believe this will take 0.3 years because blah blah blah..." But there's nothing like that. It's all just based on vibes.

They never outright stated it wasn't based on vibes, of course, and if you dig into the document, that's what you find out. They openly admit they "are highly uncertain".

Yet, imagine I told you I was "highly uncertain" about when I was going to die, but estimated I was going to die on Thursday, January 5th, 2073 at three seconds past 4:18 pm. And then I showed you a lognormal graph which peaks sharply in 2073. And then imagine the news reported on my statement as "rigorously researched", and many very intelligent people held the statement in high regard. You might get the impression I was a lot more certain than I am.

The whole AI 2027 document just seems so fancy and robust. That's what I don't like. It gives a much more robust appearance than this blog post, does it not? But is it any better? I claim no. We shall see.

Deep Thinking vs. Shallow Thinking For Making Predictions

I assert that if they'd put more effort into the intuition and less into the post-processing and justification of that intuition, they'd have arrived at more accurate estimates. AI 2027 cites Daniel Kokotajlo's previous 2021 estimates for AI up to 2026. His "lower-effort" predictions from 2021 were quite accurate. Why?

I think it's because he was doing deep, intuitive thinking when he formed those predictions. Ironically, the higher-effort AI 2027 prediction looks a lot less reasonable to me because it leans too heavily on shallow thinking (logic, reasoning, and extrapolation).

Listening to several of the authors discuss the AI 2027 predictions after they were published leads me to believe they don't intuitively believe their own estimates. I won't try to read their minds, but I think it's worth mentioning: Even if you can't spot any flaws in an argument, you shouldn't believe it's convincing if it hasn't actually convinced you.[8]

Shallow thinking can be very useful, but it's a recipe for disaster when broadly predicting the future. Shallow thinking can only take into account a small amount of information at a time, and the present and future are very complex. Once you're doing your reasoning in consciousness, you're limited in the amount of information you can process, and it's easy to form a fragile chain of logic, any piece of which could make the whole chain useless.

The alternative is deep thinking, which will either be amazing or terrible depending simply on how deep and accurate of an intuition you have for the subject. Logic/shallow thinking is then best used as a fact-checking and communication step.

Was AI 2027 a Valuable Exercise?

Many others have said that while they don't really buy the actual predictions made by AI 2027, they think writing a concrete AI timeline scenario was a valuable exercise, and that it's good to "stir up fear about AI so that people will get off of their couches and act".[9]

I am undecided.

Having people believe there is a credible doomsday scenario is possibly good for AI alignment, true. More eyeballs on the AI alignment problem is a good thing, perhaps even if you have to outright lie to get it. If all you have to do is handwave, then even better.

But I do worry about what happens in 2028, when everyone realizes none of the doomsday stuff predicted in 2025 actually came true, or even came close. Then the AI alignment project as a whole may risk being taken as seriously as the 2012 apocalypse theory was in 2013. The last thing you want is to be seen as crackpots.

Conclusion

Predicting the future is hard. I may be totally wrong. If it turns out I'm right and AI 2027 is wrong, I think the takeaway will be that it's a good idea to be skeptical of fancy, academic reasoning.

Science doesn't work because of probability distributions and eigenvalues; it works because you're going out and gathering evidence to find out what's true about reality. All the statistics mumbo-jumbo is actually in the way of finding out the truth, and you should do as little of it as is actually necessary to get value out of your data. In this case, there is no data,[10] a fact which is made unclear by the elaborate, official-seeming presentation and compelling fictional scenario.


  1. AI 2027 predictions by Daniel Kokotajlo, Scott Alexander, Thomas Larsen, Eli Lifland, and Romeo Dean. Published in April 2025. ↩︎

  2. Actually, I think LLMs are already plateauing, but focus on out-of-model (agency) or reasoning progress has covered this up by giving AI extra capabilities without substantially improving the model's actual within-model intelligence. ↩︎

  3. Scott Aaronson points out that if P=NP, the "world would be a profoundly different place than we usually assume it to be. There would be no special value in “creative leaps,” no fundamental gap between solving a problem and recognizing the solution once it’s found. Everyone who could appreciate a symphony would be Mozart; everyone who could follow a step-by-step argument would be Gauss; everyone who could recognize a good investment strategy would be Warren Buffett." This is basically the distinction I am referring to here. ↩︎

  4. Therapy would be one example of conversations that involve more deep thinking. Long pauses are an indication lots of deep thinking is happening. ↩︎

  5. This is not quite the distinction between System 1 and System 2 thinking, from what I understand. Shallow thinking encompasses all System 2 (conscious) thinking and some System 1 thinking, and deep thinking is just the kind of System 1 thinking you do when focusing. The point is that shallow thinking is algorithmically/computationally less demanding. I don't actually know if shallow and deep thinking are fundamentally different, but it doesn't matter for this discussion. It only matters that they're practically different. ↩︎

  6. By the way, Daniel Kokotajlo has adjusted his estimates slightly from 2027 to 2028 since publishing in response to some criticism. I am responding to the original estimates, though the original estimates had large error margins anyway, and I don't think shifting from 2027 to 2028 meaningfully changes any of my criticisms. ↩︎

  7. SC: "An AI system that can do the job of the best human coder on tasks involved in AI research but faster, and cheaply enough to run lots of copies." Specifically: "An AI system for which the company could run with 5% of their compute budget 30x as many agents as they have human researchers, each which is on average accomplishing coding tasks involved in AI research... at 30x the speed... of the company’s top coder." SAR: "An AI system that can do the job of the best human AI researcher but faster, and cheaply enough to run lots of copies..." Specifically: "An AI system that can do the job of the best human AI researcher but 30x faster and with 30x more agents, as defined above in the superhuman coder milestone..." I think these definitions are basically fine even though they assume coders are basically fungible, only differing in the speed it takes to accomplish a task. ↩︎

  8. For an example of someone's intuition conflicting with their conscious analysis, read my new short story Suloki and the Magic Stones. ↩︎

  9. https://garymarcus.substack.com/p/the-ai-2027-scenario-how-realistic ↩︎

  10. Or at least, no good data, at least not for the Timelines/Takeoff sections. ↩︎