One way to measure how capable LLMs are, which is gaining traction and empirical validity, is the following:
Let T(t) be the time it takes a human to complete task t.
Empirically, a given LLM will be able to fully accomplish (without extra human correction) almost all tasks t such that T(t) < K, and will not be able to accomplish, without human help, tasks t such that T(t) > K.
Thus we can measure how good an LLM is by looking at how long a human would need to accomplish the hardest tasks the LLM can do on its own; this time K is the LLM's horizon.
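To make this concrete, here is a minimal sketch (with made-up benchmark data) of how K can be estimated in practice, following the logistic-fit approach of "Measuring AI Ability to Complete Long Tasks":

```python
# Minimal sketch, with made-up benchmark data, of estimating the horizon K.
# Following "Measuring AI Ability to Complete Long Tasks": fit a logistic
# curve of LLM success against log human-completion-time and read off the
# task length at which success probability crosses 50%.
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical benchmark: human completion time T(t) in minutes, and
# whether the LLM fully accomplished the task without human correction.
human_minutes = np.array([2, 5, 10, 15, 30, 60, 120, 240, 480, 960], dtype=float)
llm_succeeded = np.array([1, 1, 1, 1, 1, 1, 0, 1, 0, 0], dtype=float)

def success_prob(log_t, a, b):
    # Logistic model: success probability as a function of log task length.
    return 1.0 / (1.0 + np.exp(-(a + b * log_t)))

(a, b), _ = curve_fit(success_prob, np.log(human_minutes), llm_succeeded, p0=[5.0, -1.0])

# K is the task length where the fitted success probability equals 50%.
K_minutes = np.exp(-a / b)
print(f"Estimated horizon K: about {K_minutes:.0f} human-minutes")
```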
A minimal non-trivial unit of significant improvement to an AI system corresponds to roughly one NeurIPS (or other such conference) paper.
Writing such a paper typically requires on the order of 1 year of human work.
Using a 6-month doubling time for this LLM Moore's Law, the horizon K reaches one year of human work, and thus NeurIPS-paper-writing ability, in roughly 7 years.
Hence in 7 years, i.e. by 2033, it will be possible to create a recursively self-improving AI.
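As a rough check on this arithmetic, here is a sketch under explicitly assumed numbers (the ~2-hour current horizon and ~2000 work-hours per paper are illustrative choices, not figures from the source):

```python
# Sketch of the extrapolation. Assumed, illustrative numbers (not from the
# source): a current horizon of ~2 work-hours and ~2000 work-hours (one
# work-year) to write a NeurIPS paper.
import math

current_horizon_hours = 2.0    # assumed current horizon K
paper_hours = 2000.0           # ~1 year of full-time human work
doubling_time_years = 0.5      # 6-month doubling time

doublings = math.log2(paper_hours / current_horizon_hours)
years = doublings * doubling_time_years
print(f"{doublings:.1f} doublings, i.e. about {years:.1f} years")
# -> ~10 doublings, ~5 years. A smaller starting horizon, a 7-month
#    doubling time, or a stricter reliability bar pushes this toward the
#    ~7-year / 2033 figure used in the text.
```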
Context:
Data: [1]
[Figure from "Measuring AI Ability to Complete Long Tasks"]
Reasoning:
Initial Improvement Rate:
The relevant questions to ask to estimate the initial takeoff rate are the following:
What is the size of the improvement? In the reasoning above, we already set that to the equivalent of one NeurIPS paper.
How much time will it take the AI to produce this improvement? Assuming that the bottleneck is running the experiments, and given that the experiments behind one NeurIPS paper currently take about one week[2], we can expect the initial self-improvement rate to be roughly one NeurIPS paper per week. This is not the hardest of takeoffs, but also not the softest, as we will see in the next paragraph. It also gives us a clue as to the limiting factor for containing ASI: limiting access to compute.
Hard or Soft Takeoff?
Assume that, after reaching self-improvement, compute is not a limit, either because the AI has access to enough compute or because it finds ways to use the compute it has more efficiently, and assume the exponential increase continues: the first week it self-improves to the tune of one NeurIPS paper, the second week two NeurIPS papers, the third week four, and so on. Then, on the 12th or 13th week, it improves to the tune of a whole NeurIPS conference (i.e. roughly 5000 papers).
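The week count follows from simple doubling; here is a quick check (using the ~5000 papers per conference figure from above):

```python
# Sketch of the weekly-doubling arithmetic: week n contributes 2**(n-1)
# NeurIPS papers' worth of improvement (assuming, as in the text, roughly
# 5000 accepted papers per NeurIPS).
papers_per_conference = 5000

week, weekly, cumulative = 1, 1, 1
while cumulative < papers_per_conference:
    week += 1
    weekly *= 2
    cumulative += weekly
print(f"Week {week}: {weekly} papers that week, {cumulative} cumulatively")
# -> Week 13: 4096 papers that week, 8191 cumulatively
```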
This reasoning thus predicts the takeoff speed to be on the order of months rather than decades or minutes.
Source: A paper entitled something along the lines of "BRIDGE: Bridging Reasoning In Difficulty Gap between Entities", which will soon be published at ICLR 2026 but does not yet seem to be publicly available. This paper broadly agrees with the prior paper "Measuring AI Ability to Complete Long Tasks", which calculates a 7-month doubling time instead of 6.
[2] See for example the "Business of building AI" email from https://www.lesswrong.com/posts/5jjk4CDnj9tA7ugxr/openai-email-archives-from-musk-v-altman-and-openai-blog .