Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

What's a good probability distribution family (e.g. "log-normal") to use for AGI timelines?



My distribution has median 2030ish. That means it peaks a bit before then, and then has a long tapering tail.

I'm fairly confident in this, because I've done lots of thinking and research about it. I have talked to smart experts who disagree and think I could maybe even pass an ITT for them. That said, they could probably pass an ITT for me too.

I think that the best available starting point is the Cotra / OpenPhil report on biological anchors.

Personally I treat this as an upper bound on how much compute it might take, and hold the estimates of when compute will become available quite lightly. Nonetheless, IMO serious discussion of timelines should at least be able to describe specific disagreements with the report and identify which differing assumptions give rise to each.

Log-normal is a good first guess, but I think its tails are too small (at both ends).

Some alternatives to consider:

- Erlang distribution (by when will k Poisson events have happened?), or its generalization, Generalized gamma distribution
- Fréchet distribution (what will be the max of a large number of i.i.d. samples?) or its generalization, Generalized extreme value distribution
- Log-logistic distribution (like log-normal, but heavier-tailed), or its generalization, Singh–Maddala distribution
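To make the "heavier-tailed" comparison concrete, here is a minimal sketch that contrasts a log-normal with a log-logistic sharing the same median and the same spread in log-space. The median of 10 years and the log-space standard deviation of 0.5 are purely illustrative assumptions, not anyone's actual forecast, and the function names are hypothetical.

```python
import math

def lognormal_sf(x: float, median: float, sigma: float) -> float:
    """P(X > x) for a log-normal with the given median and log-space std dev."""
    z = math.log(x / median) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def loglogistic_sf(x: float, median: float, shape: float) -> float:
    """P(X > x) for a log-logistic with the given median and shape parameter."""
    return 1.0 / (1.0 + (x / median) ** shape)

MEDIAN_YEARS = 10.0  # illustrative assumption: median time until AGI
SIGMA = 0.5          # illustrative assumption: log-space std dev of the log-normal
# Choose the log-logistic shape so both have the same std dev in log-space
# (a logistic with scale 1/shape has std dev pi / (shape * sqrt(3))).
SHAPE = math.pi / (SIGMA * math.sqrt(3.0))

for factor in (2, 5, 10):
    x = MEDIAN_YEARS * factor
    print(
        f"P(T > {x:5.0f} yr): "
        f"log-normal {lognormal_sf(x, MEDIAN_YEARS, SIGMA):.2e}, "
        f"log-logistic {loglogistic_sf(x, MEDIAN_YEARS, SHAPE):.2e}"
    )
```

Under these assumptions the two families are similar near the median, but far out (10x the median) the log-logistic assigns roughly a hundred times more probability. Both are symmetric in log-space, so the same holds for the short-timeline tail at one tenth of the median.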

Of course, the best Bayesian forecast you could come up with, derived from multiple causal factors such as hardware and economics in addition to algorithms, would probably score a bit better than any simple closed-form family like this, but I'd guess literally only about 1 to 2 bits better (in terms of log-score).
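For a sense of scale on "1 to 2 bits", here is a small sketch of how a log-score gap is computed. The two probabilities are made-up numbers, and `log_score_gain_bits` is a hypothetical helper name.

```python
import math

def log_score_gain_bits(p_detailed: float, p_simple: float) -> float:
    """Log-score advantage, in bits, of one forecast over another.

    Each argument is the probability (or density) that forecast assigned
    to the outcome that actually happened.
    """
    return math.log2(p_detailed / p_simple)

# Made-up example: the detailed model put 10% on the realized outcome,
# the simple closed-form family put 5% on it.
gain = log_score_gain_bits(0.10, 0.05)
print(f"advantage: {gain:.2f} bits")
```

So a 1-bit advantage corresponds to assigning twice as much probability to what actually happened, and 2 bits to four times as much.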

It looks like a normal distribution around some mean value (2040 ± 10) is typically assumed, but this is wrong, as the probability of AGI is obviously growing exponentially.

So some exponential distribution is needed, in which the mean value is not far from unity. For example: 10 percent by 2030, 50 percent by 2035, and 99 percent by 2040.

This seems very wrong to me:

- What do you mean by "probability is growing exponentially"? Cumulative probabilities cannot grow exponentially forever, since they are bounded by [0,1].
- It seems salvageable if what you mean by "probability" is (dP/dt)/P, where P is cumulative probability.
- Can you clarify what gears-level model leads to exponential growth in "probability"?
- Even if you pick a gears-level model of AGI probability which resembles an exponential within the domain [0, 0.99], you should have a lot of uncertainty over the growth rate. Thus you


OK, at the gears level I have the following model: imagine that Moore's law holds and that AI risk is proportional to the amount of available compute, via some unknown coefficient. In that model, once the cumulative risk reaches 50 per cent, 99 per cent is very near.
But we need some normalisation to avoid probabilities above one, for example by using an odds ratio.
Also, cumulative risk is exponential even for a constant probability density. If the probability density is exponential, the cumulative risk is double exponential.
It all means that if AI risk is some sizeable figure, 10 per cent or 50 per cent, there are only a few years until an almost certain end. In other words, there is no difference between the two claims "10 per cent chance of AI by 2040" and "90 per cent chance of AI by 2040", as both mean that the end will come in the 2040s.
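One way to put numbers on this model is to read "risk proportional to compute" as a hazard rate that doubles along with compute. That reading, the 2-year doubling time, and the function name below are all assumptions made for illustration; the unknown coefficient cancels out of the calculation.

```python
import math

def doublings_between(p_low: float, p_high: float) -> float:
    """Number of compute doublings for cumulative risk to rise from p_low to p_high.

    Assumes hazard h(t) = c * 2**(t / T). Integrating from the distant past,
    the cumulative hazard H(t) = c * T / ln(2) * 2**(t / T) also doubles
    every T years, and cumulative risk is P(t) = 1 - exp(-H(t)).
    The coefficient c cancels in the ratio of cumulative hazards.
    """
    h_low = -math.log(1.0 - p_low)    # cumulative hazard at which P = p_low
    h_high = -math.log(1.0 - p_high)  # cumulative hazard at which P = p_high
    return math.log2(h_high / h_low)

DOUBLING_TIME_YEARS = 2.0  # illustrative assumption for Moore's law

for p_low, p_high in [(0.10, 0.50), (0.50, 0.99)]:
    n = doublings_between(p_low, p_high)
    print(
        f"{p_low:.0%} -> {p_high:.0%}: {n:.2f} doublings, "
        f"about {n * DOUBLING_TIME_YEARS:.1f} years"
    )
```

Under these assumptions, going from 10 per cent to 50 per cent takes about 2.7 doublings, and going from 50 per cent to 99 per cent takes about 2.7 more, so each step is roughly five and a half years at a 2-year doubling time. The gap scales directly with the assumed doubling time, which is where uncertainty over the growth rate enters.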