NunoSempere

Forecasting Thread: AI Timelines

Here is my own answer.

- It takes as a starting point datscilly's own prediction, i.e., the result of applying Laplace's rule from the Dartmouth conference. This seems like the most straightforward historical base rate / model to use, and on a meta-level I trust datscilly and have worked with him before.
- I then subtract some probability from the beginning and move it towards the end, because I think it's unlikely we'll get human parity in the next five years. In particular, even Daniel Kokotajlo, the most bullish of the other predictors, puts his peak somewhere around 2025.
- I then apply some smoothing.
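For context, the base rate in the first step can be sketched as Laplace's rule of succession: after n years without AGI since the 1956 Dartmouth conference, the probability of AGI in the next year is 1/(n + 2). A minimal sketch (the choice of 1956 as the start and 2021 as "now" follows the thread; this is an illustration, not datscilly's actual model):

```python
# Sketch of Laplace's rule of succession applied to AI timelines.
# After n failures (years without AGI since Dartmouth, 1956), the
# probability of success in the next trial is 1 / (n + 2).

def laplace_next_year(year, start=1956):
    """P(AGI in `year` | no AGI in [start, year))."""
    n = year - start  # years observed without AGI
    return 1 / (n + 2)

def laplace_by(year, start=1956, now=2021):
    """P(AGI at or before `year`), chaining yearly survival probabilities."""
    p_none = 1.0
    for y in range(now, year + 1):
        p_none *= 1 - laplace_next_year(y, start)
    return 1 - p_none

print(round(laplace_next_year(2021), 4))  # → 0.0149
print(round(laplace_by(2100), 3))         # → 0.548
```

Note the slow decay this produces: roughly even odds of AGI by 2100, with a long tail afterwards.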

My resulting distribution looks similar to the current aggregate (something I noticed only after building it).

*Datscilly's prediction*:

*My prediction*:

*The previous aggregate*:

Some things I don't like about the other predictions:

- Not long enough tails. There have been AI winters before; there could be AI winters again. Shit happens.
- Very spiky maxima. I get that specific models can produce sharp predictions, but the question seems hard enough that I'd expect a large amount of model error. I'd also expect predictions which take multiple models into account to do better.
- Not updating on other predictions. Some of the other forecasters seem to have one big idea, rather than multiple uncertainties.

Things that would change my mind:

At the five minute level:

- Getting more information about Daniel Kokotajlo's models. On a meta-level, learning that he is a superforecaster.
- Some specific definitions of "human level".

At the longer-discussion level:

- Object level arguments about AI architectures
- Some information about whether experts believe that current AI methods can lead to AGI.
- Some object-level arguments about Moore's law, i.e., by which year does Moore's law predict we'll have much more computing power than the higher estimates for the human brain?

I'm also uncertain about what probability to assign to AGI after 2100.

I might revisit this as time goes on.

Forecasting Thread: AI Timelines

The location of the bump could be estimated by using Daniel Kokotajlo's answer as the "earliest plausible AGI."

Forecasting Thread: AI Timelines

Is this your inside view, or your "all things considered" forecast? I.e., how do you update, if at all, on other people disagreeing with you?

Forecasting Thread: AI Timelines

Is your P(AGI | no AGI before 2040) really that low?

Forecasting Thread: AI Timelines

That small tail at the end feels really suspicious. I.e., it implies that if we haven't reached AGI by 2080, then we probably won't reach it at all. I feel like this might be an artifact of specifying a small number of bins on Elicit, though.

Forecasting Thread: AI Timelines

That sharp peak feels really suspicious.

Forecasting Thread: AI Timelines

Your prediction has the interesting property that (starting in 2021), you assign more probability to the next n seconds / n years than to any subsequent period of n seconds / n years.

Specifically, I think your distribution assigns too much probability to AGI in the next three months/year/five years; I feel like we do have a bunch of information that points us away from such short timelines. If one takes that into account, one might end up with a bump, maybe like so, where the location of the bump is debatable and the decay afterwards follows Laplace's rule.

Is there an easy way to turn a LW sequence into an epub?

Use the LW GraphQL API (https://www.lesswrong.com/posts/LJiGhpq8w4Badr5KJ/graphql-tutorial-for-lesswrong-and-effective-altruism-forum) to query for the HTML of the posts, and then use something like pandoc to translate said HTML into LaTeX, and then to epub.

Link to the graphQL API

The command needed to get a particular post:

```
{
  post(input: {
    selector: {
      _id: "ZyWyAJbedvEgRT2uF"
    }
  }) {
    result {
      htmlBody
    }
  }
}
```
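The whole pipeline can be sketched in Python. The endpoint URL and post id below are assumptions based on the query above; `pandoc` must be installed separately, and error handling is omitted:

```python
# Sketch: fetch a post's htmlBody from the LW GraphQL API, then shell
# out to pandoc to convert it to epub. Assumes `pandoc` is on the PATH.
import json
import subprocess
import urllib.request

QUERY = """
{
  post(input: {selector: {_id: "ZyWyAJbedvEgRT2uF"}}) {
    result {
      htmlBody
    }
  }
}
"""

def fetch_html(query, url="https://www.lesswrong.com/graphql"):
    """POST the GraphQL query and return the post's HTML body."""
    req = urllib.request.Request(
        url,
        data=json.dumps({"query": query}).encode(),
        headers={"Content-Type": "application/json", "User-Agent": "epub-script"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["data"]["post"]["result"]["htmlBody"]

def html_to_epub(html, out="post.epub", title="LW post"):
    """Pipe the HTML into pandoc and write an epub file."""
    subprocess.run(
        ["pandoc", "-f", "html", "-t", "epub", "--metadata", f"title={title}", "-o", out],
        input=html.encode(),
        check=True,
    )

# Usage (requires network access and pandoc):
# html_to_epub(fetch_html(QUERY))
```

For a whole sequence, you'd loop over the post ids and concatenate the HTML bodies before the pandoc step.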

Aggregating forecasts

Geometric mean of the odds = mean of the evidences

Suppose you have probabilities in odds form, 1:2^a and 1:2^b, corresponding to a and b bits respectively. Then the geometric mean of the odds is 1:sqrt(2^a * 2^b) = 1:2^((a+b)/2), corresponding to (a+b)/2 bits: the midpoint of the evidences.
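A quick numerical sketch of this identity (the 3% and 5% are arbitrary): taking the geometric mean of the odds is exactly averaging the forecasters' log-odds, i.e., their bits of evidence.

```python
import math

def prob_to_odds(p):
    return p / (1 - p)

def odds_to_prob(o):
    return o / (1 + o)

def geo_mean_odds(ps):
    """Aggregate probabilities by the geometric mean of their odds."""
    log_odds = [math.log2(prob_to_odds(p)) for p in ps]  # each forecaster's bits
    mean_bits = sum(log_odds) / len(log_odds)            # midpoint of the evidences
    return odds_to_prob(2 ** mean_bits)

print(geo_mean_odds([0.03, 0.05]))  # lands between 3% and 5%
```

The result sits between the two inputs; the next paragraphs argue that with genuinely independent evidence you'd instead want to add the unique bits, pushing the aggregate below both.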

For some more background on why bits are the natural unit of probability, see for example this Arbital article, or search *Probability Theory: The Logic of Science*. Bits are additive: you can just add or subtract bits as you encounter new evidence, and this is a pretty big "wink wink, nod nod, nudge nudge" as to why they'd be the natural unit.

In any case, if person A has seen a bits of evidence, of which a' are unique, and person B has seen b bits of evidence, of which b' are unique, and they have both seen s' bits of shared evidence, then you'd want to add them, to end up at a' + b' + s', or a + b − s'. So maybe in practice (a+b)/2 = s' + (a'+b')/2 ≈ a' + b' + s' when a' + b' is small (or overestimated, which imho seems to often be the case: people overestimate the importance of their own private information; there is also some literature on this).

This corresponds to the intuition that if someone is at 5%, and someone else is at 3% for totally unrelated reasons, the aggregate should be lower than that. And this would be a justification for Tetlock's extremizing.

Anyways, in practice, you might estimate s' as the historical base rate (to which both you and your forecasters have access), and take a', b' as the deviations from it.
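That recipe can be sketched as follows, with all the numbers made up for illustration: estimate the shared bits s' from the base rate, take each forecaster's deviation from it as their unique evidence, and add everything back together.

```python
import math

def bits(p):
    """Evidence in bits relative to even odds: log2(p / (1 - p))."""
    return math.log2(p / (1 - p))

def prob(b):
    """Convert bits back to a probability."""
    o = 2 ** b
    return o / (1 + o)

def aggregate(forecasts, base_rate):
    """Add the shared bits once, plus each forecaster's unique bits (s' + a' + b')."""
    s = bits(base_rate)                      # shared evidence, estimated as the base rate
    unique = [bits(p) - s for p in forecasts]  # each forecaster's deviation from it
    return prob(s + sum(unique))

# Made-up numbers: base rate 10%, two forecasters at 5% and 3%
# for (assumed) totally unrelated reasons.
print(round(aggregate([0.05, 0.03], base_rate=0.10), 4))  # lower than both inputs
```

Since both forecasters moved below the base rate for independent reasons, the aggregate ends up below either individual forecast, matching the intuition in the 5%-and-3% example above and the case for extremizing.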

One can also construe the Lynyrd Skynyrd song *Simple Man* to be talking about this kind of thing.