No one has ever seen an AGI takeoff, so any attempt to understand it must use these outside view considerations.

—[Redacted for privacy]

What? That’s exactly backwards. If we had lots of experience with past AGI takeoffs, using the outside view to predict the next one would be a lot more effective.

—My reaction

Two years ago I wrote a deep-dive summary of Superforecasting and the associated scientific literature. I learned about the “Outside view” / “Inside view” distinction, and the evidence supporting it. At the time I was excited about the concept and wrote: “...I think we should do our best to imitate these best-practices, and that means using the outside view far more than we would naturally be inclined.”

Now that I have more experience, I think the concept is doing more harm than good in our community. The term is easily abused and its meaning has expanded too much. I recommend we permanently taboo “Outside view,” i.e. stop using the word and use more precise, less confused concepts instead. This post explains why. 

What does “Outside view” mean now?

Over the past two years I’ve noticed people (including myself!) do lots of different things in the name of the Outside View. I’ve compiled the following lists based on fuzzy memory of hundreds of conversations with dozens of people:

Big List O’ Things People Describe As Outside View:

  • Reference class forecasting, the practice of computing a probability of an event by looking at the frequency with which similar events occurred in similar situations. Also called comparison class forecasting. [EDIT: Eliezer rightly points out that sometimes reasoning by analogy is undeservedly called reference class forecasting; reference classes are supposed to be held to a much higher standard, in which your sample size is larger and the analogy is especially tight.]
  • Trend extrapolation, e.g. “AGI implies insane GWP growth; let’s forecast AGI timelines by extrapolating GWP trends.”
  • Foxy aggregation, the practice of using multiple methods to compute an answer and then making your final forecast be some intuition-weighted average of those methods.
  • Bias correction, in others or in oneself, e.g. “There’s a selection effect in our community for people who think AI is a big deal, and one reason to think AI is a big deal is if you have short timelines, so I’m going to bump my timelines estimate longer to correct for this.”
  • Deference to wisdom of the many, e.g. expert surveys, or appeals to the efficient market hypothesis, or to conventional wisdom in some fairly large group of people such as the EA community or Western academia.
  • Anti-weirdness heuristic, e.g. “How sure are we about all this AI stuff? It’s pretty wild, it sounds like science fiction or doomsday cult material.”
  • Priors, e.g. “This sort of thing seems like a really rare, surprising sort of event; I guess I’m saying the prior is low / the outside view says it’s unlikely.” Note that I’ve heard this said even in cases where the prior is not generated by a reference class, but rather from raw intuition.
  • Ajeya’s timelines model (transcript of interview, link to model)
  • … and probably many more I don’t remember

Big List O’ Things People Describe As Inside View:

  • Having a gears-level model, e.g. “Language data contains enough structure to learn human-level general intelligence with the right architecture and training setup; GPT-3 + recent theory papers indicate that this should be possible with X more data and compute…”
  • Having any model at all, e.g. “I model AI progress as a function of compute and clock time, with the probability distribution over how much compute is needed shifting 2 OOMs lower each decade…”
  • Deference to wisdom of the few, e.g. “the people I trust most on this matter seem to think…”
  • Intuition-based-on-detailed-imagining, e.g. “When I imagine scaling up current AI architectures by 12 OOMs, I can see them continuing to get better at various tasks but they still wouldn’t be capable of taking over the world.”
  • Trend extrapolation combined with an argument for why that particular trend is the one to extrapolate, e.g. “Your timelines rely on extrapolating compute trends, but I don’t share your inside view that compute is the main driver of AI progress.”
  • Drawing on subject matter expertise, e.g. “my inside view, based on my experience in computational neuroscience, is that we are only a decade away from being able to replicate the core principles of the brain.”
  • Ajeya’s timelines model (Yes, this is on both lists!)
  • … and probably many more I don’t remember

What did “Outside view” mean originally?

As far as I can tell, it basically meant reference class forecasting. Kaj Sotala tells me the original source of the concept (cited by the Overcoming Bias post that brought it to our community) was this paper. Relevant quote: “The outside view is ... essentially ignores the details of the case at hand, and involves no attempt at detailed forecasting of the future history of the project. Instead, it focuses on the statistics of a class of cases chosen to be similar in relevant respects to the present one.” If you look at the text of Superforecasting, the “it basically means reference class forecasting” interpretation holds up. Also, “Outside view” redirects to “reference class forecasting” in Wikipedia.
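To make the original meaning concrete, here is a minimal sketch of reference class forecasting in Python. The `Case` type, the `base_rate` function, and the project data are all hypothetical, made up for illustration; the point is only that the forecast comes from the frequency of the outcome in a class of similar past cases and uses no details of the case at hand.

```python
from dataclasses import dataclass


@dataclass
class Case:
    name: str
    outcome: bool  # True if the event of interest happened in this past case


def base_rate(reference_class: list[Case]) -> float:
    """Frequency of the outcome among past cases (ignores the present case's details)."""
    if not reference_class:
        raise ValueError("an empty reference class gives no base rate")
    hits = sum(case.outcome for case in reference_class)
    return hits / len(reference_class)


# Hypothetical data: past projects and whether each one finished on schedule.
past_projects = [
    Case("project A", False),
    Case("project B", False),
    Case("project C", True),
    Case("project D", False),
]

forecast = base_rate(past_projects)
print(f"Chance the next project finishes on schedule: {forecast:.0%}")
```

Everything interesting is hidden in the choice of `past_projects`: the arithmetic is trivial, and the forecast is only as good as the claim that the new case really belongs with the old ones.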

To head off an anticipated objection: I am not claiming that there is no underlying pattern to the new, expanded meanings of “outside view” and “inside view.” I even have a few ideas about what the pattern is. For example, priors are sometimes based on reference classes, and even when they are instead based on intuition, that too can be thought of as reference class forecasting in the sense that intuition is often just unconscious, fuzzy pattern-matching, and pattern-matching is arguably a sort of reference class forecasting. And Ajeya’s model can be thought of as inside view relative to e.g. GDP extrapolations, while also outside view relative to e.g. deferring to Dario Amodei.

However, it’s easy to see patterns everywhere if you squint. These lists are still pretty diverse. I could print out all the items on both lists and then mix-and-match to create new lists/distinctions, and I bet I could come up with several at least as principled as this one.

This expansion of meaning is bad

When people use “outside view” or “inside view” without clarifying which of the things on the above lists they mean, I am left ignorant of what exactly they are doing and how well-justified it is. People say “On the outside view, X seems unlikely to me.” I then ask them what they mean, and sometimes it turns out they are using some reference class, complete with a dataset. (Example: Tom Davidson’s four reference classes for TAI). Other times it turns out they are just using the anti-weirdness heuristic. Good thing I asked for elaboration! 

Separately, various people seem to think that the appropriate way to make forecasts is to (1) use some outside-view methods, (2) use some inside-view methods, but only if you feel like you are an expert in the subject, and then (3) do a weighted sum of them all using your intuition to pick the weights. This is not Tetlock’s advice, nor is it the lesson from the forecasting tournaments, especially if we use the nebulous modern definition of “outside view” instead of the original definition. (For my understanding of his advice and those lessons, see this post, part 5. For an entire book written by Yudkowsky on why the aforementioned forecasting method is bogus, see Inadequate Equilibria, especially this chapter. Also, I wish to emphasize that I myself was one of these people, at least sometimes, up until recently when I noticed what I was doing!)

Finally, I think that too often the good epistemic standing of reference class forecasting is illicitly transferred to the other things in the list above. I already gave the example of the anti-weirdness heuristic; my second example will be bias correction: I sometimes see people go “There’s a bias towards X, so in accordance with the outside view I’m going to bump my estimate away from X.” But this is a different sort of bias correction. To see this, notice how they used intuition to decide how much to bump their estimate, and they didn’t consider other biases towards or away from X. The original lesson was that biases could be corrected by using reference classes. Bias correction via intuition may be a valid technique, but it shouldn’t be called the outside view.

I feel like it’s gotten to the point where, like, only 20% of uses of the term “outside view” involve reference classes. It seems to me that “outside view” has become an applause light and a smokescreen for over-reliance on intuition, the anti-weirdness heuristic, deference to crowd wisdom, correcting for biases in a way that is itself a gateway to more bias... 

I considered advocating for a return to the original meaning of “outside view,” i.e. reference class forecasting. But instead I say:

Taboo Outside View; use this list of words instead

I’m not recommending that we stop using reference classes! I love reference classes! I also love trend extrapolation! In fact, for literally every tool on both lists above, I think there are situations where it is appropriate to use that tool. Even the anti-weirdness heuristic.

What I ask is that we stop using the words “outside view” and “inside view.” I encourage everyone to instead be more specific. Here is a big list of more specific words that I’d love to see, along with examples of how to use them:

  • Reference class forecasting
    • “I feel like the best reference classes for AGI make it seem pretty far away in expectation.”
    • “I don’t think there are any good reference classes for AGI, so I think we should use other methods instead.”
  • Analogy
    • Analogy is like a reference class but with lower standards; sample size can be small and the similarities can be weaker.
    • “I’m torn between thinking of AI as a technology vs. as a new intelligent species, but I lean towards the latter.”
  • Trend extrapolation
    • “The GWP trend seems pretty relevant and we have good data on it”
    • “I claim that GPT performance trends are a better guide to AI timelines than compute or GWP or anything else, because they are more directly related.”
  • Foxy aggregation (a.k.a. multiple methods)
    • “OK that model is pretty compelling, but to stay foxy I’m only assigning it 50% weight.”
  • Bias correction
    • “I feel like things generally take longer than people expect, so I’m going to bump my timelines estimate to correct for this. How much? Eh, 2x longer seems good enough for now, but I really should look for data on this.”
  • Deference
    • “I’m deferring to the markets on this one.”
    • “I think we should defer to the people building AI.”
  • Anti-weirdness heuristic
    • “How sure are we about all this AI stuff? The anti-weirdness heuristic is screaming at me here.”
  • Priors
    • “This just seems pretty implausible to me, on priors.”
    • (Ideally, say whether your prior comes from intuition or a reference class or a model. Jia points out “on priors” has similar problems as “on the outside view.”)
  • Independent impression
    • i.e. what your view would be if you weren’t deferring to anyone.
    • “My independent impression is that AGI is super far away, but a lot of people I respect disagree.”
  • “It seems to me that…” 
    • i.e. what your view would be if you weren’t deferring to anyone or trying to correct for your own biases.
    • “It seems to me that AGI is just around the corner, but I know I’m probably getting caught up in the hype.”
    • Alternatively: “I feel like…”
    • Feel free to end the sentence with “...but I am not super confident” or “...but I may be wrong.”
  • Subject matter expertise
    • “My experience with X suggests…”
  • Models
    • “The best model, IMO, suggests that…” and “My model is…”
    • (Though beware, I sometimes hear people say “my model is...” when all they really mean is “I think…”)
  • Wild guess (a.k.a. Ass-number)
    • “When I said 50%, that was just a wild guess, I’d probably have said something different if you asked me yesterday.”
  • Intuition
    • “It’s not just an ass-number, it’s an intuition! Lol. But seriously though I have thought a lot about this and my intuition seems stable.”

Conclusion

Whenever you notice yourself saying “outside view” or “inside view,” imagine a tiny Daniel Kokotajlo hopping up and down on your shoulder chirping “Taboo outside view.” 

 

Many thanks to the many people who gave comments on a draft: Vojta, Jia, Anthony, Max, Kaj, Steve, and Mark. Also thanks to various people I ran the ideas by earlier.

23 comments

The book Noise by Daniel Kahneman et al sometimes uses the terms statistical thinking and causal thinking as substitutes for outside and inside views.

These terms seem better at reminding me what the categories are meant to be, and why evolution prepared us less for one of them. But they still leave some confusion about how to draw the boundary between the concepts.

Eliezer also wrote an interesting comment on the EA Forum crosspost of this article, copying it here for convenience:

I worriedly predict that anyone who followed your advice here would just switch to describing whatever they're doing as "reference class forecasting" since this captures the key dynamic that makes describing what they're doing as "outside viewing" appealing: namely, they get to pick a choice of "reference class" whose samples yield the answer they want, claim that their point is in the reference class, and then claim that what they're doing is what superforecasters do and what Philip Tetlock told them to do and super epistemically virtuous and anyone who argues with them gets all the burden of proof and is probably a bad person but we get to virtuously listen to them and then reject them for having used the "inside view".

My own take:  Rule One of invoking "the outside view" or "reference class forecasting" is that if a point is more dissimilar to examples in your choice of "reference class" than the examples in the "reference class" are dissimilar to each other, what you're doing is "analogy", not "outside viewing".

All those experimental results on people doing well by using the outside view are results on people drawing a new sample from the same bag as previous samples.  Not "arguably the same bag" or "well it's the same bag if you look at this way", really actually the same bag: how late you'll be getting Christmas presents this year, based on how late you were in previous years.  Superforecasters doing well by extrapolating are extrapolating a time-series over 20 years, which was a straight line over those 20 years, to another 5 years out along the same line with the same error bars, and then using that as the baseline for further adjustments with due epistemic humility about how sometimes straight lines just get interrupted some year.  Not by them picking a class of 5 "relevant" historical events that all had the same outcome, and arguing that some 6th historical event goes in the same class and will have that same outcome.
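One crude way to make the "Rule One" test in this comment concrete is sketched below in Python. This is an illustrative reading, not anything the comment specifies: it assumes each case can be described by a numeric feature vector, uses Euclidean distance as the dissimilarity measure, and compares averages. The function name `classify_comparison` and the data are hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import mean


def classify_comparison(candidate, examples):
    """Label a comparison 'reference class' or 'analogy' under one reading of Rule One."""
    if len(examples) < 2:
        raise ValueError("need at least two examples to measure within-class spread")
    # Average dissimilarity of the examples to each other.
    within = mean([dist(a, b) for a, b in combinations(examples, 2)])
    # Average dissimilarity of the new point to the examples.
    to_candidate = mean([dist(candidate, e) for e in examples])
    return "analogy" if to_candidate > within else "reference class"


# Hypothetical feature vectors describing three previous cases.
previous_years = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.1)]

print(classify_comparison((1.1, 2.0), previous_years))   # prints "reference class"
print(classify_comparison((8.0, -3.0), previous_years))  # prints "analogy"
```

In practice the hard part is choosing the features and the distance measure, which is exactly where a motivated forecaster can still pick whatever yields the answer they want.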

This post was easy to read. Far easier than a post like this usually manages to be. And it was definitely useful. I realized I have been messing up in some ways described in the post.

The achievement of easiness is due to the use of specific examples everywhere.

I really liked this post, thanks so much for writing it. I have been very frustrated by people conflating these different meanings of "outside view" in the past.

Good post. I myself have gotten into the habit of referring to an outside view instead of the outside view.

I really like this post, it feels like it draws attention to an important lack of clarity.

One thing I'd suggest changing: when introducing new terminology, I think it's much better to use terms that are already widely comprehensible if possible, than terms based on specific references which you'd need to explain to people who are unfamiliar in each case.

So I'd suggest renaming 'ass-number' to wild guess and 'foxy aggregation' to multiple models or similar.

Thanks! Good point, I'll edit those in!

I like 'ass number' because it points at the actual experience / cognitive process behind these numbers. 'Wild guess' is vaguer -- e.g., if I'm using a standard statistical technique to estimate a number from other (observed) numbers, then I wouldn't call that an 'ass number', but I might still call it a 'wild guess' if the output is extremely uncertain.

I always hear "swag" ("scientific wild-ass guess"), which manages to incorporate both "ass" and "wild guess".

It should be mentioned that Eliezer's last (known) big release Inadequate Equilibria was pretty much a correction of pathological outside-viewing. The thrust can be summed up as "sometimes you can't beat the market, sometimes you can, it's important to know which situation you're in instead of just pathetically assuming the former all of the time."

Fwiw I'm not aware of using or understanding 'outside view' to mean something other than basically reference class forecasting (or trend extrapolation, which I'd say is the same). In your initial example, it seems like the other person is using it fine - yes, if you had more examples of an AGI takeoff, you could do better reference class forecasting, but their point is that in the absence of any examples of the specific thing, you also lack other non-reference-class-forecasting methods (e.g. a model), and you lack them even more than you lack relevant reference classes. They might be wrong, but it seems like a valid use. I assume you're right that some people do use the term for other stuff, because they say so in the comments, but is it actually that common?

I don't follow your critique of doing an intuitively-weighted average of outside view and some inside view. In particular, you say 'This is not Tetlock’s advice, nor is it the lesson from the forecasting tournaments...'. But in the  blog post section that you point to, you say 'Tetlock’s advice is to start with the outside view, and then adjust using the inside view.', which sounds like he is endorsing something very similar, or a superset of the thing you're citing him as disagreeing with?

Thanks for the pushback!

Fwiw I'm not aware of using or understanding 'outside view' to mean something other than basically reference class forecasting (or trend extrapolation, which I'd say is the same). ... I assume you're right that some people do use the term for other stuff, because they say so in the comments, but is it actually that common?

Fair enough if your experience is different than mine. Lots of people I've talked to seem to have had experiences similar to mine. To be clear it's not just that I've seen other people abusing the concept, I think I've caught myself doing it too on several occasions. I think quite a lot of things can be thought of as a form of reference class forecasting if you stretch it enough, but that's true of a lot of things, e.g. quite a lot of things can be thought of as a form of modelling, or logical deduction, or intuition, if you stretch those concepts far enough.

In your initial example, it seems like the other person is using it fine - yes, if you had more examples of an AGI takeoff, you could do better reference class forecasting, but their point is that in the absence of any examples of the specific thing, you also lack other non-reference-class-forecasting methods (e.g. a model), and you lack them even more than you lack relevant reference classes. They might be wrong, but it seems like a valid use.

I guess we are getting into the question of what the charitable interpretation of their statement is. We could interpret it in the way you mentioned, but then it would be a pretty big and obvious non sequitur -- Obviously reference classes work less well the fewer examples of the thing (and things similar to the thing) you have, but part of what's interesting about other things (e.g. deduction, gears-level models) is that they often work fine in completely novel cases. For example, the first human landing on the Moon was not achieved via reference classes, but via gears-level modelling.

I don't follow your critique of doing an intuitively-weighted average of outside view and some inside view. In particular, you say 'This is not Tetlock’s advice, nor is it the lesson from the forecasting tournaments...'. But in the  blog post section that you point to, you say 'Tetlock’s advice is to start with the outside view, and then adjust using the inside view.', which sounds like he is endorsing something very similar, or a superset of the thing you're citing him as disagreeing with?

Do you follow it in the case where "outside view" has the expansive, ambiguous new meaning I've complained about? I feel like it's pretty clear that's not what Tetlock meant.

In the case where "outside view" means reference classes, here are my complaints:

1. I don't think the "only use inside view if you feel like you are a qualified expert" bit is justified by Tetlock's advice. He tells everyone to adjust with inside view, and he has all this evidence about how experts suck when they use inside view.

2. "Have some outside views, and some inside views, and then aggregate them" is not the same thing as "start with an outside view, then adjust it, then get an entirely different perspective and repeat, then aggregate the perspectives." And note that even that is not exactly what he said IMO; my summary of his advice at the time was as follows:

Tetlock describes how superforecasters go about making their predictions. Here is an attempt at a summary:
1. Sometimes a question can be answered more rigorously if it is first “Fermi-ized,” i.e. broken down into sub-questions for which more rigorous methods can be applied.
2. Next, use the outside view on the sub-questions (and/or the main question, if possible). You may then adjust your estimates using other considerations (‘the inside view’), but do this cautiously.
3. Seek out other perspectives, both on the sub-questions and on how to Fermi-ize the main question. You can also generate other perspectives yourself.
4. Repeat steps 1 – 3 until you hit diminishing returns.
5. Your final prediction should be based on an aggregation of various models, reference classes, other experts, etc.

Notice how step 1 is a very inside-viewy sort of thing; you are reasoning about the structure of the thing, breaking it down into parts, etc. I imagine the reason to do this is that often it's possible to find good reference classes for (or other rigorous ways to estimate) the parts, but not the whole. Perhaps there's also a motivation about making errors cancel out, idk. Anyhow, my point is that this methodology (which is a fuller and less lossy version of Tetlock's advice than the slogan "start with outside view, then adjust") is importantly different from "Have an outside view, and an inside view if you are an expert, and then aggregate them."
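The five-step summary above can be sketched in Python to show how it differs structurally from "one outside view plus one inside view." This is a toy under loud assumptions that are mine, not Tetlock's: the main question is treated as requiring every sub-question to come out true, the sub-questions are treated as independent, "adjust cautiously" is modelled as a capped shift, and the final aggregation is a plain mean. All names and numbers are hypothetical.

```python
from statistics import mean


def clamp(p):
    """Keep a probability inside [0, 1]."""
    return min(1.0, max(0.0, p))


def adjusted(base_rate, adjustment, max_shift=0.1):
    """Step 2: start from a base rate, then adjust cautiously (the shift is capped)."""
    shift = max(-max_shift, min(max_shift, adjustment))
    return clamp(base_rate + shift)


def fermi_estimate(sub_questions):
    """Step 1: combine sub-question estimates.

    Assumes the main event needs every sub-event and that they are independent,
    so the probabilities multiply. Each item is (base_rate, adjustment).
    """
    p = 1.0
    for base_rate, adjustment in sub_questions:
        p *= adjusted(base_rate, adjustment)
    return p


def aggregate(perspectives):
    """Step 5: combine estimates from different perspectives (here, a plain mean)."""
    return mean(perspectives)


# Steps 3 and 4: two hypothetical ways of breaking down the same question.
perspective_a = fermi_estimate([(0.5, 0.05), (0.4, -0.3)])
perspective_b = fermi_estimate([(0.2, 0.0)])
final = aggregate([perspective_a, perspective_b])
print(f"{final:.3f}")
```

Note that each perspective already mixes a decomposition, base rates, and adjustments before any aggregation happens, which is the difference from simply averaging an "outside view" number with an "inside view" number.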

Curated. The overall point about jargon-creep/confusion/misuse of Outside/Inside Views seems important. The specific suggestion of specificity seems potentially valuable, but I've appreciated some of the comments giving pushback or alternate takes on what to do about it. I'm hoping this curation prompts some further discussion of how to approach this.

Great post. That Anakin meme is gold.

“Whenever you notice yourself saying ‘outside view’ or ‘inside view,’ imagine a tiny Daniel Kokotajlo hopping up and down on your shoulder chirping ‘Taboo outside view.’”

Somehow I know this will now happen automatically whenever I hear or read “outside view.” 😂

Love this post! So linear and so many examples made it so easy to read! Also I was vaguely annoyed at the term Outside View but didn’t know why or whether I was right or anything? This expansion of it into parts makes a lot of sense.

Earlier proposal for more precise terminology: Gears Level & Policy Level by Abram Demski.

I'm not sure if outside (ha!) the "rationalist sphere", other people have independently invented the phrase "outside view" or not but I feel there's some spillover of the term "outside view" with similarity to "outsiders' view" which I think is common enough in layperson speak.

An outsider's view (or third party view, attempt to be objective and look "outside" your current situation as an "other") as conceived of in daily life does have elements that are pointed at in this post for popular interpretations of outside view ("Bias correction, in others or in oneself", "Deference to wisdom of the many"). 

But it could also heavily involve the original meaning described and clarified in this post, "Reference class forecasting", if outsiders can offer broader views by adding to the reference class.

For example, in an argument where two people are fighting over (thing), say a married couple bickering or two friends whose relations have soured due to some problem, the two parties with vested interests may think their struggle is unique and particular, but a neutral third party or outsider can often (though not necessarily) have a better, objective view because they've also seen enough different fights over (thing) that the two involved have not seen before.

Bravo, this is on the meta level a great example of applying epistemic rationality to replace a vague concept with better concepts. The post uses specific examples everywhere to be clearly understandable and easy to apply. It could be part of my specificity sequence, with a title like “The Power to Clarify Concepts”. https://www.lesswrong.com/posts/XosKB3mkvmXMZ3fBQ/specificity-your-brain-s-superpower

Is there a reason it shouldn't be a part of that sequence?

This is great. Feels like a very good catch. Attempting to start a comment thread doing a post-mortem of why this happened and what measures might make this sort of clarity-losing definition drift happen less in the future.


One thing I am a bit surprised by is that the definition on the tag page for inside/outside view was very clearly the original definition, and included a link to the Wikipedia for reference class forecasting in the second sentence. This suggests that the drifted definition was probably not held as an explicit belief by a large number of highly involved LessWrongers. This in turn makes two different mechanisms seem most plausible to me:

  1. Maybe there was sort of a doublethink thing going on among experienced LW folks that made everyone use "outside view" differently in practice than how they explicitly would have defined it if asked. This would probably be mostly driven by status dynamics, and attempts to solve would just be a special case of trying to find ways to not create applause lights.
  2. Maybe the mistake was mainly among relatively new/inexperienced LW folks who tried to infer the definition from context rather than checking the tag page. In that case, attempts to solve would mostly look like increasing the legibility of discourse within LW to new/inexperienced readers, possibly by making the tag definition pages more clickable or just decreasing the proliferation of jargon.

The way I understand it is that 'outside view' is relative, and basically means 'relying on more reference class forecasting / less gears-level modelling than whatever the current topic of discussion is relying on'. So if we're discussing a gears-level model of how a computer chip works in the context of whether we'll ever get a 10 OOM improvement in computing power, bringing up Moore's law and general trends would be using an 'outside view'.

If we're talking about very broad trend extrapolation, then the inside view is already not very gears-level. So suppose someone says GWP is improving hyperbolically so we'll hit a singularity in the next century. An outside view correction to that would be 'well for x and y reasons we're very unlikely a priori to be living at the hinge of history so we should lower our credence in that trend extrapolation'. 

So someone bringing up broad priors or the anti-weirdness heuristic if we're talking about extrapolating trends would be moving to a 'more outside' view. Someone bringing up a trend when we're talking about a specific model would be using an 'outside view'. In each case, you're sort of zooming out to rely on a wider selection of (potentially less relevant) evidence than you were before.

 

Note that what I'm trying to do here isn't to counter your claim that the term isn't useful anymore but just to try and see what meaning the broad sense of the term might have, and this is the best I've come up with. Since what you mean by outside view shifts dependent on context, it's probably best to use the specific thing that you mean by it in each context, but there is still a unifying theme among the different ideas.

Reference class forecasting is k-way regression, right?

One issue is that recent events - the pandemic, cryptocurrency - seem to just be "off the graph" events. You can try to use the "Spanish flu" as a predictor for the pandemic but it was so far away in time and world structure as to be useless. Cryptocurrency can be compared to the Tulip mania and other bubbles but again it's not the same.

We can't predict something then with this method if we don't have references.

Well, sorta. For my entire lifespan the science press has been full of breathless optimism. A professor somewhere wrote a paper and got something to vaguely work. And thus flying cars and cyborgs or free energy is 5 minutes away!

Obviously nothing came out of any of that. The things that led to progress had money - gigadollars - behind them. Like these white LEDs and the chip in the device I use to write this message and its OLED screen and so on. And it took years and years and many generations of the tech past the breathless article stage - at least 20 years for OLED - to not suck.