I found this to be much better thought out, better explained, better reasoned, and just plain more fun than all but the first few chapters of the actual book. In my mind these are just better examples, and examples that Eliezer understands better and for which he can present better and more accurate evidence.
It's almost as if the book was written, using sufficiently modest examples and ludicrously charitable assumptions, so Eliezer would feel he has the right to say the things here, the things he actually means to say, and that this is the real point. That would help explain why (at least to me) the middle part of the book felt like it was flailing around and didn't make sense. It wasn't trying to. And so, at the right level of viewing, that too serves as an example of the problems at hand.
And of course, even then, this doesn't go far enough. You don't just need a hero license. You need an occupational license. You need a thinking license. You need a knowing-anything-at-all license. And you need to not be caught not enforcing the licensing agreements, at an arbitrary meta level, or to be caught being motivated by something other than such enforcemen...
Additional prediction: it was more fun to write this than the book, and the writing involved an initial long contiguous chunk.
(Where it's coming from: my enjoyment of reading the above was mostly a little bit of thrill, of the kind I get from watching someone break rules that I always wished I could break. If that makes sense.)
Counterpoint: I got bored reading this about halfway through, and my guess is that the sort of person who actually needs the concepts is more likely to close the browser indignantly.
I think Zvi's description of what's going on sounds accurate, but I think it was actually necessary to write InEq in (roughly) the manner it was written.
((Counter-counter-point: I assume dramatically less time was put into this, and that among other things it wasn't broken into chapter segments that made it more easily consumable, which is most of the reason I got bored))
There is a need for a certain amount of repetition to make the point, but no question this has some amount of apology for writing a long letter because there wasn't time to write a shorter one. That extra length probably was worse because you'd already read the book, so you mostly knew where it was going.
I'm sad and somewhat scared that chapter segments make that much difference, but I can believe it, and I'm also happy since we can just use them. Good job being self-aware and pointing it out explicitly. Weird that it might be a big win to literally spend five minutes adding "I", "II", "III" etc as line breaks.
It is now the future, and I was rereading the post without remembering that there was ever a time when there were no chapter breaks, and I found the chapter breaks quite reassuring, giving me bits of "ah, I have made some progress!"
Huh, the chapters thing doesn't feel especially sad to me. Or, it's sad in the sense that it'd be better if all humans were immune to all disease and could fly and were twice as smart, but given the general shape of the world and our brains, it seems reasonable to take into consideration...
1) working memory
2) how much time people have to read a thing at a time
3) expectations about how long a given thing will take
Like, if I start reading a blogpost, I expect it to be blogpost length, and that's how much time I'm budgeting for it. If it's longer than blogpost length, then once I run out of time I start thinking "okay, is this worth putting in the effort to remember to read later, and to remember where I was when I left off, or to postpone whatever I was going to do next instead of finishing it?"
Designing a post to accommodate these seems quite reasonable to me.
(See also Scott's post on writing nonfiction)
http://slatestarcodex.com/2016/02/20/writing-advice/
((Again, assuming this is all stuff Eliezer knows and generally takes into account, and that the issue here was that this post was coming out of the "off hours bucket"))
A thing that feels sad to me (in the vein of chapter sections being necessary): the fact that the majority of all long-form content that I and anybody I know take in is SSC, WaitButWhy, and John Oliver - the only stuff where the hedonic hit rate is at least once per paragraph/breath. I can't remember the last time I read a book (yes I can, it was 4 months ago when I read InEq in a single sitting. But I really can't remember before that.).
I get the sense that in the past many people read books, and now I know very few people who read whole books. I remember as a teenager I used to read many non-fiction books. I have maybe read one a year for the past three years.
Quick Googling says that your experience was below average as of a few years ago, with the median person reading six books, 72 percent reading at least one, 60 percent reading at least one fiction and 60 percent reading at least one non-fiction, so one per year is well below the implied mean. So this seems like a 'local' problem. I'd definitely call it a problem, and consider myself to be reading far fewer full books than I should versus too much other stuff (and to not be writing enough reviews even of the ones I do read).
Although 'keep the tab open until later even on a phone' seems perfectly reasonable
Dunno. I regularly just close my browser completely.
I'm super pro long-form stuff, but longform stuff still should be optimized for being longform. A 10k blogpost without sections suffers from not even letting you find/trust where you left off (even if you're just looking for comments).
I honestly felt like it got better as the post went on. The middle was the most boring part, though, even as someone who enjoyed it.
Having actually read the whole thing now:
I feel like this was doing a fairly different thing than InEq itself (or at least, the effect it had on me was pretty different)
This post mostly made me think about status dynamics within myself, Eliezer, and other people. I think this was useful to think about, but as Eliezer notes somewhere in the middle of this: it feels like this is focusing our attention disproportionately on the wrong half of the equation. (I don't think we should think zero about the status thing, but it seems like it should be something like a 70/30 ratio of "thinking about system behaviors we can change" and "thinking about how status in particular works".)
Whereas InEq focused my attention on "How do I actually notice when situations are likely to be inadequate? When can I reasonably expect outsized success for my effort?"
I think of this as a call to temporarily focus on that part of the equation so one can realize that this is what most modesty arguments are actually motivated by, and that such arguments are not useful in dealing with the object level so you should ignore them in favor of examining the object level directly. It's a call to stop paying attention to status arguments (including when they take the form of modesty arguments) so you can see the object level.
Then there's the mostly separate question of using status and social dynamics as tools to analyze the failures of systems, which I agree is a useful tool that shouldn't take up that much of our time.
Datapoint: Reading InEq changed my thinking habits when evaluating projects much more than this essay did. This essay mostly reminded me to be surprised at the success of HPMOR, and think about all the useful cognitive updates Pat Modesto would make if he were to fully update on the datapoint of HPMOR.
Okay, I was kinda bored while reading this, but after reading it I asked myself how much modest epistemology I used in my life. I realized I wasn't even at the level of ignoring my immodest inside-view estimates; I wasn't generating them!
I'm now in the process of seriously evaluating the success chances of the creative ideas I've had over the years, which I'm realizing I never actually did. I put real (though hobby-level) work into one once, and I've long regarded quitting my day job someday as "a serious possibility", but I just felt not allowed to generate an honest answer to "how likely would this be to succeed".
And guess what, this evaluation shows I'm an idiot for keeping my ideas on the back burner as much as I have.
One important piece of information that Pat doesn’t seem to be registering is the part where Eliezer is already working on the project and feels that it’s going well. This added fact contains a lot of important information and Pat glosses over it.
If a physicist friend tells me they think they could probably solve [longstanding important physics problem] if they tried, I will be about as skeptical as if they claimed that they could do some other arbitrary difficult thing. If they tell me they have the solution to [longstanding important physics problem] clearly in mind, they’ve been working on developing the details and that work is going well, and they’re 10% confident that they will succeed, then I’ll probably update pretty far towards their own estimate.
I agree with this. There are a lot of little clues like this pointing in one direction or the other. I think a lot of people have a learned helplessness about noticing and combating biases (both in themselves, and in other people), and that this results in people under-updating on local evidence that someone is more/less likely to be dissembling or deceiving themselves.
I'm dealing with cognitive processes in my environment that are frequently adversarial and trying to "cheat" (if only unconsciously), but I nonetheless have to be able to update on evidence against rationalization just as readily as evidence for rationalization. (And in my case, I have to try to consciously correct for tendencies like "I'm more likely to be sensitive to evidence of cheating, and insensitive to evidence of non-cheating, when the person claiming to be able to outperform feels low-status to me, or when it feels like they're doing something socially risky.")
>This added fact contains a lot of important information and Pat glosses over it.
Ah, but this fact isn't third-party visible, which is one of the requirements Pat has before they'll accept it as valid evidence. (Since, you know, anyone can say that they feel it's going well, right? What makes you feel especially confident about your own judgment as to whether or not a given project is going well?)
Hmm...
Looking at a dialogue like this one, I am tempted to try and steelman the mental motion that wants to defend the status hierarchy from upstarts. Like, when I step into Pat or Maude's mental shoes, what do I want to defend? How are they right?
(This seems like exactly the sort of situation where their action is attempting to avoid a bucket error. Can I disentangle the buckets and get the value of what Eliezer is pushing for while defending the thing that matters to Pat and Maude?)
When I do that I'm reminded of an unusual meetup I attended some years ago, hosted by a rationalist-adjacent meditation expert. Most of the attendees were rationalists, but one person was a newbie, a new-agey type that the host had met on OkCupid. I only talked with them a little, but I got the sense that they maybe believed in the literal physical reality of psychic energy and whatnot.
One of the rationalists (an academic himself) was talking about the ways science is broken: p-hacking, and replication crises, and misaligned incentives. I thought all of this was correct, but I was also concerned about the impact it was likely to have on the new-agey person. I was concerned that they would take th...
So... Longtime lurker, made an account to comment, etc.
I have a few questions.
First two, about innate status sense:
* I'm not convinced it exists; is there a particular experiment (thought or otherwise) that could clearly demonstrate the existence of innate status sense among people? Presuming I don't have it, and I have several willing, honest, introspective, non-rationalist, average adults, what could I ask them?
* Is there a particular thought experiment I could perform that discriminates cleanly between worlds in which I have it and worlds in which I don't?
Next, about increasing probability estimates of unlikely events based on the outside view:
* This post argues against "Probing the Improbable" and for "Pascal's Muggle: Infinitesimal ..."; having skimmed the former and read the latter, I'm not clearly seeing the difference. Both seem to suggest that after using a model, implicitly or explicitly, to assign a low probability to an event, it is important to note the possibility that the model is catastrophically wrong and factor that into your instrumental probability.
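As a rough sketch of the shared idea described in the bullet above (the notation and the numbers are illustrative assumptions, not taken from either piece): write $M$ for "the model is sound" and $X$ for the event. The probability you act on then mixes the in-model estimate with whatever you would believe if the model were catastrophically wrong.

```latex
% M = "the model is sound"; X = the event of interest.
% All numbers are hypothetical, chosen only to show the effect.
\begin{align*}
P(X) &= P(X \mid M)\,P(M) + P(X \mid \neg M)\,P(\neg M) \\
     &= 10^{-9}\,(1 - 10^{-3}) + 10^{-2}\cdot 10^{-3} \\
     &\approx 10^{-5}
\end{align*}
```

Under these made-up numbers the second term dominates: the final estimate is set almost entirely by the chance that the model is wrong, not by the tiny probability the model itself assigns.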
It seems like the correct answer to the question "how come you think you're good enough to do this?" is just silence, right?
Later on, you'll either succeed or fail. If you succeed, you were clearly up to the task. If you fail, it will often be convenient to have not boasted earlier. (But it's still not a big deal if you did boast; after all, it's not a sin to make an incorrect prediction.)
Like, who has the authority to say "thou shalt not try things that might fail"? As long as you're not conning anybody out of resources, your failure doesn't pick anybody else's pocket.
It seems like writing this whole post is an exercise in arguing that you do deserve people's good opinion, respect, etc, even if you're not modest. But honestly, it's not fair to demand that people think well of you. People can think whatever they want. And their good opinion can't really nourish you emotionally that well in the first place, in my experience. Validation is just...unusable, in the same way you can't digest cellulose.
I think you're right in general.
However in specific unusual situations silence might not work... like if you're talking to potential investors (or philanthropists) and they ask "How come you think you're good enough to do this [thing that you want us to partially fund]?"
If I understand correctly, Eliezer decided at a young age to work on a public good whose value would be difficult (or evil) to reserve only to those who helped pay to bring it about, and which was unintelligible to voters, congress critters, the vast majority of philanthropists, and even to most of the high prestige experts in the relevant technical fields.
Having tracked much of his online oeuvre for approaching two decades, I say that arguably his biggest life accomplishment has been the construction of an entire subcultural ecosystem wherein the thing he aspired to spend his life on (i.e. building Friendly AGI) is basically validated as worth donating to.
There is still the question of whether the existence of such a culture is necessary or sufficient to actually be safe from "unaligned AGI" or "grey goo" or various other scary things (because at some point the rubber wi...
In particular, I think the key distinction is between "I demand you justify yourself to me" and "I would appreciate if you could help satisfy my curiosity". Even if the person is a potential investor it's best to decline to jump through hoops and wait for them to shift to genuine curiosity.
If someone asks "how come you think you're good enough to do this?", I generally interpret this as "You seem to be implying that I should see you as high status. If you are going to demand I see you as high status then I counter-demand that you back it up. If you don't back up your active bids for status, I will conclude that you're a faker and declare you to be low status". The correct response to this is to not try to control your status in their mind in the first place. In response to this question, I'd probably go with something like "I might not be. I don't know." and emphasize that this is a very real thing to me. Showing agreement and real weight to the idea that you might not deserve the status claim they see implied is something that you can't do if you're trying to make a grab for the status, so ...
>Like, who has the authority to say "thou shalt not try things that might fail"? As long as you're not conning anybody out of resources, your failure doesn't pick anybody else's pocket.
What about altruistic reasons for asking? If my friend is planning to quit their job and become a famous musician I would probably attempt strongly to dissuade them, even if it wouldn't directly affect me.
If however I thought they were likely to succeed (e.g. have made money selling music on bandcamp and performing, in talks with a record company, etc.) I probably wouldn't dissuade them.
Wow. This was great. I really liked sitting and thinking through the update that Pat would/should make on observing the in-fact outcome of HPMOR. I have in my head the feelings associated with generating Pat's arguments, and I have in my head the notion of 'keeping your eye on the ball' - looking at all the evidence and models of the thing you have - and realising that HPMOR worked helps me flag the feels of the first notion as 'unhelpful', and direct my thoughts towards the object level evidence (I also expect it's a really useful example for many people in this community - especially those who were around before HPMOR succeeded). I greatly appreciate having a bunch of these fully-general counterarguments flagged in my head as 'wasted cognition'.
This line was super helpful, in being able to pass the ITT of Pat and Maude:
>The modest view, roughly, is that the world is inexploitable as far as you can predict, because you can never knowably know better than the experts.
And also:
>A voice like Pat Modesto is not a productive voice to have inside your head, in my opinion, so I don’t spontaneously wonder what he would say.
In the past I've not...
I'm not sure whether this is helpful, or whether you're aware, but a Harry Potter fic called Rebuilding by Colubrina has more reviews than Harry Potter and the Methods of Rationality. That fic wasn't written in 2014, when the above dialogue was written, and I don't know when it overtook HPMOR, but "most reviewed" gets used somewhat frequently in this dialogue, and I've seen that metric used a number of times elsewhere,
While reading this, a thought popped into my head that feels important enough to share:
Could being "status-blind" in the sense that Eliezer claims to be (or perhaps some other not yet well-understood status-related property) be strongly correlated with managing to create lots of utility? (In the sense of helping the world a lot).
Currently I consider Yudkowsky, Scott Alexander, and Nick Bostrom to be three of the most important people. After reading Superintelligence and watching a bunch of interviews, one of the first things I said about Nick Bostrom to a friend was that I felt like he legitimately has almost no status concerns (that was well before LW 2.0 launched). In the case of S/A it's less clear, but I suspect similar things.
Yes. I think that being status blind is a huge advantage in having original thoughts and doing original work, and it was and is a huge help to those three in particular and to a number of others in the community. I used to be quite status blind, I'm at least less status blind now, and it's certainly brought huge benefits but I notice the extra difficulty in noticing and solving object-level things, and I'm sad about it. Ideally one is able to recapture that blindness when one needs it, on the levels one needs it and only those levels, but man is that hard.
There's at least one person I think is still getting big benefits from it, and the last thing I would do is make them aware of it!
I don't think "status blind", the thing that leads to thoughts like "I don't get how/why people keep bringing status into this instead of just looking at the arguments", is what it is made out to be.
I used to be that kind of person. Back in college, my thesis adviser told me that he liked that I was willing to tell him when he was wrong, and that most of his students wouldn't do that. (That was the upside; there were downsides too.) It completely blew my mind because I literally couldn't grasp how it could be any other way. When I would point out that he was wrong, it wasn't a sneaky way of saying "I'm smarter and higher status than you"; it was about the physics, and status games just weren't the level we were speaking on.
However that doesn't mean that "status" wasn't a useful concept for describing how I interacted with people in general or that it wasn't important to my interactions with him, even. It just means that the status requirements were satisfied, so that we could get to the good stuff. He saw himself as someone who was "smart, but not above making mistakes" and saw his stu...
Yeah, a better way of gesturing at what Eliezer means by "status-blind" might be "doesn't reflexively assign status or deference to people based on a felt sense of how authoritative/respectable/impressive people are likely to view them as being."
As a "status-sighted" person, I don't think the difference feels internally like a distinct "emotion"; it feels more like people's impressiveness is just baked into the world as an obvious, perceptible fact. It's just goddamn different to meet a senator in full regalia versus meet the head of a local anarchist group.
If I weren't deliberately trying to consider counterfactuals, though, in the moment I don't think I'd ever consciously register that I'm unintentionally treating the senator differently because they just feel high-status to me. I might notice an isolated deferential act and rationalize it as a useful strategy, but that's very different.
Indeed, I think that's another factor that makes it really hard for me to notice when my behavior is authoritativeness-influenced—it doesn't feel subjectively distinct from when part of my mind is quietl...
>Yeah, a better way of gesturing at what Eliezer means by "status-blind" might be "doesn't reflexively assign status or deference to people based on a felt sense of how authoritative/respectable/impressive people are likely to view them as being."
Yes, but I think the difference is in "how people are likely to view them" vs "how I see them", and not in "doesn't reflexively assign status or deference".
>As a "status-sighted" person, I don't think the difference feels internally like a distinct "emotion"; it feels more like people's impressiveness is just baked into the world as an obvious, perceptible fact. It's just goddamn different to meet a senator in full regalia versus meet the head of a local anarchist group.
This is what I meant when I said that status is invisible when agreed upon. In "competent elites", Eliezer wrote that he was expecting to find "fools in business suits" and was shocked that these people were "visibly much smarter than average mortals" and felt "more alive", even. This was two years before the "status blind" ...
Generally, when I ask the sort of questions Pat Modesto does it’s because I want to see proof that someone has really thought a course of action through. It’s not about status; I’m asking to see your hero license because I want a costly signal that you’re not full of shit. In your post you write the following in argument against:
>stranger: Wrong. Your model of heroic status is that it ought to be a reward for heroic service to the tribe. You think that while of course we should discourage people from claiming this heroic status without having yet served the tribe, no one should find it intuitively objectionable to merely try to serve the tribe, as long as they’re careful to disclaim that they haven’t yet served it and don’t claim that they already deserve the relevant status boost.
Actually, this is in fact largely how it works as far as I know. However, helping costs resources. So any time I have to consider whether to help you I need to evaluate your chances of success. Without knowing very much about your personality, which is very different from your goals, I have no way of knowing whether you’re the type to bullshit yourself. That’s why having a variant of Pat Modesto in your he...
That was surprisingly good. I've never let my inner Pat Modesto be the boss, but I've never tried to kick them out either. This makes me consider whether I should. Which is a lot more than I get out of most of Eliezer's writing.
And here's what kicking Pat out would let me say: I think that I've designed at least 5 voting methods that are each the best solution currently in the world to the problem it solves, and at least 3 of those problems are at least 25% likely to be adequately-posed (including pragmatic considerations) and important (fixing would be roughly order of one-off value of $1e12, with SD 1 in the exponent). I think that if you find this sufficiently plausible you should contact me.
Having Pat in your head doesn't feel like a separately identifiable agent process that I can just discard; it just feels like typical hyper-self-awareness where you think of status trespassers like Eliezer-2010 as cornball or cringey, like a Kanye West figure. Status regulation is an important part of my personality; it makes me seem more authentic and likeable to others, and it's a trait I find attractive in others.
I would also add that a big part of the Pat problem is motivation/energy. It's more comfortable to believe the Pat narrative, it requires less work. I too find his talk of the "outside view" tedious though and I don't particularly believe him to be right. Hopefully by identifying him, something can be done.
Looks like you follow this method of making decisions:
1) Focus on important things.
2) If something is working unusually well for you, and it could be important, disregard (1) and do it.
But many successful people get by with a simpler version:
1) If something is working unusually well for you, do it!
The hard part for most people is finding something that works unusually well for them. It's probably unique for each person, and finding it is a more intuitive than rational process. But I certainly agree with the book's message that if you've found something like that, you shouldn't stop yourself from doing it!
I feel like after reading this I have a much better insight into how Eliezer thinks than I did before, even having read most of his published work.
I think his model of other people is off though.
Specifically, he uses ideas of comparative status to explain other people not challenging conventional wisdom, or trying new things a lot. Which feels like it could be a fully general argument for any observed behaviour (e.g. it could equally well explain a habit of disproportionately challenging experts, as being in conflict with them puts you at their level a...
Pat and Maude's arguments seem somewhat more reasonable if they're essentially saying "if you're so smart, why aren't you high-status?" Since nearly everyone (including many people who explicitly claim not to) places a high value on status at least within some social sphere, and status is instrumentally useful for so many goals even if you don't value it terminally, a human can be assumed to already be trying as hard as they can to increase their status, and thus it's a decent predictive proxy for their general ab...
I think the arguments here apply much better to the AGI alignment case than to the case of HPMOR. The structure of the post suggests (? not sure) that HPMOR is meant to be the "easier" case, the one in which the reader will assent to the arguments more readily, but it didn't work that way on me.
In both cases, we have some sort of metric for what it would mean to succeed, and (perhaps competing) inside- and outside-view arguments for how highly we should expect to score on that metric. (More precisely, what probabilities we should assign to...
You spend a lot of time arguing "immodest" over "modest" epistemology here, when the thing that really gets me is the hedgehog vs. fox epistemology. I kept wanting to say yes, Pat's right that outside view should lower the probability in certain situations, and yes, Eliezer's right that the inside view should raise his probability of beating the odds in certain situations, and that they should look at this individual situation to see how much each should apply.
I wanted to say that YES, Eliezer's view that outside view ar...
I don't see it. Maybe you think fox epistemology wouldn't donate to MIRI, which is presumably what Eliezer cares about? But what he claims repeatedly is that we should judge situations just as you say, and he offers a way to do this.
Is Pat poorly calibrated though? I don't think he is. I don't see anything in the text to suggest otherwise. If you're going to criticize Pat's decision process, I would hope the first argument to be "It doesn't work well". If it does, in fact, work, then maybe you're the one with flawed reasoning.
The arguments why Aumann's agreement theorem doesn't apply need a lot more work. It's a pretty big claim.
The whole "hero license" thing is indeed ad hominem. Pat is not demanding to see your hero li...
>I have never yet seen an informal conjunctive breakdown of an allegedly low probability in which the final conclusion actually required every one of the premises.
How about the Drake equation?
It is ambiguous whether it allows panspermia, but I think it holds up pretty well.
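For reference, a sketch of why the Drake equation is a candidate here. In its standard form it is a pure product, so the conclusion depends on every factor, and setting any single factor to zero drives $N$ to zero. The symbol glosses below are the usual textbook ones, not something stated in the post.

```latex
% Standard form of the Drake equation.
% N   : number of detectable civilizations in the galaxy
% R_* : average rate of star formation
% f_p : fraction of stars with planets
% n_e : habitable planets per star that has planets
% f_l : fraction of those on which life arises
% f_i : fraction of those that develop intelligence
% f_c : fraction of those that emit detectable signals
% L   : length of time such signals are emitted
\[
N = R_{*} \cdot f_{p} \cdot n_{e} \cdot f_{l} \cdot f_{i} \cdot f_{c} \cdot L
\]
```

The panspermia ambiguity would plausibly show up in $f_{l}$: if life can spread between planets, that factor is no longer independent from one planet to the next.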
One thing I think is probably true: unless you're unusually competent or unusually driven, you should look for areas where you can "charge ahead with reckless abandon", and you should try to cultivate a sense of quality in those areas, possibly by transfer learning; then, having done so, you can exploit your ability to charge ahead in that direction to effectively build skill. In this model, directions to charge in are not something you can reasonably choose, but rather become available to you through chance. If this holds, not everyone can ...
I expect most readers to know me either as MIRI's co-founder and the originator of a number of the early research problems in AI alignment, or as the author of Harry Potter and the Methods of Rationality, a popular work of Harry Potter fanfiction. I’ve described how I apply concepts in Inadequate Equilibria to various decisions in my personal life, and some readers may be wondering how I see these tying in to my AI work and my fiction-writing. And I do think these serve as useful case studies in inadequacy, exploitability, and modesty.
As a supplement to Inadequate Equilibria, then, the following is a dialogue that never took place—largely written in 2014, and revised and posted online in 2017.
i. Outperforming and the outside view
(The year is 2010. eliezer-2010 is sitting in a nonexistent park in Redwood City, California, working on his laptop. A person walks up to him.)
person: Pardon me, but are you Eliezer Yudkowsky?
eliezer-2010: I have that dubious honor.
person: My name is Pat; Pat Modesto. We haven’t met, but I know you from your writing online. What are you doing with your life these days?
eliezer-2010: I’m trying to write a nonfiction book on rationality. The blog posts I wrote on Overcoming Bias—I mean Less Wrong—aren’t very compact or edited, and while they had some impact, it seems like a book on rationality could reach a wider audience and have a greater impact.
pat: Sounds like an interesting project! Do you mind if I peek in on your screen and—
eliezer: (shielding the screen) —Yes, I mind.
pat: Sorry. Um... I did catch a glimpse and that didn’t look like a nonfiction book on rationality to me.
eliezer: Yes, well, work on that book was going very slowly, so I decided to try to write something else in my off hours, just to see if my general writing speed was slowing down to molasses or if it was this particular book that was the problem.
pat: It looked, in fact, like Harry Potter fanfiction. Like, I’m pretty sure I saw the words “Harry” and “Hermione” in configurations not originally written by J. K. Rowling.
eliezer: Yes, and I currently seem to be writing it very quickly. And it doesn’t seem to use up mental energy the way my regular writing does, either.
(A mysterious masked stranger, watching this exchange, sighs wistfully.)
eliezer: Now I’ve just got to figure out why my main book-writing project is going so much slower and taking vastly more energy... There are so many books I could write, if I could just write everything as fast as I’m writing this...
pat: Excuse me if this is a silly question. I don’t mean to say that Harry Potter fanfiction is bad—in fact I’ve read quite a bit of it myself—but as I understand it, according to your basic philosophy the world is currently on fire and needs to be put out. Now given that this is true, why are you writing Harry Potter fanfiction, rather than doing something else?
eliezer: I am doing something else. I’m writing a nonfiction rationality book. This is just in my off hours.
pat: Okay, but I’m asking why you are doing this particular thing in your off hours.
eliezer: Because my life is limited by mental energy far more than by time. I can currently produce this work very cheaply, so I’m producing more of it.
pat: What I’m trying to ask is why, even given that you can write Harry Potter fanfiction very cheaply, you are writing Harry Potter fanfiction. Unless it really is true that the only reason is that you need to observe yourself writing quickly in order to understand the way of quick writing, in which case I’d ask what probability you assign to learning that successfully. I’m skeptical that this is really the best way of using your off hours.
eliezer: I’m skeptical that you have correctly understood the concept of “off hours.” There’s a reason they exist, and the reason isn’t just that humans are lazy. I admit that Anna Salamon and Luke Muehlhauser don’t require off hours, but I don’t think they are, technically speaking, “humans.”
(The Mysterious Masked Stranger speaks for the first time.)
stranger: Excuse me.
eliezer: Who are you?
stranger: No one of consequence.
pat: And why are you wearing a mask?
stranger: Well, I’m definitely not a version of Eliezer from 2014 who’s secretly visiting the past, if that’s what you’re thinking.
pat: It’s fair to say that’s not what I’m thinking.
stranger: Pat and Eliezer-2010, I think the two of you are having some trouble communicating. The two of you actually disagree much more than you think.
pat & eliezer: Go on.
stranger: If you ask Eliezer of February 2010 why he’s writing Harry Potter and the Methods of Rationality, he will, indeed, respond in terms of how he expects writing Methods to positively impact his attempt to write The Art of Rationality, his attempt at a nonfiction how-to book. This is because we have—I mean, Eliezer has—a heuristic of planning on the mainline, which means that his primary justification for anything will be phrased in terms of how it positively contributes to a “normal” future timeline, not low-probability side-scenarios.
eliezer: Sure.
pat: Wait, isn’t your whole life—
eliezer: No.
stranger: Eliezer-2010 also has a heuristic that might be described as “never try to do anything unless you have a chance of advancing the Pareto frontier of the category.” In other words, if he’s expecting that some other work will be strictly better than his along all dimensions, it won’t occur to Eliezer-2010 that this is something he should spend time on. Eliezer-2010 thinks he has the potential to do things that advance Pareto frontiers, so why would he consider a project that wasn’t trying? So, off-hours or not, Eliezer wouldn’t be working on this story if he thought it would be strictly dominated along every dimension by any other work of fanfiction, or indeed, any other book.
pat: Um—
eliezer: I wouldn’t put it in exactly those terms.
stranger: Yes, because when you say things like that out loud, people start saying the word “arrogance” a lot, and you don’t fully understand the reasons. So you’ll cleverly dance around the words and try to avoid that branch of possible conversation.
pat: Is that true?
eliezer: It sounds to me like the Masked Stranger is trying to use the Barnum effect—like, most people would acknowledge that as a secret description of themselves if you asked them.
pat: ...... I really, really don’t think so.
eliezer: I’d be surprised if it were less than 10% of the population, seriously.
stranger: Eliezer, you’ll have a somewhat better understanding of human status emotions in 4 years. Though you’ll still only go there when you have a point to make that can’t be made any other way, which in turn will be unfortunately often as modest epistemology norms propagate through your community. But anyway, Pat, the fact that Eliezer-2010 has spent any significant amount of time on Harry Potter and the Methods of Rationality indeed lets you infer that Eliezer-2010 thinks Methods has a chance of being outstanding along some key dimension that interests him—of advancing the frontiers of what has ever been done—although he might hesitate to tell you that before he’s actually done it.
eliezer: Okay, yes, that’s true. I’m unhappy with the treatment of supposedly “intelligent” and/or “rational” characters in fiction and I want to see it done right just once, even if I have to write the story myself. I have an explicit thesis about what’s being done wrong and how to do it better, and if this were not the case then the prospect of writing Methods would not interest me as much.
stranger: (aside) There’s so much civilizational inadequacy in our worldview that we hardly even notice when we invoke it. Not that this is an alarming sign, since, as it happens, we do live in an inadequate civilization.
eliezer: (continuing to Pat) However, the reason I hold back from saying in advance what Methods might accomplish isn’t just modesty. I’m genuinely unsure that I can make Methods be what I think it can be. I don’t want to promise more than I can deliver. And since one should first plan along the mainline, if investigating the conditions under which I can write quickly weren’t a sufficiently important reason, I wouldn’t be doing this.
stranger: (aside) I have some doubts about that alleged justification in retrospect, though it wasn’t stupid.
pat: Can you say more about how you think your Harry Potter story will have outstandingly “intelligent” characters?
eliezer: I’d rather not? As a matter of literature, I should show, not tell, my thesis. Obviously it’s not that I think that my characters are going to learn fifty-seven languages because they’re super-smart. I think most attempts to create “intelligent characters” focus on surface qualities, like how many languages someone has learned, or they focus on stereotypical surface features the author has seen in other “genius” characters, like a feeling of alienation. If it’s a movie, the character talks with a British accent. It doesn’t seem like most such authors are aware of Vinge’s reasoning for why it should be hard to write a character that is smarter than the author. Like, if you know exactly where an excellent chessplayer would move on a chessboard, you must be at least that good at playing chess yourself, because you could always just make that move. For exactly the same reason, it’s hard to write a character that’s more rational than the author.
I don’t think the concept of “intelligence” or “rationality” that’s being used in typical literature has anything to do with discerning good choices or making good predictions. I don’t think there is a standard literary concept for characters who excel at cognitive optimization, distinct from characters who just win because they have a magic sword in their brains. And I don’t think most authors of “genius” characters respect their supposed geniuses enough to really put themselves in their shoes—to really feel what their inner lives would be like, and think beyond the first cliche that comes to mind. The author still sets themselves above the “genius,” gives the genius some kind of obvious stupidity that lets the author maintain emotional distance...
stranger: (aside) Most writers have a hard time conceptualizing a character who's genuinely smarter than the author; most futurists have a hard time conceptualizing genuinely smarter-than-human AI; and indeed, people often neglect the hypothesis that particularly smart human beings will have already taken into account all the factors that they consider obvious. But with respect to sufficiently competent individuals making decisions that they can make on their own cognizance—as opposed to any larger bureaucracy or committee, or the collective behavior of a field—it is often appropriate to ask if they might be smarter than you think, or have better justifications than are obvious to you.
pat: Okay, but supposing you can write a book with intelligent characters, how does that help save the world, exactly?
eliezer: Why are you focusing on the word “intelligence” instead of “rationality”? But to answer your question, nonfiction writing conveys facts; fiction writing conveys experiences. I’m worried that my previous two years of nonfiction blogging haven’t produced nearly enough transfer of real cognitive skills. The hope is that writing about the inner experience of someone trying to be rational will convey things that I can’t easily convey with nonfiction blog posts.
stranger: (laughs)
eliezer: What is it, Masked Stranger?
stranger: Just... you’re so very modest.
eliezer: You’re saying this to me?
stranger: It’s sort of obvious from where I live now. So very careful not to say what you really hope Harry Potter and the Methods of Rationality will do, because you know people like Pat won’t believe it and can’t be persuaded to believe it.
pat: This guy is weird.
eliezer: (shrugging) A lot of people are.
pat: Let’s ignore him. So you’re presently investing a lot of hours—
eliezer: But surprisingly little mental energy.
stranger: Where I come from, we would say that you’re investing surprisingly few spoons.
pat: —but still a lot of hours, into crafting a Harry Potter story with, you hope, exceptionally rational characters. Which will cause some of your readers to absorb the experience of being rational. Which you think eventually ends up important to saving the world.
eliezer: Mm, more or less.
pat: What do you think the outside view would say about—
eliezer: Actually, I think I’m about out of time for today. (Starts to close his laptop.)
stranger: Wait. Please stick around. Can you take my word that it’s important?
eliezer: ...all right. I suppose I don’t have very much experience with listening to Masked Strangers, so I’ll try that and see what happens.
pat: What did I say wrong?
stranger: You said that the conversation would never go anywhere helpful.
eliezer: I wouldn’t go that far. It’s true, though, that in my experience people who use the phrase “outside view” usually don’t offer advice that I think is true, and the conversations take up a lot of mental energy—spoons, you called them? But since I’m taking the Masked Stranger’s word on things and trying to continue, fine. What do you think the outside view has to say about the Methods of Rationality project?
pat: Well, I was just going to ask you to consider what the average story with a rational character in it accomplishes in the way of skill transfer to readers.
eliezer: I’m not trying to write an average story. The whole point is that I think the average story with a “rational” character is screwed up.
pat: So you think that your characters will be truly rational. But maybe those authors also think their characters are rational—
eliezer: (in a whisper to the Masked Stranger) Can I exit this conversation?
stranger: No. Seriously, it’s important.
eliezer: Fine. Pat, your presumption is wrong. These hypothetical authors making a huge effort to craft rational characters don’t actually exist. They don’t realize that it should take an effort to craft rational characters; they’re just regurgitating cliches about Straw Vulcans with very little self-perceived mental effort.
stranger: Or as I would phrase it: This is not one of the places where our civilization puts in enough effort that we should expect adequacy.
pat: Look, I don’t dispute that you can probably write characters more rational than those of the average author; I just think it’s important to remember, on each occasion, that being wrong feels just like being right.
stranger: Eliezer, please tell him what you actually think of that remark.
eliezer: You do not remember on each occasion that “being wrong feels just like being right.” You remember it on highly selective occasions where you are motivated to be skeptical of someone else. This feels just like remembering it on every relevant occasion, since, after all, every time you felt like you ought to think of it, you did. You just used a fully general counterargument, and the problem with arguments like that is that they provide no Bayesian discrimination between occasions where we are wrong and occasions where we are right. Like “but I have faith,” “being wrong feels just like being right” is as easy to say on occasions when someone is right as on occasions when they are wrong.
stranger: There is a stage of cognitive practice where people should meditate on how the map is not the territory, especially if it’s never before occurred to them that what feels like the universe of their immersion is actually their brain’s reconstructed map of the true universe. It’s just that Eliezer went through that phase while reading S. I. Hayakawa’s Language in Thought and Action at age eleven or so. Once that lesson is fully absorbed internally, invoking the map-territory distinction as a push against ideas you don’t like is (fully general) motivated skepticism.
pat: Leaving that aside, there’s this research showing that there’s a very useful technique called “reference class forecasting”—
eliezer: I am aware of this.
pat: And I’m wondering what reference class forecasting would say about your attempt to do good in the world via writing Harry Potter fanfiction.
eliezer: (to the Masked Stranger) Please can I run away?
stranger: No.
eliezer: (sighing) Okay, to take the question seriously as more than generic skepticism: If I think of the books which I regard as having well-done rational characters, their track record isn’t bad. A. E. van Vogt’s The World of Null-A was an inspiration to me as a kid. Null-A didn’t just teach me the phrase “the map is not the territory”; it was where I got the idea that people employing rationality techniques ought to be awesome and if they weren’t awesome that meant they were doing something wrong. There are a heck of a lot of scientists and engineers out there who were inspired by reading one of Robert A. Heinlein’s hymns in praise of science and engineering—yes, I know Heinlein had problems, but the fact remains.
stranger: I wonder what smart kids who grew up reading Harry Potter and the Methods of Rationality as twelve-year-olds will be like as adults...
pat: But surely van Vogt’s Null-A books are an exceptional case of books with rationalist characters. My first question is, what reason do you have to believe you can do that? And my second question is, even given that you write a rational character as inspiring as a character in a Heinlein novel, how much impact do you think one character like that has on an average reader, and how many people do you think will read your Harry Potter fanfiction in the best case?
eliezer: To be honest, it feels to me like you’re asking the wrong questions. Like, it would never occur to me to ask any of the questions you’re asking now, in the course of setting out to write Methods.
stranger: (aside) That’s true, by the way. None of these questions ever crossed my mind in the original timeline. I’m only asking them now because I’m writing the character of Pat Modesto. A voice like Pat Modesto is not a productive voice to have inside your head, in my opinion, so I don’t spontaneously wonder what he would say.
eliezer: To produce the best novel I can, it makes sense for me to ask what other authors were doing wrong with their rational characters, and what A. E. van Vogt was doing right. I don’t see how it makes sense for me to be nervous about whether I can do better than A. E. van Vogt, who had no better source to work with than Alfred Korzybski, decades before Daniel Kahneman was born. I mean, to be honest about what I’m really thinking: So far as I’m concerned, I’m already walking outside whatever so-called reference class you’re inevitably going to put me in—
pat: What?! What the heck does it mean to “walk outside” a reference class?
eliezer: —which doesn’t guarantee that I’ll succeed, because being outside of a reference class isn’t the same as being better than it. It means that I don’t draw conclusions from the reference class to myself. It means that I try, and see what happens.
pat: You think you’re just automatically better than every other author who’s ever tried to write rational characters?
eliezer: No! Look, thinking things like that is just not how the inside of my head is organized. There’s just the book I have in my head and the question of whether I can translate that image into reality. My mental world is about the book, not about me.
pat: But if the book you have in your head implies that you can do things at a very high percentile level, relative to the average fiction author, then it seems reasonable for me to ask why you already think you occupy that percentile.
stranger: Let me try and push things a bit further. Eliezer-2010, suppose I told you that as of the start of 2014, Methods succeeded to the following level. First, it has roughly half a million words, but you’re not finished writing it—
eliezer: Damn. That’s disappointing. I must have slowed down a lot, and definitely haven’t mastered the secret of whatever speed-writing I’m doing right now. I wonder what went wrong? Actually, why am I hypothetically continuing to write this book instead of giving up?
stranger: Because it’s the most reviewed work of Harry Potter fanfiction out of more than 500,000 stories on fanfiction.net, has organized fandoms in many universities and colleges, has received at least 15,000,000 page views on what is no longer the main referenced site, has been turned by fans into an audiobook via an organized project into which you yourself put zero effort, has been translated by fans into many languages, is famous among the Caltech/MIT crowd, has its own daily-trafficked subreddit with 6,000 subscribers, is often cited as the most famous or the most popular work of Harry Potter fanfiction, is considered by a noticeable fraction of its readers to be literally the best book they have ever read, and on at least one occasion inspired an International Mathematical Olympiad gold medalist to join the alliance and come to multiple math workshops at MIRI.
eliezer: I like this scenario. It is weird, and I like weird. I would derive endless pleasure from inflicting this state of affairs on reality and forcing people to come to terms with it.
stranger: Anyway, what probability would you assign to things going at least that well?
eliezer: Hm... let me think. Obviously this exact scenario is improbable, because conjunctive. But if we partition outcomes according to whether they rank at least this high or better in my utility function, and ask how much probability mass I put into outcomes like that, then I think it’s around 10%. That is, a success like this would come in at around the 90th percentile of my hopes.
pat: (incoherent noises)
eliezer: Oh. Oops. I forgot you were there.
pat: 90th percentile?! You mean you seriously think there’s a 1 in 10 chance that might happen?
eliezer: Ah, um...
stranger: Yes, he does. He wouldn’t have considered it in exactly those words if I hadn’t put it that way—not just because it’s ridiculously specific, but because Eliezer Yudkowsky doesn’t think in terms like that in advance of encountering the actual fact. He would consider it a “specific fantasy” that was threatening to drain away his emotional energy. But if it did happen, he would afterward say that he had achieved an outcome such that around 10% of his probability mass “would have been” in outcomes like that one or better, though he would worry about being hindsight-biased.
pat: I think a reasonable probability for an outcome like that would be more like 0.1%, and even that is being extremely generous!
eliezer: “Outside viewers” sure seem to tell me that a lot whenever I try to do anything interesting. I’m actually kind of surprised to hear you say that, though. I mean, my basic hypothesis for how the “outside view” thing operates is that it’s an expression of incredulity that can be leveled against any target by cherry-picking a reference class that predicts failure. One then builds an inescapable epistemic trap around that reference class by talking about the Dunning-Kruger effect and the dangers of inside-viewing. But trying to write Harry Potter fanfiction, even unusually good Harry Potter fanfiction, should sound to most people like it’s not high-status. I would expect people to react mainly to the part about the IMO gold medalist, even though the base rate for being an IMO gold medalist is higher than the base rate for authoring the most-reviewed Harry Potter fanfiction.
pat: Have you ever even tried to write Harry Potter fanfiction before? Do you know any of the standard awards that help publicize the best Harry Potter fan works or any of the standard sites that recommend them? Do you have any idea what the vast majority of the audience for Harry Potter fanfiction wants? I mean, just the fact that you’re publishing on FanFiction.Net is going to turn off a lot of people; the better stories tend to be hosted at ArchiveOfOurOwn.Org or on other, more specialized sites.
eliezer: Oh. I see. You do know about the pre-existing online Harry Potter fanfiction community, and you’re involved in it. You actually have a pre-existing status hierarchy built up in your mind around Harry Potter fanfiction. So when the Masked Stranger talks about Methods becoming the most popular Harry Potter fanfiction ever, you really do hear that as an overreaching status-claim, and you do that thing that makes an arbitrary proposition sound very improbable using the “outside view.”
pat: I don’t think the outside view, or reference class forecasting, can make arbitrary events sound very improbable. I think it makes events that won’t actually happen sound very improbable. As for my prior acquaintance with the community—how is that supposed to devalue my opinions? I have domain expertise. I have some actual idea of how many thousands of authors, including some very good authors, are trying to write Harry Potter fanfiction, only one of whom can author the most-reviewed story. And I’ll ask again, did you bother to acquire any idea of how this community actually works? Can you name a single annual award that’s given out in the Harry Potter fanfiction community?
eliezer: Um... not off the top of my head.
pat: Have you asked any of the existing top Harry Potter fanfiction authors to review your proposed plot, or your proposed story ideas? Like Nonjon, author of A Black Comedy? Or Sarah1281 or JBern or any of the other authors who have created multiple works widely acknowledged as excellent?
eliezer: I must honestly confess, although I’ve read those authors and liked their stories, that thought never even crossed my mind as a possible action.
pat: So you haven’t consulted anyone who knows more about Harry Potter fandom than you do.
eliezer: Nope.
pat: You have not written any prior Harry Potter fanfiction—not even a short story.
eliezer: Correct.
pat: You have made no previous effort to engage with the existing community of people who read or write Harry Potter fanfiction, or learn about existing gatekeepers on which the success of your story will depend.
eliezer: I’ve read some of the top previous Harry Potter fan works, since I enjoyed reading them. That, of course, is why the story idea popped into my head in the first place.
pat: What would you think of somebody who’d read a few popular physics books and wanted to be the world’s greatest physicist?
stranger: (aside) It appears to me that since the “outside view” as usually invoked is really about status hierarchy, signs of disrespecting the existing hierarchy will tend to provoke stronger reactions, and disrespectful-seeming claims that you can outperform some benchmark will be treated as much larger factors predicting failure than respectful-seeming claims that you can outperform an equivalent benchmark. It seems that physics crackpots feel relevantly analogous here because crackpots aren’t just epistemically misguided—that would be tragicomic, but it wouldn’t evoke the same feelings of contempt or disgust. What distinguishes physics crackpots is that they’re epistemically misguided in ways that disrespect high-status people on an important hierarchy—physicists. This feels like a relevant reference class for understanding other apparent examples of disrespectfully claiming to be high-status, because the evoked feeling is similar even if the phenomena differ in other ways.
eliezer: If you want to be a great physicist, you have to find the true law of physics, which is already out there in the world and not known to you. This isn’t something you can realistically achieve without working alongside other physicists, because you need an extraordinarily specific key to fit into this extraordinarily specific lock. In contrast, there are many possible books that would succeed over all past Harry Potter fanfiction, and you don’t have to build a particle accelerator to figure out which one to write.
stranger: I notice that when you try to estimate the difficulty of becoming the greatest physicist ever, Eliezer, you try to figure out the difficulty of the corresponding cognitive problem. It doesn’t seem to occur to you to focus on the fame.
pat: Eliezer, you seem to be deliberately missing the point of what’s wrong with reading a few physics books and then trying to become the world’s greatest physicist. Don’t you see that this error has the same structure as your Harry Potter pipe dream, even if the mistake’s magnitude is greater? That a critic would say the same sort of things to them as I am saying to you? Yes, becoming the world’s greatest physicist is much more difficult. But you’re trying to do this lesser impossible task in your off-hours because you think it will be easy.
eliezer: In the success scenario the Masked Stranger described, I would invest more effort into later chapters because it would have proven to be worth it.
stranger: Hey, Pat? Did you know that Eliezer hasn’t actually read the original Harry Potter books four through six, just watched the movies? And even after the book starts to take off, he still won’t get around to reading them.
pat: (incoherent noises)
eliezer: Um... look, I read books one through three when they came out, and later I tried reading book four. The problem was, I’d already read so much Harry Potter fanfiction by then that I was used to thinking of the Potterverse as a place for grown-up stories, and this produced a state change in my brain, so when I tried to read Harry Potter and the Goblet of Fire it didn’t feel right. But I’ve read enough fanfiction based in the Potterverse that I know the universe very well. I can tell you the name of Fleur Delacour’s little sister. In fact, I’ve read an entire novel about Gabrielle Delacour. I just haven’t read all the original books.
stranger: And when that’s not good enough, Eliezer consults the Harry Potter Wikia to learn relevant facts from canon. So you see he has all the knowledge he thinks he needs.
pat: (more incoherent noises)
eliezer: ...why did you tell Pat that, Masked Stranger?
stranger: Because Pat will think it’s a tremendously relevant fact for predicting your failure. This illustrates a critical life lesson about the difference between making obeisances toward a field by reading works to demonstrate social respect, and trying to gather key knowledge from a field so you can advance it. The latter is necessary for success; the former is primarily important insofar as public relations with gatekeepers is important. I think that people who aren’t status-blind have a harder time telling the difference.
pat: It’s true that I feel a certain sense of indignation—of, indeed, J. K. Rowling and the best existing Harry Potter fanfiction writers being actively disrespected—when you tell me that Eliezer hasn’t read all of the canon books and that he thinks he’ll make up for it by consulting a wiki.
eliezer: Well, if I can try to repair some of the public relations damage: If I thought I could write children’s books as popular as J. K. Rowling’s originals, I would be doing that instead. J. K. Rowling is now a billionaire, plus she taught my little sister to enjoy reading. People who trivialize that as “writing children’s books” obviously have never tried to write anything themselves, let alone children’s books. Writing good children’s literature is hard—which is why Methods is going to be aimed at older readers. Contrary to the model you seem to be forming of me, I have a detailed model of my own limitations as well as my current capabilities, and I know that I am not currently a good enough author to write children’s books.
pat: I can imagine a state of affairs where I would estimate someone to have an excellent chance of writing the best Harry Potter fanfiction ever made, even after reading only the first three canon books—say, if Neil Gaiman tried it. (Though Neil Gaiman, I’m damned sure, just would read the original canon books.) Do you think you’re as good as Neil Gaiman?
eliezer: I don’t expect to ever have enough time to invest in writing to become as good as Neil Gaiman.
pat: I’ve read your Three Worlds Collide, which I think is your best story, and I’m aware that it was mentioned favorably by a Hugo-award-winning author, Peter Watts. But I don’t think Three Worlds Collide is on the literary level of, say, the fanfiction Always and Always Part 1: Backwards With Purpose. So what feats of writing have you already performed that make you think your project has a 10% chance of becoming the most-reviewed Harry Potter fanfiction in existence?
eliezer: What you’re currently doing is what I call “demanding to see my hero license.” Roughly, I’ve declared my intention to try to do something that’s in excess of what you think matches my current social standing, and you want me to show that I already have enough status to do it.
pat: Ad hominem; you haven’t answered my question. I don’t see how, on the knowledge you presently have and on the evidence already available, you can possibly justify giving yourself a 10% probability here. But let me make sure, first, that we’re using the same concepts. Is that “10%” supposed to be an actual well-calibrated probability?
eliezer: Yes, it is. If I interrogate my mind about betting odds, I think I’d take your money at 20:1—like, if you offered me $20 against $1 that the fanfiction wouldn’t succeed—and I’d start feeling nervous about betting the other way at $4 against $1, where you’ll pay out $4 if the fanfiction succeeds in exchange for $1 if it doesn’t. Splitting the difference at somewhere near the geometric mean, we could call that 9:1 odds.
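(The arithmetic behind “splitting the difference” can be sketched in a few lines of Python. This is an illustrative sketch only: the function and variable names are invented here, and the two odds are read as “N:1 against success,” which is an assumption about how the quoted bets are meant.)

```python
import math

def odds_against_to_probability(odds_against: float) -> float:
    """Convert 'odds_against : 1' against an event into the event's probability."""
    return 1.0 / (1.0 + odds_against)

def geometric_mean(a: float, b: float) -> float:
    """Geometric mean of two positive numbers."""
    return math.sqrt(a * b)

# Hypothetical labels for the two ends of the bracket described in the dialogue.
eager_odds = 20.0    # happy to back success when offered 20:1
nervous_odds = 4.0   # starts to feel nervous taking the other side at 4:1

midpoint_odds = geometric_mean(eager_odds, nervous_odds)
implied_probability = odds_against_to_probability(midpoint_odds)

print(f"about {midpoint_odds:.2f}:1 against, probability of success {implied_probability:.3f}")
```

(The geometric mean of 20 and 4 is a little under 9, which is where the rounded “9:1” and the roughly 10% figure come from.)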
pat: And do you think you’re well-calibrated? Like, things you assign 9:1 odds to should happen 9 out of 10 times?
eliezer: Yes, I think I could make 10 statements of this difficulty that I assign 90% probability, and be wrong on average about once. I haven’t tested my calibration as extensively as some people in the rationalist community, but the last time I took a CFAR calibration-testing sheet with 10 items on it and tried to put 90% credibility intervals on them, I got exactly one true value outside my interval. Achieving okay calibration, with a bit of study and a bit of practice, is not anywhere near as surprising as outside-view types make it out to be.
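(How much does one 10-item sheet show? A minimal sketch, treating each interval as an independent hit or miss, which is a simplifying assumption. The 70% comparison rate is a hypothetical chosen purely for illustration, not a figure from the dialogue.)

```python
from math import comb

def prob_exact_misses(n_items: int, n_misses: int, hit_rate: float) -> float:
    """Binomial probability of exactly n_misses misses among n_items intervals."""
    miss_rate = 1.0 - hit_rate
    return comb(n_items, n_misses) * miss_rate**n_misses * hit_rate**(n_items - n_misses)

# One miss out of ten, if the 90% intervals really do hit 90% of the time.
p_if_calibrated = prob_exact_misses(10, 1, 0.90)

# Same result for a hypothetical overconfident forecaster whose "90%" intervals hit only 70%.
p_if_overconfident = prob_exact_misses(10, 1, 0.70)

print(f"calibrated: {p_if_calibrated:.3f}, overconfident: {p_if_overconfident:.3f}")
print(f"likelihood ratio: {p_if_calibrated / p_if_overconfident:.1f}")
```

(Under these assumptions a single sheet favors “calibrated” over “overconfident” by only a factor of about three, so it counts as modest evidence of okay calibration rather than proof.)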
stranger: (aside) Eliezer-2010 doesn’t use PredictionBook as often as Gwern Branwen, doesn’t play calibration party games as often as Anna Salamon and Carl Shulman, and didn’t join Philip Tetlock’s study on superforecasting. But I did make bets whenever I had the opportunity, and still do; and I try to set numeric odds whenever I feel uncertain and know I’ll find out the true value shortly.
I recently saw a cryptic set of statements on my refrigerator’s whiteboard about a “boiler” and various strange numbers and diagrams, which greatly confused me for five seconds before I hypothesized that they were notes about Brienne’s ongoing progress through the game Myst. Since I felt uncertain, but could find out the truth soon, I spent thirty seconds trying to tweak my exact probability estimate of these being notes for Brienne’s game. I started with a 90% “first pass” probability that they were Myst notes, which felt obviously overconfident, so I adjusted that down to 80% or 4:1. Then I thought about how there might be unforeseen other compact explanations for the cryptic words on the whiteboard and adjusted down to 3:1. I then asked Brienne, and learned that it was in fact about her Myst game. I then did a thirty-second “update meditation” on whether perhaps it wasn’t all that probable that there would be some other compact explanation for the cryptic writings; so maybe once the writings seemed explained away, I should have been less worried about unforeseen compact alternatives.
But I didn’t meditate on it too long, because it was just one sample out of my life, and the point of experiences like that is that you have a lot of them, and update a little each time, and eventually the experience accumulates. Meditating on it as much as I’m currently doing by writing about it here would not be good practice in general. (Those of you who have a basic acquaintance with neural networks and the delta rule should recognize what I’m trying to get my brain to do here.) I feel guilty about not betting more systematically, but given my limited supply of spoons, this kind of informal and opportunistic but regular practice is about all that I’m likely to actually do, as opposed to feel guilty about not doing.
As I do my editing pass on this document, I more recently assigned 5:1 odds against two characters on House of Cards having sex, who did in fact have sex; and that provides a bigger poke of adjustment against overconfidence. (According to the delta rule, this was a bigger error.)
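The delta-rule comparison between the two episodes can be sketched in a few lines of Python. This is a simplification under stated assumptions: the function names are hypothetical, the learning rate of 0.1 is arbitrary, and the "adjustment" is treated as a single scalar nudge rather than anything resembling a real neural network.

```python
def probability_from_odds(in_favor: float, against: float) -> float:
    """Probability of an event given odds of in_favor:against."""
    return in_favor / (in_favor + against)

def delta_rule_step(error: float, learning_rate: float = 0.1) -> float:
    """Size of the adjustment: proportional to the prediction error."""
    return learning_rate * error

# Myst notes: 3:1 in favor, and the event happened (outcome = 1).
myst_p = probability_from_odds(3, 1)
myst_error = 1.0 - myst_p

# House of Cards: 5:1 against, and the event happened (outcome = 1).
cards_p = probability_from_odds(1, 5)
cards_error = 1.0 - cards_p

print(f"Myst:           p = {myst_p:.3f}, error = {myst_error:.3f}, "
      f"nudge = {delta_rule_step(myst_error):.3f}")
print(f"House of Cards: p = {cards_p:.3f}, error = {cards_error:.3f}, "
      f"nudge = {delta_rule_step(cards_error):.3f}")
```

The second error is more than three times the first, so a rule that updates in proportion to the error gives it a correspondingly bigger poke.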
pat: But there are studies showing that even after being warned about overconfidence, reading a study about overconfidence, and being allowed to practice a bit, overconfidence is reduced but not eliminated—right?
eliezer: On average across all subjects, overconfidence is reduced but not eliminated. That doesn’t mean that in every individual subject, overconfidence is reduced but not eliminated.
pat: What makes you think you can do better than average?
stranger: ...
eliezer: What makes me think I could do better than average is that I practiced much more than those subjects, and I don’t think the level of effort put in by the average subject, even a subject who’s warned about overconfidence and given one practice session, is the limit of human possibility. And what makes me think I actually succeeded is that I checked. It’s not like there’s this “reference class” full of overconfident people who hallucinate practicing their calibration and hallucinate discovering that their credibility intervals have started being well-calibrated.
stranger: I offer some relevant information that I learned from Sarah Constantin’s “Do Rational People Exist?”: Stanovich and West (1997) found that 88% of study participants were systematically overconfident, which means that the researchers couldn’t demonstrate overconfidence in the remaining 12%. And this isn’t too surprising; Stanovich and West (1998) note a number of other tests where around 10% of undergraduates fail to exhibit this or that bias.
eliezer: Right. So the question is whether I can, with some practice, make myself as non-overconfident as the top 10% of college undergrads. This… does not strike me as a particularly harrowing challenge. It does require effort. I have to consciously work to expand my credibility intervals past my first thought, and I expect that college students who outperform have to do the same. The potential to do better buys little of itself; you have to actually put in the effort. But when I think I’ve expanded my intervals enough, I stop.
ii. Success factors and belief sharing
pat: So you actually think that you’re well-calibrated in assigning 9:1 odds for Methods failing versus succeeding, to the extreme levels assigned by the Masked Stranger. Are you going to argue that I ought to widen my confidence intervals for how much success Harry Potter and the Methods of Rationality might enjoy, in order to avoid being overconfident myself?
eliezer: No. That feels equivalent to arguing that you shouldn’t assign a 0.1% probability to Methods succeeding because 1,000:1 odds are too extreme. I was careful not to put it that way, because that isn’t a valid argument form. That’s the kind of thinking which leads to papers like Ord, Hillerbrand, and Sandberg’s “Probing the Improbable,” which I think are wrong. In general, if there are 500,000 fan works, only one of which can have the most reviews, then you can’t pick out one of them at random and say that 500,000:1 is too extreme.
pat: I’m glad you agree with this obvious point. And I'm not stupid; I recognize that your stories are better than average. 90% of Harry Potter fanfiction is crap by Sturgeon’s Law, and 90% of the remaining 10% is going to be uninspired. That leaves maybe 5,000 fan works that you do need to seriously compete with. And I’ll even say that if you’re trying reasonably hard, you can end up in the top 10% of that pool. That leaves a 1-in-500 chance of your being the best Harry Potter author on fanfiction.net. We then need to factor in the other Harry Potter fanfiction sites, which have fewer works but much higher average quality. Let’s say it works out to a 1-in-1,000 chance of yours being the best story ever, which I think is actually very generous of me, given that in a lot of ways you seem ridiculously unprepared for the task—um, are you all right, Masked Stranger?
stranger: Excuse me, please. I’m just distracted by the thought of a world where I could go on fanfiction.net and find 1,000 other stories as good as Harry Potter and the Methods of Rationality. I’m thinking of that world and trying not to cry. It’s not that I can’t imagine a world in which your modest-sounding Fermi estimate works correctly—it’s just that the world you’re describing looks so very different from this one.
eliezer: Pat, I can see where you’re coming from, and I’m honestly not sure what I can say to you about it, in advance of being able to show you the book.
pat: What about what I tried to say to you? Does it influence you at all? The method I used was rough, but I thought it was a very reasonable approach to getting a Fermi estimate, and if you disagree with the conclusion, I would like to know what further factors make your own Fermi estimate work out to 10%.
stranger: You underestimate the gap between how you two think. It wouldn’t occur to Eliezer to even consider any one of the factors you named, while he was making his probability estimate of 10%.
eliezer: I have to admit that that’s true.
pat: Then what do you think are the most important factors in whether you’ll succeed?
eliezer: Hm. Good question. I’d say... whether I can maintain my writing enthusiasm, whether I can write fast enough, whether I can produce a story that’s really as good as I seem to be envisioning, whether I’ll learn as I go and do better than I currently envision. Plus a large amount of uncertainty in how people will actually react to the work I have in my head if I can actually write it.
pat: Okay, so that’s five key factors. Let’s estimate probabilities for each one. Suppose we grant that there’s an 80% chance of your maintaining enthusiasm, a 50% chance that you’ll write fast enough—though you’ve had trouble with that before; it took you fully a year to produce Three Worlds Collide, if I recall correctly. A 25% probability that you can successfully write down this incredible story that seems to be in your mind—I think this part almost always fails for authors, and is almost certainly the part that will fail for you, but we’ll give it a one-quarter probability anyway, to be generous and steelman the whole argument. Then a 50% probability that you’ll learn fast enough to not be torpedoed by the deficits you already know you have. Now even without saying anything about audience reactions (really, you’re going to try to market cognitive science and formal epistemology to Harry Potter fans?), and even though I’m being very generous here, multiplying these probabilities together already gets us to the 5% level, which is less than the 10% you estimated—
stranger: Wrong.
pat: … Wrong? What do you mean?
stranger: Let’s consider the factors that might be involved in your above reasoning not being wrong. Let us first estimate the probability that any given English-language sentence will turn out to be true. Then, we have to consider the probability that a given argument supporting some conclusion will turn out to be free of fatal biases, the probability that someone who calls an argument “wrong” will be mistaken—
pat: Eliezer, if you disagree with my conclusions, then what’s wrong with my probabilities?
eliezer: Well, for a start: Whether I can maintain my writing speed is not conditionally independent of whether I maintain my enthusiasm. The audience reaction is not conditionally independent of whether I maintain my writing speed. Whether I’m learning things is not conditionally independent of whether I maintain my enthusiasm. Your attempt to multiply all those numbers together was gibberish as probability theory.
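To see why the independence complaint matters numerically, consider a small Python sketch. Pat's four numbers (80%, 50%, 25%, 50%) come from the dialogue; the two conditional probabilities are hypothetical values chosen only so that the marginal chance of writing fast enough still comes out to Pat's 50%.

```python
import math

# Pat's stated factors, multiplied as if they were independent.
pat_factors = [0.80, 0.50, 0.25, 0.50]
pat_product = math.prod(pat_factors)
print(f"Pat's product: {pat_product:.3f}")

# Hypothetical dependence between enthusiasm and writing speed.
p_enthusiasm = 0.80
p_speed_given_enthusiasm = 0.60     # assumed for illustration
p_speed_given_no_enthusiasm = 0.10  # assumed for illustration

# The marginal probability of speed still matches Pat's 50%.
p_speed = (p_enthusiasm * p_speed_given_enthusiasm
           + (1 - p_enthusiasm) * p_speed_given_no_enthusiasm)

naive_joint = p_enthusiasm * p_speed                     # treats them as independent
actual_joint = p_enthusiasm * p_speed_given_enthusiasm   # chain rule

print(f"marginal P(speed): {p_speed:.2f}")
print(f"naive joint:  {naive_joint:.2f}")
print(f"actual joint: {actual_joint:.2f}")
```

With positively correlated factors, the product of the marginals understates the joint probability, even though every individual number matches what was elicited.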
pat: Okay, let’s ask about the probability that you maintain writing speed, given that you maintain enthusiasm—
eliezer: Do you think that your numbers would have actually been that different, if that had been the question you’d initially asked? I’m pretty sure that if you’d thought to phrase the question as “the probability given that...” and hadn’t first done it the other way, you would have elicited exactly the same probabilities from yourself, driven by the same balance of mental forces—picking something low that sounds reasonable, or something like that. And the problem of conditional dependence is far from the only reason I think “estimate these probabilities, which I shall multiply together” is just a rhetorical trick.
pat: A rhetorical trick?
eliezer: By picking the right set of factors to “elicit,” someone can easily make people’s “answers” come out as low as desired. As an example, see van Boven and Epley’s “The Unpacking Effect in Evaluative Judgments.” The problem here is that people... how can I compactly phrase this... people tend to assign median-tending probabilities to any category you ask them about, so you can very strongly manipulate their probability distributions by picking the categories for which you “elicit” probabilities. Like, if you ask car mechanics about the possible causes of a car not starting—experienced car mechanics, who see the real frequencies on a daily basis!—and you ask them to assign a probability to “electrical system failures” versus asking separately for “dead battery,” “alternator problems,” and “spark plugs,” the unpacked categories get collectively assigned much greater total probability than the packed category.
pat: But perhaps, when I’m unpacking things that can potentially go wrong, I’m just compensating for the planning fallacy and how people usually aren’t pessimistic enough—
eliezer: Above all, the problem with your reasoning is that the stated outcome does not need to be a perfect conjunction of those factors. Not everything on your list has to go right simultaneously for the whole process to work. You have omitted other disjunctive pathways to the same end. In your universe, nobody ever tries harder or repairs something after it goes wrong! I have never yet seen an informal conjunctive breakdown of an allegedly low probability in which the final conclusion actually required every one of the premises. That’s why I’m always careful to avoid the “I shall helpfully break down this proposition into a big conjunction and ask you to assign each term a probability” trick.
Its only real use, at least in my experience, is that it’s a way to get people to feel like they’ve “assigned” probabilities while you manipulate the setup to make the conclusion have whatever probability you like—it doesn’t have any role to play in honest conversation. Out of all the times I’ve seen it used, to support conclusions I endorse as well as ones I reject, I’ve never once seen it actually work as a way to better discover truth. I think it’s bad epistemology that sticks around because it sounds sort of reasonable if you don’t look too closely.
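The point about omitted repair pathways can be illustrated with a toy calculation in Python. The first-try numbers are Pat's; the 50% chance of recovering from any single failure is a made-up assumption. The sketch still multiplies terms as if they were independent, so it shows only the direction of the effect and is not a recommended way to estimate anything.

```python
import math

# Pat's first-try probabilities for four of the factors.
first_try = [0.80, 0.50, 0.25, 0.50]

# Hypothetical: each failure has a 50% chance of being noticed and repaired.
repair_chance = 0.5

# Strict conjunction: every factor must go right the first time.
strict_conjunction = math.prod(first_try)

# Allowing one repair attempt per factor: succeed outright, or fail and then fix it.
with_repair = math.prod(p + (1 - p) * repair_chance for p in first_try)

print(f"strict conjunction:  {strict_conjunction:.3f}")
print(f"with repair pathway: {with_repair:.3f}")
```

Adding a single disjunctive route per factor moves the toy answer from 5% to roughly 32%, which suggests how sensitive the "conclusion" is to which pathways the breakdown happens to mention.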
pat: I was working with the factors you picked out as critical. Which specific parts of my estimate do you disagree with?
stranger: (aside) The multiple-stage fallacy is an amazing trick, by the way. You can ask people to think of key factors themselves and still manipulate them really easily into giving answers that imply a low final answer, because so long as people go on listing things and assigning them probabilities, the product is bound to keep getting lower. Once we realize that by continually multiplying out probabilities the product keeps getting lower, we have to apply some compensating factor internally so as to go on discriminating truth from falsehood.
You have effectively decided on the answer to most real-world questions as “no, a priori” by the time you get up to four factors, let alone ten. It may be wise to list out many possible failure scenarios and decide in advance how to handle them—that’s Murphyjitsu—but if you start assigning “the probability that X will go wrong and not be handled, conditional on everything previous on the list having not gone wrong or having been successfully handled,” then you’d better be willing to assign conditional probabilities near 1 for the kinds of projects that succeed sometimes—projects like Methods. Otherwise you’re ruling out their success a priori, and the “elicitation” process is a sham.
Frankly, I don’t think the underlying methodology is worth repairing. I don’t think it’s worth bothering to try to make a compensating adjustment toward higher probabilities. We just shouldn’t try to do “conjunctive breakdowns” of a success probability where we make up lots and lots of failure factors that all get informal probability assignments. I don’t think you can get good estimates that way even if you try to compensate for the predictable bias.
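A short Python sketch shows how quickly the product falls as factors are added, and how high each conditional probability would have to be for a 10% final answer to remain possible. The function names are hypothetical, and the uniform per-factor value of 0.8 is chosen only because it sounds generous.

```python
def product_after(n_factors: int, per_factor_p: float) -> float:
    """Final probability if every one of n factors gets the same probability."""
    return per_factor_p ** n_factors

def required_per_factor(n_factors: int, target: float) -> float:
    """Per-factor probability needed for the product to equal the target."""
    return target ** (1.0 / n_factors)

for n in (1, 2, 4, 10):
    print(f"{n:2d} factors at 0.8 each -> product {product_after(n, 0.8):.3f}; "
          f"need {required_per_factor(n, 0.10):.3f} each to reach 10%")
```

At four factors, a "generous" 0.8 per factor already caps the answer near 41%, and at ten factors it is down to about 11%. Keeping a 10% outcome reachable with ten factors requires each conditional probability to be about 0.79 or higher.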
eliezer: I did list my own key factors, and I do feel doubt about whether they’ll work out. If I were really confident in them, I’d be assigning a higher probability than 10%. But besides having conditional dependencies, my factors also have disjunctive as well as conjunctive character; they don’t all need to go right and stay right simultaneously. I could get far enough into Methods to acquire an audience, suddenly lose my writing speed, and Methods could still end up ultimately having a large impact.
pat: So how do you manipulate those factors to arrive at an estimate of 10% probability of extreme success?
eliezer: I don’t. That’s not how I got my estimate. I found two brackets, 20:1 and 4:1, that I couldn’t nudge further without feeling nervous about being overconfident in one direction or the other. In other words, the same way I generated my set of ten credibility intervals for CFAR’s calibration test. Then I picked something in the logarithmic middle.
pat: So you didn’t even try to list out all the factors and then multiply them together?
eliezer: No.
pat: Then where the heck does your 10% figure ultimately come from? Saying that you got two other cryptic numbers, 20:1 and 4:1, and picked something in the geometric middle, doesn’t really answer the fundamental question.
stranger: I believe the technical term for the methodology is “pulling numbers out of your ass.” It’s important to practice calibrating your ass numbers on cases where you’ll learn the correct answer shortly afterward. It’s also important that you learn the limits of ass numbers, and don’t make unrealistic demands on them by assigning multiple ass numbers to complicated conditional events.
eliezer: I’d say I reached the estimate… by thinking about the object-level problem? By using my domain knowledge? By having already thought a lot about the problem so as to load many relevant aspects into my mind, then consulting my mind’s native-format probability judgment—with some prior practice at betting having already taught me a little about how to translate those native representations of uncertainty into 9:1 betting odds. I’m not sure what additional information you want here. If there’s a way to produce genuinely, demonstrably superior judgments using some kind of break-it-down procedure, I haven’t read about it in the literature and I haven’t practiced using it yet. If you show me that you can produce 9-out-of-10 correct 90% credible intervals, and your intervals are narrower than my intervals, and you got them using a break-it-down procedure, I’m happy to hear about it.
pat: So basically your 10% probability comes from inaccessible intuition.
eliezer: In this case? Yeah, pretty much. There’s just too little I can say to you about why Methods might work, in advance of being able to show you what I have in mind.
pat: If the reasoning inside your head is valid, why can’t it be explained to me?
eliezer: Because I have private information, frankly. I know the book I’m trying to create.
pat: Eliezer, I think one of the key insights you’re ignoring here is that it should be a clue to you that you think you have incommunicable reasons for believing your Methods of Rationality project can succeed. Isn’t being unable to convince other people of their prospects of success just the sort of experience that crackpots have when they set out to invent bad physics theories? Isn’t this incommunicable intuition just the sort of justification that they would try to give?
eliezer: But the method you’re using—the method you’re calling “reference class forecasting”—is too demanding to actually detect whether someone will end up writing the world’s most reviewed Harry Potter fanfiction, whether that’s me or someone else. The fact that a modest critic can’t be persuaded isn’t Bayesian discrimination between things that will succeed and things that will fail; it isn’t evidence.
pat: On the contrary, I would think it very reasonable if Nonjon told me that he intended to write the most-reviewed Harry Potter fanfiction. Nonjon’s A Black Comedy is widely acknowledged as one of the best stories in the genre, Nonjon is well-placed in influential reviewing and recommending communities—Nonjon might not be certain to write the most reviewed story ever, but he has legitimate cause to think that he is one of the top contenders for writing it.
stranger: It’s interesting how your estimates of success probabilities can be well summarized by a single quantity that correlates very well with how respectable a person is within a subcommunity.
pat: Additionally, even if my demands were unsatisfiable, that wouldn’t necessarily imply a hole in my reasoning. Nobody who buys a lottery ticket can possibly satisfy me that they have good reason to believe they’ll win, even the person who does win. But that doesn’t mean I’m wrong in assigning a low success probability to people who buy lottery tickets.
Nonjon may legitimately have a 1-in-10 lottery ticket. Neil Gaiman might have 2-in-3. Yours, as I’ve said, is probably more like 1-in-1,000, and it’s only that high owing to your having already demonstrated some good writing abilities. I’m not even penalizing you for the fact that your plan of offering explicitly rational characters to the Harry Potter fandom sounds very unlike existing top stories. I might be unduly influenced by the fact that I like your previous writing. But your claim to have incommunicable advance knowledge that your lottery ticket will do better than this by a factor of 100 seems very suspicious to me. Valid evidence should be communicable between people.
stranger: “I believe myself to be writing a book on economic theory which will largely revolutionize—not I suppose, at once but in the course of the next ten years—the way the world thinks about its economic problems. I can’t expect you, or anyone else, to believe this at the present stage. But for myself I don’t merely hope what I say,—in my own mind, I’m quite sure.” Lottery winner John Maynard Keynes to George Bernard Shaw, while writing The General Theory of Employment, Interest and Money.
eliezer: Come to think of it, if I do succeed with Methods, Pat, you yourself could end up in an incommunicable epistemic state relative to someone who only heard about me later through my story. Someone like that might suspect that I’m not a purely random lottery ticket winner, but they won’t have as much evidence to that effect as you. It’s a pretty interesting and fundamental epistemological issue.
pat: I disagree. If you have valid introspective evidence, then talk to me about your state of mind. On my view, you shouldn’t end up in a situation where you update differently on what your evidence “feels like to you” than what your evidence “sounds like to other people”; both you and other people should just do the second update.
stranger: No, in this scenario, in the presence of other suspected biases, two human beings really can end up in incommunicable epistemic states. You would know that “Eliezer wins” had genuinely been singled out in advance as a distinguished outcome, but the second person would have to assess this supposedly distinguished outcome with the benefit of hindsight, and they may legitimately never trust their hindsight enough to end up in the same mental state as you.
You’re right, Pat, that completely unbiased agents who lack truly foundational disagreements on priors should never end up in this situation. But humans can end up in it very easily, it seems to me. Advance predictions have special authority in science for a reason: hindsight bias makes it hard to ever reach the same confidence in a prediction that you only hear about after the fact.
pat: Are you really suggesting that the prevalence of cognitive bias means you should be more confident that your own reasoning is correct? My epistemology seems to be much more straightforward than yours on these matters. Applying the “valid evidence should be communicable” rule to this case: A hypothetical person who saw Eliezer Yudkowsky write the Less Wrong Sequences, heard him mention that he assigned a non-tiny probability to succeeding in his Methods ambitions, and then saw him succeed at Methods should just realize what an external observer would say to them about that. And what they’d say is: you just happened to be the lucky or unlucky relatives of a lottery ticket buyer who claimed in advance to have psychic powers, and then happened to win.
eliezer: This sounds a lot like a difficulty I once sketched out for the “method of imaginary updates.” Human beings aren’t logically omniscient, so we can’t be sure we’ve reasoned correctly about prior odds. In advance of seeing Methods succeed, I can see why you’d say that, on your worldview, if it did happen then it would just be a 1,000:1 lottery ticket winning. But if that actually happened, then instead of saying, “Oh my gosh, a 1,000:1 event just occurred,” you ought to consider instead that the method you used to assign prior probabilities was flawed. This is not true about a lottery ticket, because we’re extremely sure about how to assign prior probabilities in that case—and by the same token, in real life neither of us will actually see our friends winning the lottery.
pat: I agree that if it actually happens, I would reconsider your previous arguments rather than insisting that I was correct about prior odds. I’m happy to concede this point because I am very, very confident that it won’t actually happen. The argument against your success in Harry Potter fanfiction seems to me about as strong as any argument the outside-view perspective might make.
stranger: Oh, we aren’t disputing that.
pat: You aren’t?
stranger: That’s the whole point, from my perspective. If modest epistemology sounds persuasive to you, then it’s trivial to invent a crushing argument against any project that involves doing something important that hasn’t been done in the past. Any project that’s trying to exceed any variety of civilizational inadequacy is going to be ruled out.
pat: Look. You cannot just waltz into a field and become its leading figure on your first try. Modest epistemology is just right about that. You are not supposed to be able to succeed when the odds against you are like those I have described. Maybe out of a million contenders, someone will succeed by luck when the modest would have predicted their failure, but if we’re batting 999,999 out of 1,000,000 I say we’re doing pretty well. Unless, of course, Eliezer would claim that the project of writing this new Harry Potter fanfiction is so important that a 0.0001% chance of success is still worth it—
eliezer: I never say that. Ever. If I ever say that you can just shoot me.
pat: Then why are you not responding to the very clear, very standard, very obvious reasons I have laid out to think that you cannot do this? I mean, seriously, what is going through your head right now?
eliezer: A helpless feeling of being unable to communicate.
stranger: Grim amusement.
pat: Then I’m sorry, Mr. Eliezer Yudkowsky, but it seems to me that you are being irrational. You aren’t even trying to hide it very hard.
eliezer: (sighing) I can imagine why it would look that way to you. I know how to communicate some of the thought patterns and styles that I think have served me well, that I think generate good predictions and policies. The other patterns leave me with this helpless feeling of knowing but being unable to speak. This conversation has entered a dependency on the part that I know but don’t know how to say.
pat: Why should I believe that?
eliezer: If you think the part I did figure out how to say was impressive enough. That was hidden purpose #7 of the Less Wrong Sequences—to provide an earnest-token of all the techniques I couldn’t show. All I can tell you is that everything you’re so busy worrying about is not the correct thing for me to be thinking about. That your entire approach to the problem is wrong. It is not just that your arguments are wrong. It is that they are about the wrong subject matter.
pat: Then what’s the right subject matter?
eliezer: That’s what I’m having trouble saying. I can say that you ought to discard all thoughts from your mind about competing with others. The others who’ve come before you are like probes, flashes of sound, pingbacks that give you an incomplete sonar of your problem’s difficulty. Sometimes you can swim past the parts of the problem that tangled up other people and enter a new part of the ocean. Which doesn’t actually mean you’ll succeed; all it means is that you’ll have very little information about which parts are difficult. There often isn’t actually any need to think at all about the intrinsic qualities of your competition—like how smart or motivated or well-paid they are—because their work is laid out in front of you and you can just look at the quality of the work.
pat: Like somebody who predicts hyperinflation, saying all the while that they’re free to disregard conventional economists because of how those idiot economists think you can triple the money supply without getting inflation?
eliezer: I don’t really know what goes through someone else’s mind when that happens to them. But I don’t think that telling them to be more modest is a fix. Telling somebody to shut up and respect academics is not a generally valid line of argumentation because it doesn’t distinguish mainstream economics (which has relatively high scholarly standards) from mainstream nutrition science (which has relatively low scholarly standards). I’m not sure there is any robust way out except by understanding economics for yourself, and to the extent that’s true, I ought to advise our hypothetical ill-informed contrarian to read a lot of economics blogs and try to follow the arguments, or better yet read an economics textbook. I don’t think that people sitting around and anxiously questioning themselves and wondering whether they’re too audacious is a route out of that particular hole—let alone the hole on the other side of the fence.
pat: So your meta-level epistemology is to remain as ultimately inaccessible to me as your object-level estimates.
eliezer: I can understand why you’re skeptical.
pat: I somehow doubt that you could pass an Ideological Turing Test on my point of view.
stranger: (smiling) Oh, I think I’d do pretty well at your ITT.
eliezer: Pat, I understand where your estimates are coming from, and I’m sure that your advice is truly meant to be helpful to me. But I also see that advice as an expression of a kind of anxiety which is not at all like the things I need to actually think about in order to produce good fiction. It’s a wasted motion, a thought which predictably will not have helped in retrospect if I succeed. How good I am relative to other people is just not something I should spend lots of time obsessing about in order to make Methods be what I want it to be. So my thoughts just don’t go there.
pat: This notion, “that thought will predictably not have helped in retrospect if I succeed,” seems very strange to me. It helps precisely because we can avoid wasting our effort on projects which are unlikely to succeed.
stranger: Sounds very reasonable. All I can say in response is: try doing it my way for a day, and see what happens. No thoughts that predictably won’t have been helpful in retrospect, in the case that you succeed at whatever you’re currently trying to do. You might learn something from the experience.
eliezer: The thing is, Pat... even answering your objections and defending myself from your variety of criticism trains what look to me like unhealthy habits of thought. You’re relentlessly focused on me and my psychology, and if I engage with your arguments and try to defend myself, I have to focus on myself instead of my book. Which gives me that much less attention to spend on sketching out what Professor Quirrell will do in his first Defense lesson. Worse, I have to defend my decisions, which can make them harder to change later.
stranger: Consider how much more difficult it will be for Eliezer to swerve and drop his other project, The Art of Rationality, if it fails after he has had a number of (real or internal) conversations like this—conversations where he has to defend all the reasons why it’s okay for him to think that he might write a nonfiction bestseller about rationality. This is why it’s important to be able to casually invoke civilizational inadequacy. It’s important that people be allowed to try ambitious things without feeling like they need to make a great production out of defending their hero license.
eliezer: Right. And... the mental motions involved in worrying what a critic might think and trying to come up with defenses or concessions are different from the mental motions involved in being curious about some question, trying to learn the answer, and coming up with tests; and they’re different from how I think when I’m working on a problem in the world. The thing I should be thinking about is just the work itself.
pat: If you were just trying to write okay Harry Potter fanfiction for fun, I might agree with you. But you say you can produce the best fanfiction. That’s a whole different ball game—
eliezer: No! The perspective I’m trying to show you, the way it works in the inside of my head, is that trying to write good fanfiction, and the best fanfiction, are not different ball games. There’s an object level, and you try to optimize it. You have an estimate of how well you can optimize it. That’s all there ever is.
iii. Social heuristics and problem importance, tractability, and neglectedness
pat: A funny thought has just occurred to me. That thing where you’re trying to work out the theory of Friendly AI—
eliezer: Let me guess. You don’t think I can do that either.
pat: Well, I don’t think you can save the world, of course! (laughs) This isn’t a science fiction book. But I do see how you can reasonably hope to make an important contribution to the theory of Friendly AI that ends up being useful to whatever group ends up developing general AI. What’s interesting to note here is that the scenario the Masked Stranger described, the class of successes you assigned 10% aggregate probability, is actually harder to achieve than that.
stranger: (smiling) It really, really, really isn’t.
I'll mention as an aside that talk of “Friendly” AI has been going out of style where I’m from. We’ve started talking instead in terms of “aligning smarter-than-human AI with operators’ goals,” mostly because “AI alignment” smacks less of anthropomorphism than “friendliness.”
eliezer: Alignment? Okay, I can work with that. But Pat, you’ve said something I didn’t expect you to say and gone outside my current vision of your Ideological Turing Test. Please continue.
pat: Okay. Contrary to what you think, my words are not fully general counterarguments that I launch against just anything I intuitively dislike. They are based on specific, visible, third-party-assessable factors that make assertions believable or unbelievable. If we leave aside inaccessible intuitions and just look at third-party-visible factors, then it is very clear that there’s a huge community of writers who are explicitly trying to create Harry Potter fanfiction. This community is far larger and has far more activity—by every objective, third-party metric—than the community working on issues related to alignment or friendliness or whatever. Being the best writer in a much larger community is much more improbable than your making a significant contribution to AI alignment when almost nobody else is working on that problem.
eliezer: The relative size of existing communities that you’ve just described is not a fact that I regard as important for assessing the relative difficulty of “making a key contribution to AI alignment” versus “getting Methods to the level described by the Masked Stranger.” The number of competing fanfiction authors would be informative to me if I hadn’t already checked out the Harry Potter fan works with the best reputations. If I can see how strong the competition is with my own eyes, then that screens off information about the size of the community from my perspective.
pat: But surely the size of the community should give you some pause regarding whether you should trust your felt intuition that you could write something better than the product of so many other authors.
stranger: See, that meta-reasoning right there? That’s the part I think is going to completely compromise how people think about the world if they try to reason that way.
eliezer: Would you ask a juggler, in the middle of juggling, to suddenly start worrying about whether she’s in a reference class of people who merely think that they’re good at catching balls? It’s all just... wasted motion.
stranger: Social anxiety and overactive scrupulosity.
eliezer: Not what brains look like when they’re thinking productively.
pat: You’ve been claiming that the outside view is a fully general counterargument against any claim that someone with relatively low status will do anything important. I’m explaining to you why the method of trusting externally visible metrics and things that third parties can be convinced of says that you might make important contributions to AI alignment where nobody else is trying, but that you won’t write the most reviewed Harry Potter fanfiction where thousands of other authors are competing with you.
(A wandering bystander suddenly steps up to the group, interjecting.)
bystander: Okay, no. I just can't hold my tongue anymore.
pat: Huh? Who are you?
bystander: I am the true voice of modesty and the outside view!
I’ve been overhearing your conversation, and I’ve got to say—there’s no way it’s easier to make an important contribution to AI alignment than it is to write popular fanfiction.
eliezer: … That’s true enough, but who…?
bystander: The name’s Maude Stevens.
pat: Well, it's nice to make your acquaintance, Maude. I am always eager to hear about my mistakes, even from people with suspiciously relevant background information who randomly walk up to me in parks. What is my error on this occasion?
maude: All three of you have been taking for granted that if people don’t talk about “alignment” or “friendliness,” then their work isn’t relevant. But those are just words. When we take into account machine ethicists working on real-world trolley dilemmas, economists working on technological unemployment, computer scientists working on Asimovian agents, and so on, the field of competitors all trying to make progress on these issues becomes much, much larger.
pat: What? Is that true, Eliezer?
eliezer: Not to my knowledge—unless Maude is here from the NSA to tell me about some very interesting behind-closed-doors research. The examples Maude listed aren't addressing the technical issues I've been calling “friendliness.” Progress on those problems doesn’t help you with specifying preferences that you can reasonably expect to produce good outcomes even when the system is smarter than you and searching a much wider space of strategies than you can consider or check yourself. Or designing systems that are stable under self-modification, so that good properties of a seed AI are preserved as the agent gets smarter.
maude: And your claim is that no one else in the world is smart enough to notice any of this?
eliezer: No, that's not what I'm saying. Concerns like “how do we specify correct goals for par-human AI?” and “what happens when AI gets smart enough to automate AI research itself?” have been around for a long time, sort of just hanging out and not visibly shifting research priorities. So it's not that the community of people who have ever thought about superintelligence is small; and it's not that there are no ongoing lines of work on robustness, transparency, or security in narrow AI systems that will incidentally make it easier to align smarter-than-human AI. But the community of people who go into work every day and make decisions about what technical problems to tackle based on any extended thinking related to superintelligent AI is very small.
maude: What I’m saying is that you’re jumping ahead and trying to solve the far end of the problem before the field is ready to focus efforts there. The current work may not all bear directly on superintelligence, but we should expect all the significant progress on AI alignment to be produced by the intellectual heirs of the people presently working on topics like drone warfare and unemployment.
pat: (cautiously) I mean, if what Eliezer says is true—and I do think that Eliezer is honest, if often, by my standards, slightly crazy—then the state of the field in 2010 is just like it looks naively. There aren’t many people working on topics related to smarter-than-human AI, and Eliezer’s group and the Oxford Future of Humanity Institute are the only ones with a reasonable claim to be working on AI alignment. If Eliezer says that the problems of crafting a smarter-than-human AI to not kill everyone are not of a type with current machine ethics work, then I can buy that as plausible, though I’d want to hear others’ views on the issue before reaching a firm conclusion.
maude: But Eliezer’s field of competition is far wider than just the people writing ethics papers. Anyone working in machine learning, or indeed in any branch of computer science, might end up contributing to AI alignment.
eliezer: Um, that would certainly be great news to hear. The win state here is just “the problem gets solved”—
pat: Wait a second. I think you’re leaving the realm of what’s third-party objectively verifiable, Maude. That’s like saying that Eliezer has to compete with Stephen King because Stephen King could in principle decide to start writing Harry Potter fanfiction. If all these other people in AI are not working on the particular problems Eliezer is working on, whereas the broad community of Harry Potter fanfiction writers is competing directly with Eliezer on fiction-writing, then any reasonable third party should agree that the outside view counterargument applies very strongly to the second case, and much more weakly (if at all) to the first.
maude: So now fanfiction is supposed to be harder than saving the world? Seriously? Just no.
eliezer: Pat, while I disagree with Maude’s arguments, she does have the advantage of rationalizing a true conclusion rather than a false conclusion. AI alignment is harder.
pat: I’m not expecting you to solve the whole thing. But making a significant contribution to a sufficiently specialized corner of academia that very few other people are explicitly working on should be easier than becoming the single most successful figure in a field that lots of other people are working in.
maude: This is ridiculous. Fanfiction writers are simply not the same kind of competition as machine learning experts and professors at leading universities, any of whom could end up making far more impressive contributions to the cutting edge in AGI research.
eliezer: Um, advancing AGI research might be impressive, but unless it's AGI alignment it's—
pat: Have you ever tried to write fiction yourself? Try it. You’ll find it’s a heck of a lot harder than you seem to imagine. Being good at math does not qualify you to waltz in and—
(The Masked Stranger raises his hand and snaps his fingers. All time stops. Then the Masked Stranger looks over at Eliezer-2010 expectantly.)
eliezer: Um... Masked Stranger... do you have any idea what’s going on here?
stranger: Yes.
eliezer: Thank you for that concise and informative reply. Would you please explain what’s going on here?
stranger: Pat is thoroughly acquainted with the status hierarchy of the established community of Harry Potter fanfiction authors, which has its own rituals, prizes, politics, and so on. But Pat, for the sake of literary hypothesis, lacks an instinctive sense that it’s audacious to try to contribute work to AI alignment. If we interrogated Pat, we’d probably find that Pat believes that alignment is cool but not astronomically important, or that there are many other existential risks of equal stature. If Pat believed that long-term civilizational outcomes depended mostly on solving the alignment problem, as you do, then he would probably assign the problem more instinctive prestige—holding constant everything Pat knows about the object-level problem and how many people are working on it, but raising the problem’s felt status.
Maude, meanwhile, is the reverse: not acquainted with the political minutiae and status dynamics of Harry Potter fans, but very sensitive to the importance of the alignment problem. So to Maude, it’s intuitively obvious that making technical progress on AI alignment requires a much more impressive hero license than writing the world’s leading Harry Potter fanfiction. Pat doesn’t see it that way.
eliezer: But ideas in AI alignment have to be formalized; and the formalism needs to satisfy many different requirements simultaneously, without much room for error. It’s a very abstract, very highly constrained task because it has to put an informal problem into the right formal structure. When writing fiction, yes, I have to juggle things like plot and character and tension and humor, but that’s all still a much less constrained cognitive problem—
stranger: That kind of consideration isn’t likely to enter Pat or Maude’s minds.
eliezer: Does it matter that I intend to put far more effort into my research than into fiction-writing? If Methods doesn’t work the first time, I’ll just give up.
stranger: Sorry. Whether or not you’re allowed to do high-status things can’t depend on how much effort you say you intend to put in. Because “anyone could say that.” And then you couldn’t slap down pretenders—which is terrible.
eliezer: …… Is there some kind of organizing principle that makes all of this make sense?
stranger: I think the key concepts you need are civilizational inadequacy and status hierarchy maintenance.
eliezer: Enlighten me.
stranger: You know how Pat ended up calculating that there ought to be 1,000 works of Harry Potter fanfiction as good as Methods? And you know how I got all weepy visualizing that world? Imagine Maude as making a similar mistake. There’s a world in which some scruffy outsider like you wouldn’t be able to estimate a significant chance of making a major contribution to AI alignment, let alone help found the field, because people had been trying to do serious technical work on it since the 1960s, and were putting substantial thought, ingenuity, and care into making sure they were working on the right problems and using solid methodologies. Functional decision theory was developed in 1971, two years after Robert Nozick’s publication of “Newcomb’s Problem and Two Principles of Choice.” Everyone expects humane values to have high Kolmogorov complexity. Everyone understands why, if you program an expected utility maximizer with utility function 𝗨 and what you really meant is 𝘝, the 𝗨-maximizer has a convergent instrumental incentive to deceive you into believing that it is a 𝘝-maximizer. Nobody assumes you can “just pull the plug” on something much smarter than you are. And the world's other large-scale activities and institutions all scale up similarly in competence.
We could call this the Adequate World, and contrast it to the way things actually are. The Adequate World has a property that we could call inexploitability; or inexploitability-by-Eliezer. We can compare it to how you can’t predict a 5% change in Microsoft’s stock price over the next six months—take that property of S&P 500 stocks, and scale it up to a whole planet whose experts you can’t surpass, where you can’t find any knowable mistake. They still make mistakes in the Adequate World, because they’re not perfect. But they’re smarter and nicer at the group level than Eliezer Yudkowsky, so you can’t know which things are epistemic or moral mistakes, just like you can’t know whether Microsoft’s equity price is mistaken on the up-side or low-side on average.
eliezer: Okay... I can see how Maude’s conclusion would make sense in the Adequate World. But how does Maude reconcile the arguments that reach that conclusion with the vastly different world we actually live in? It’s not like Maude can say, “Look, it’s obviously already being handled!” because it obviously isn’t.
stranger: Suppose that you have an instinct to regulate status claims, to make sure nobody gets more status than they deserve.
eliezer: Okay...
stranger: This gives rise to the behavior you’ve been calling “hero licensing.” Your current model is that people have read too many novels in which the protagonist is born under the sign of a supernova and carries a legendary sword, and they don’t realize real life is not like that. Or they associate the deeds of Einstein with the prestige that Einstein has now, not realizing that prior to 1905, Einstein had no visible aura of destiny.
eliezer: Right.
stranger: Wrong. Your model of heroic status is that it ought to be a reward for heroic service to the tribe. You think that while of course we should discourage people from claiming this heroic status without having yet served the tribe, no one should find it intuitively objectionable to merely try to serve the tribe, as long as they’re careful to disclaim that they haven’t yet served it and don’t claim that they already deserve the relevant status boost.
eliezer: ... this is wrong?
stranger: It’s fine for “status-blind” people like you, but it isn’t how the standard-issue status emotions work. Simply put, there’s a level of status you need in order to reach up for a given higher level of status; and this is a relatively basic feeling for most people, not something that’s trained into them.
eliezer: But before 1905, Einstein was a patent examiner. He didn’t even get a PhD until 1905. I mean, Einstein wasn’t a typical patent examiner and he no doubt knew that himself, but someone on the outside looking at just his CV—
stranger: We aren’t talking about an epistemic prediction here. This is just a fact about how human status instincts work. Having a certain probability of writing the most popular Harry Potter fanfiction in the future comes with a certain amount of status in Pat’s eyes. Having a certain probability of making important progress on the AI alignment problem in the future comes with a certain amount of status in Maude’s eyes. Since your current status in the relevant hierarchy seems much lower than that, you aren’t allowed to endorse the relevant probability assignments or act as though you think they’re correct. You are not allowed to just try it and see what happens, since that already implies that you think the probability is non-tiny. The very act of affiliating yourself with the possibility is status-overreaching, requiring a slapdown. Otherwise any old person will be allowed to claim too much status—which is terrible.
eliezer: Okay. But how do we get from there to delusions of civilizational adequacy?
stranger: Backward chaining of rationalizations, perhaps mixed with some amount of just-world and status-quo bias. An economist would say “What?” if you presented an argument saying you ought to be able to double your money every year by buying and selling Microsoft stock in some simple pattern. The economist would then, quite reasonably, initiate a mental search to try to come up with some way that your algorithm doesn’t do what you thought it did, a hidden risk it contained, a way to preserve the idea of an inexploitable market in equities.
Pat tries to preserve the idea of an inexploitable-by-Eliezer market in fanfiction (since on a gut level it feels to him like you’re too low-status to be able to exploit the market), and comes up with the idea that there are a thousand other people who are writing equally good Harry Potter fanfiction. The result is that Pat hypothesizes a world that is adequate in the relevant respect. Writers’ efforts are cheaply converted into stories so popular that it’s just about humanly impossible to foreseeably write a more popular story; and the world’s adequacy in other regards ensures that any outsiders who do have a shot at outperforming the market, like Neil Gaiman, will already be rich in money, esteem, etc.
And the phenomenon generalizes. If someone believes that you don’t have enough status to make better predictions than the European Central Bank, they’ll have to believe that the European Central Bank is reasonably good at its job. Traditional economics doesn’t say that the European Central Bank has to be good at its job—an economist would tell you to look at incentives, and that the decisionmakers don’t get paid huge bonuses if Europe’s economy does better. For the status order to be preserved, however, it can’t be possible for Eliezer to outsmart the European Central Bank. For the world’s status order to be unchallengeable, it has to be right and wise; for it to be right and wise, it has to be inexploitable. A gut-level appreciation of civilizational inadequacy is a powerful tool for dispelling mirages like hero licensing and modest epistemology, because when modest epistemology backward-chains its rationalizations for why you can’t achieve big things, it ends up asserting adequacy.
eliezer: Civilization could be inexploitable in these areas without being adequate, though; and it sounds like you're saying that Pat and Maude mainly care about inexploitability.
stranger: You could have a world where poor incentives result in alignment research visibly being neglected, but where there’s no realistic way for well-informed and motivated individuals to strategically avoid those incentives without being outcompeted in some other indispensable resource. You could also have a world that’s inexploitable to you but exploitable to many other people. However, asserting adequacy reaffirms the relevant status hierarchy in a much stronger and more airtight way. The notion of an Adequate World more closely matches the intuitive sense that the world's most respectable and authoritative people are just untouchable—too well-organized, well-informed, and well-intentioned for just anybody to spot Moloch’s handiwork, whether or not they can do anything about it. And affirming adequacy in a way that sounds vaguely plausible generally requires less detailed knowledge of microeconomics, of the individuals trying to exploit the market, and of the specific problems they’re trying to solve than is the case for appeals to inexploitable inadequacy.
Civilizational inadequacy is the basic reason why the world as a whole isn’t inexploitable in the fashion of short-term equity price changes. The modest view, roughly, is that the world is inexploitable as far as you can predict, because you can never knowably know better than the experts.
eliezer: I... sort of get it? I still don’t understand Maude’s actual thought process here.
stranger: Let’s watch, then.
(The Masked Stranger raises his hands and snaps his fingers again, restarting time.)
pat: —take over literature because mere fiction writers are stupid.
maude: My good fellow, please take a moment to consider what you’re proposing. If the AI alignment problem were really as important as Eliezer claims, would he really be one of the only people working on it?
pat: Well, it sure looks like he is.
maude: Then the problem can’t be as important as he claims. The alternative is that a lone crank has identified an important issue that he and very few others are working on; and that means everyone else in his field is an idiot. Who does Eliezer think he is, to defy the academic consensus to the effect that AI alignment isn’t an interesting idea worth working on?
pat: I mean, there are all sorts of barriers I could imagine a typical academic running into if they wanted to work on AI alignment. Maybe it’s just hard to get academic grants for this kind of work.
maude: If it’s hard to get grants, then that’s because the grant-makers correctly recognize that this isn’t a priority problem.
pat: So now the state of academic funding is said to be so wise that people can’t find neglected research opportunities?
stranger: What person with grant-making power gets paid less in the worlds where alignment is important and yet neglected? If no one loses their bonuses or incurs any other perceptible cost, then you’re done. There’s no mystery here.
maude: All of the evidence is perfectly consistent with the hypothesis that there are no academic grants on offer because the grantmakers have made a thoughtful and informed decision that this is a pseudo-problem.
eliezer: I appreciate Pat’s defense, but I think I can better speak to this. Issues like intelligence explosion and the idea that there’s an important problem to be solved in AI goal systems, as I mentioned earlier, aren’t original to me. They're reasonably widely known, and people at all levels of seniority are often happy to talk about them face-to-face, though there’s disagreement about the magnitude of the risk and about what kinds of efforts are likeliest to be useful for addressing it. You can find them discussed in the most commonly used undergrad textbook in AI, Artificial Intelligence: A Modern Approach. You can’t claim that there’s a consensus among researchers that this is not an important problem.
maude: Then the grantmakers probably carefully looked into the problem and determined that the best way to promote humanity’s long-term welfare is to advance the field of AI in other ways, and only work on alignment once we reach some particular capabilities threshold. At that point, in all likelihood, funders plan to coordinate to launch a major field-wide research effort on alignment.
eliezer: How, exactly, could they reach a conclusion like that without studying the problem in any visible way? If the entire grantmaking community was able to arrive at a consensus to that effect, then where are the papers and analyses they used to reach their conclusion? What are the arguments? You sound like you’re talking about a silent conspiracy of competent grantmakers at a hundred different organizations, who have in some way collectively developed or gained access to a literature of strategic and technical research that Nick Bostrom and I have never heard about, establishing that the present-day research problems that look relevant and tractable aren’t so promising, and that capabilities will develop in a specific known direction at a particular rate that lends itself to late coordinated intervention.
Are you saying that despite all the researchers in the field casually discussing self-improving AI and Asimov Laws over coffee, there’s some hidden clever reason why studying this problem isn’t a good idea, which the grantmakers all arrived at in unison without leaving a paper trail about their decision-making process? I just... There are so many well-known and perfectly normal dysfunctions of grantmaking machinery and the academic incentive structure that allow alignment to be a critical problem without there necessarily being a huge academic rush to work on it. Instead you’re postulating a massive global conspiracy of hidden competence grounded in secret analyses and arguments. Why would you possibly go there?
maude: Because otherwise—
(The Stranger snaps his fingers again.)
stranger: Okay, Eliezer-2010, go ahead and answer. Why is Maude going there?
eliezer: Because... to prevent relatively unimpressive or unauthoritative-looking people from affiliating with important problems, from Maude’s perspective there can’t be knowably low-hanging research fruit. If there were knowably important problems that the grantmaking machinery and academic reward system had left untouched, then somebody like me could knowably be working on them. If there were a problem with the grantmakers, or a problem with academic incentives, at least of the kind that someone like me could identify, then it might be possible for someone unimportant like me to know that an important problem was not being worked on. The alleged state of academia and indeed the whole world has to backward chain to avoid there being low-hanging research fruit.
First Maude tried to argue that the problem is already well-covered by researchers in the field, as it would be in the Adequate World you described. When that position became difficult to defend, she switched to arguing that authoritative analysts have looked into the problem and collectively determined it’s a pseudo-problem. When that became difficult to defend, she switched to arguing that authoritative analysts have looked into the problem and collectively devised a better strategy involving delaying alignment research temporarily.
stranger: Very different hypotheses that share this property: they allow there to be something like an efficient market in high-value research, where individuals and groups that have high status in the standard academic system can't end up visibly dropping the ball.
Perhaps Maude's next proposal will be that top researchers have determined that the problem is easy. Perhaps there's a hidden consensus that AGI is centuries away. In my experience, people like Maude can be boundlessly inventive. There's always something.
eliezer: But why go to such lengths? No real economist would tell us to expect an efficient market here.
stranger: Sure, says Maude, the system isn’t perfect. But, she continues, neither are we perfect. All the grantmakers and tenure-granters are in an equivalent position to us, and doing their own part to actively try to compensate for any biases in the system they think they can see.
eliezer: But that’s visibly contradicted both by observation and by the economic theory of incentives.
stranger: Yes. But at the same time, it has to be assumed true. Because while experts can be wrong, we can also be wrong, right? Maybe we’re the ones with bad systemic incentives and only short-term rewards.
eliezer: But being inside a system with badly designed incentives is not the same as being unable to discern the truth of... oh, never mind.
This has all been very educational, Masked Stranger. Thanks.
stranger: Thanks for what, Eliezer? Showing you a problem isn’t much of a service if there’s nothing you can do to fix it. You’re no better off than you were in the original timeline.
eliezer: It still feels better to have some idea of what’s going on.
stranger: That, too, is a trap, as we’re both aware. If you need an elaborate theory to justify seeing the obvious, it will only become more elaborate and distracting as time goes on and you try harder and harder to reassure yourself. It’s much better to just take things at face value, without needing a huge argument to do so. If you must ignore someone’s advice, it’s better not to make up big elaborate reasons why you’re licensed to ignore it; that makes it easier to change your mind and take the advice later, if you happen to feel like it.
eliezer: True. Then why are you even saying these things to me?
stranger: I’m not. You never were the one to whom I was speaking, this whole time. That is the last lesson, that I didn’t ever say these things to myself.
(The Stranger turns upon his own heel three times, and was never there.)