Summary: Intelligence Explosion Microeconomics (pdf) is 40,000 words taking some initial steps toward tackling the key quantitative issue in the intelligence explosion, "reinvestable returns on cognitive investments": what kind of returns can you get from an investment in cognition, can you reinvest it to make yourself even smarter, and does this process die out or blow up? This can be thought of as the compact and hopefully more coherent successor to the AI Foom Debate of a few years back.

(Sample idea you haven't heard before:  The increase in hominid brain size over evolutionary time should be interpreted as evidence about increasing marginal fitness returns on brain size, presumably due to improved brain wiring algorithms; not as direct evidence about an intelligence scaling factor from brain size.)

I hope that the open problems posed therein inspire further work by economists or economically literate modelers, interested specifically in the intelligence explosion qua cognitive intelligence rather than non-cognitive 'technological acceleration'.  MIRI has an intended-to-be-small-and-technical mailing list for such discussion.  In case it's not clear from context, I (Yudkowsky) am the author of the paper.


I. J. Good's thesis of the 'intelligence explosion' is that a sufficiently advanced machine intelligence could build a smarter version of itself, which could in turn build an even smarter version of itself, and that this process could continue enough to vastly exceed human intelligence.  As Sandberg (2010) correctly notes, there are several attempts to lay down return-on-investment formulas intended to represent sharp speedups in economic or technological growth, but very little attempt has been made to deal formally with I. J. Good's intelligence explosion thesis as such.

I identify the key issue as returns on cognitive reinvestment - the ability to invest more computing power, faster computers, or improved cognitive algorithms to yield cognitive labor which produces larger brains, faster brains, or better mind designs.  There are many phenomena in the world which have been argued as evidentially relevant to this question, from the observed course of hominid evolution, to Moore's Law, to the competence over time of machine chess-playing systems, and many more.  I go into some depth on the sort of debates which then arise on how to interpret such evidence.  I propose that the next step forward in analyzing positions on the intelligence explosion would be to formalize return-on-investment curves, so that each stance can say formally which possible microfoundations they hold to be falsified by historical observations already made.  More generally, I pose multiple open questions of 'returns on cognitive reinvestment' or 'intelligence explosion microeconomics'.  Although such questions have received little attention thus far, they seem highly relevant to policy choices affecting the outcomes for Earth-originating intelligent life.
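As a rough illustration of what "formalizing a return-on-investment curve" could mean, here is a minimal sketch. It is not the paper's model: the update rule `c <- c + rate * c**exponent`, the function names, and all parameter values are assumptions chosen only to show how one exponent separates a process that slows down from one that speeds up.

```python
def reinvest(capability: float, exponent: float,
             rate: float = 0.1, steps: int = 20) -> list[float]:
    """Iterate c <- c + rate * c**exponent and return the whole trajectory.

    Hypothetical toy rule: `exponent` stands in for the returns on
    cognitive reinvestment (below 1: diminishing, above 1: increasing).
    """
    trajectory = [capability]
    for _ in range(steps):
        capability = capability + rate * capability ** exponent
        trajectory.append(capability)
    return trajectory


def growth_ratios(trajectory: list[float]) -> list[float]:
    """Per-step growth factor c[n+1] / c[n]."""
    return [later / earlier for earlier, later in zip(trajectory, trajectory[1:])]


if __name__ == "__main__":
    for exponent in (0.5, 1.0, 1.5):
        ratios = growth_ratios(reinvest(1.0, exponent))
        print(f"exponent={exponent}: first ratio {ratios[0]:.3f}, last ratio {ratios[-1]:.3f}")
```

With an exponent below 1 the per-step growth factor shrinks, at exactly 1 it stays constant (ordinary exponential growth), and above 1 it keeps rising. Which regime, if any, matches the historical evidence is the open question the abstract poses; the sketch only makes the vocabulary concrete.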

The dedicated mailing list will be small and restricted to technical discussants.

This topic was originally intended to be a sequence in Open Problems in Friendly AI, but further work produced something compacted beyond where it could be easily broken up into subposts.

Outline of contents:

1:  Introduces the basic questions and the key quantitative issue of sustained reinvestable returns on cognitive investments.

2:  Discusses the basic language for talking about the intelligence explosion, and argues that we should pursue this project by looking for underlying microfoundations, not by pursuing analogies to allegedly similar historical events.

3:  Goes into detail on what I see as the main arguments for a fast intelligence explosion, constituting the bulk of the paper with the following subsections:

  • 3.1: What the fossil record actually tells us about returns on brain size, given that most of the difference between Homo sapiens and Australopithecus was probably improved software.
  • 3.2: How to divide credit for the human-chimpanzee performance gap between "humans are individually smarter than chimpanzees" and "the hominid transition involved a one-time qualitative gain from being able to accumulate knowledge".
  • 3.3: How returns on speed (serial causal depth) contrast with returns from parallelism; how faster thought seems to contrast with more thought.  Whether sensing and manipulating technologies are likely to present a bottleneck for faster thinkers, or how large of a bottleneck.
  • 3.4: How human populations seem to scale in problem-solving power; some reasons to believe that we scale inefficiently enough for it to be puzzling.  Garry Kasparov's chess match vs. The World, which Kasparov won.
  • 3.5: Some inefficiencies that might cumulate in an estimate of humanity's net computational efficiency on a cognitive problem.
  • 3.6: What the anthropological record actually tells us about cognitive returns on cumulative selection pressure, given that selection pressures were probably increasing over the course of hominid history.  How the observed history would be expected to look different, if there were in fact diminishing returns on cognition.
  • 3.7: How to relate the curves for evolutionary difficulty, human-engineering difficulty, and AI-engineering difficulty, considering that they are almost certainly different.
  • 3.8: Correcting for anthropic bias in trying to estimate the intrinsic 'difficulty' of hominid-level intelligence just from observing that intelligence evolved here on Earth.
  • 3.9: The question of whether to expect a 'local' (one-project) FOOM or 'global' (whole economy) FOOM and how returns on cognitive reinvestment interact with that.
  • 3.10: The great open uncertainty about the minimal conditions for starting a FOOM; why I. J. Good's postulate of starting from 'ultraintelligence' is probably much too strong (sufficient, but very far above what is necessary).
  • 3.11: The enhanced probability of unknown unknowns in the scenario, since a smarter-than-human intelligence will selectively seek out and exploit flaws or gaps in our current knowledge.

4:  A tentative methodology for formalizing theories of the intelligence explosion - a project of formalizing possible microfoundations and explicitly stating their alleged relation to historical experience, such that some possibilities can allegedly be falsified.

5:  Which open sub-questions seem both high-value and possibly answerable.

6:  Formally poses the Open Problem and mentions what it would take for MIRI itself to directly fund further work in this field.

246 comments

Agreeing with several other people that the introduction needs a major rewrite or possibly just a cut. Consider the opening sentence:

Isadore Jacob Gudak, who anglicized his name to Irving John Good and used I. J. Good for publication

Dude, no. Who gives a toss how he anglicised his name? Get to your point, if you have one.

Somewhat similarly, in the fourth paragraph, you have

Please note that...

Please note that the phrase "please note that" is unnecessary; it adds length and the impression that you are snippily correcting someone's blog comment, without adding any information (or politeness) to the sentence. I'm familiar with your argument about formal writing just adding a feeling of authority, but this isn't informality, it's sloppy editing.

Your whole first page, actually, is a pretty good demonstration of not having a point. I get the impression that you thought "Hmm, I need some kind of introduction" and went off to talk about something, anything, that wasn't the actual point of the paper, because the point belongs in the body and not the introduction. This makes for a page that adds nothing. You have a much better introduction starting with the parag...


Superficial stylistic remarks (as you'll see, I've only looked at about the first 1/4 of the paper):

  • The paper repeatedly uses the word "agency" where "agent" would seem more appropriate.

  • I agree with paper-machine that the mini-biography of I J Good has little value here.

  • The remark in section 1 about MIRI being funding-limited is out of place and looks like a whine or a plea for more money. Just take it out.

  • "albeit" on page 10, shortly before footnote 8, should just be "but". (Or maybe "even though", if that's your meaning.) [EDITED to add: there's another "albeit" that reads oddly to me, in footnote 66 on page 50. It's not wrong, but it feels odd. Roughly, wherever you can correctly write "albeit" you can equivalently write "even though", and that's a funny thing to be starting a footnote with.]

  • "criteria" in footnote 11 about paperclip maximizers should be "criterion".

  • In footnote 15 (about "g") the word "entrants" seems very weirdly chosen, and the footnote seems to define g as the observed correlation between different measures of intelligence, which

...



Just a thought on chess playing. Rather than looking at an extreme like Kasparov vs the world, it would be interesting to me to have teams of two, three, and four players of well-known individual ranking. These teams could then play many games against individuals and against each other. The effective ranking of the teams could be determined from their results. In this way, some sense of "how much smarter" a team is than the individual members could be determined. Ideally, the team would not be ranked until it had had significant experience playing as a team. We are interested in what a team could accomplish, and there is no strong reason to think it would take less time to optimize a team than to optimize an individual.

Along the same lines, teams could be developed to take IQ and other GI correlated tests to see how much smarter a few people together are than a single human. Would the results have implications for optimal AI design?
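One way the "effective ranking" of a team could be read off its results is an Elo-style performance rating. The sketch below uses the standard Elo expected-score formula; the function names, the bisection bounds, and the example opponents and scores are all hypothetical, and real rating systems add details (K-factors, provisional periods) that are left out here.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def performance_rating(opponents: list[float], total_score: float,
                       lo: float = 0.0, hi: float = 4000.0) -> float:
    """Rating whose summed expected score against `opponents` equals
    `total_score` (wins count 1, draws 0.5), found by bisection."""
    if not 0 < total_score < len(opponents):
        raise ValueError("a perfect or zero score has no finite performance rating")
    for _ in range(60):
        mid = (lo + hi) / 2
        if sum(expected_score(mid, r) for r in opponents) < total_score:
            lo = mid  # the team is stronger than `mid`
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # Hypothetical data: a team of two 1500-rated players scores 7.5 / 10
    # against ten individual 1500-rated opponents.
    team_rating = performance_rating([1500.0] * 10, 7.5)
    print(f"team performance rating: {team_rating:.0f}")
    print(f"gain over an individual member: {team_rating - 1500:.0f} points")
```

Repeating this for teams of two, three, and four would give a curve of rating gain against team size, which is the quantity the comment asks about.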

Eliezer Yudkowsky
I think that teams of up to five people can scale "pretty well by human standards" - not too far from linearly. It's going up to a hundred, a thousand, a million, a billion that we start to run into incredibly sublinear returns.
As group size increases you have to spend more and more of your effort getting your ideas heard and keeping up with the worthwhile ideas being proposed by other people, as opposed to coming up with your own good ideas. Depending on the relevant infrastructure and collaboration mechanisms, it's fairly easy to have a negative contribution from each additional person in the project. If someone is trying to say something, then someone else has to listen - even if all the listener does is keep it from lowering the signal-to-noise ratio by removing the contribution.

You correctly describe the problems of coordinating the selection of the best result produced. But there's another big problem: coordinating the division of work.

When you add another player to a huge team of 5000 people, he won't start exploring a completely new series of moves no-one else had considered before. Instead, he will likely spend most of his time considering moves already considered by some of the existing players. That's another reason why his marginal contribution will be so low.

Unlike humans, computers are good at managing divide-and-conquer problems. In chess, a lot of the search for the next move is local in the move tree. That's what makes it a particularly good example of human groups not scaling where computers would.
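The duplicated-effort argument can be made concrete with a toy model. Assume, purely for illustration, that each of `n_players` uncoordinated players independently picks one of `n_lines` candidate lines uniformly at random; real players are neither uniform nor independent, so the numbers below are a sketch and not a measurement.

```python
def expected_distinct(n_players: int, n_lines: int) -> float:
    """Expected number of distinct lines examined by n_players who each
    pick one of n_lines uniformly at random (toy assumption)."""
    return n_lines * (1 - (1 - 1 / n_lines) ** n_players)


def marginal_gain(n_players: int, n_lines: int) -> float:
    """Expected number of new lines contributed by one extra player
    joining a team that already has n_players members."""
    return (1 - 1 / n_lines) ** n_players


if __name__ == "__main__":
    n_lines = 1000  # hypothetical number of lines worth examining
    for team in (1, 10, 100, 1000, 5000):
        print(f"{team:>5} players: {expected_distinct(team, n_lines):7.1f} lines covered, "
              f"next player adds {marginal_gain(team, n_lines):.4f}")
```

Under these assumptions the marginal player's contribution decays geometrically with team size, whereas a coordinator that hands each worker a different subtree, as a chess engine does, keeps it near one line per worker until the lines run out.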

That's parallelism for you. It's like the way that four-core chips are popular, while million-core chips are harder to come by.
I assume by 'linear' you mean directly proportional to population size. The diminishing marginal returns of some tasks, like the "wisdom of crowds" (concerned with forming accurate estimates), are well established, and taper off quickly regardless of the difficulty of the task; it basically follows the law of large numbers and sample error (see "A Note on Aggregating Opinions", Hogarth, 1978). This glosses over some potential complexity, but you're unlikely ever to get much benefit from more than a few hundred people, if that many. Other tasks do not see such quickly diminishing returns, such as problem solving in a complex fitness landscape (see work on "Exploration and Exploitation", especially in NK space). Supposing the number of possible solutions to a problem to be much greater than the number of people feasibly working on the problem (e.g., the population of creative and engaged humans), then as the number of people increases, the probability of finding the optimal solution increases. Coordinating all those people is another issue, as is the potential opportunity cost of having so many people work on the same problem. However, in my experience, this difference between problem-solving and wisdom-of-crowds tasks is often glossed over in collective intelligence research.
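The two regimes contrasted above can be sketched numerically. The assumptions are mine and deliberately simple: estimation errors are independent with a common standard deviation `sigma`, and each searcher independently finds the optimum with a small probability `p`. Correlated errors or overlapping searches would change the numbers.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean of n independent estimates."""
    return sigma / math.sqrt(n)


def p_any_success(p: float, n: int) -> float:
    """Probability that at least one of n independent searchers succeeds."""
    return 1 - (1 - p) ** n


if __name__ == "__main__":
    for n in (10, 100, 1_000, 10_000):
        print(f"n={n:>6}: crowd-estimate error {standard_error(1.0, n):.4f}, "
              f"P(someone finds the optimum) {p_any_success(1e-6, n):.6f}")
```

In the estimation task the error only falls as one over the square root of the crowd size, so most of the benefit arrives early. In the search task, while `n * p` is still small, the success probability grows almost in proportion to the number of searchers.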
Regarding the apparent non-scaling benefits of history: what you call the "most charitable" explanation seems to me the most likely. Thousands of people work at places like CERN and spend 20 years contributing to a single paper, doing things that simply could not be done by a small team. Models of problem-solving on "NK Space" type fitness landscapes also support this interpretation: fitness improvements become increasingly hard to find over time. As you've noted elsewhere, it's easier to pluck low-hanging fruit.
Are you or anyone else aware of any work along these lines, showing the intelligence of groups of people? Any sense of what the intelligence of the planet as a whole, or the largest effective intelligence of any group on the planet might be? If groups of up to 5 scale well, and we get sublinear returns above 5, but positive returns up to some point anyway, does this prove that AI won't FOOM until it has an intelligence larger than the largest intelligence of a group of humans? That is, until AI has a higher intelligence than the group, will the group of humans dominate the rate at which new AIs are improved?
There is the MIT Center for Collective Intelligence.
Update: this is a pretty large field of research now. The Collective Intelligence Conference is going into its 7th year.
As far as empirically finding the optimum group size, it'd be cheaper to find the number of researchers in a scientific sub-discipline and measure the productive work they do in that field. They are teams that review work for general distribution, read on others' progress, and contribute to the discussion. Larger sub-fields that would be more efficient divided up would have large incentives to do so, as defectors to the sub-sub-field would have higher productivity (and less irrelevant work to read up on).
Does anyone play (rated) chess online? If so, do you want to get together to play some team games for the purposes of adding hard data to this discussion? My blitz rating is in the high 1200s. My teammate should have a blitz rating close to that to make the data valuable. I play 8-minute games, and am not interested in playing enough non-blitz games to get my rating to be an accurate reflection of my (individual) skill. (Non-blitz games would take too much time and take too much out of me. "Non-blitz" games are defined as games with at least 15 minutes on the clock for each player.) I envision the team being co-located while playing, which limits my teammate to someone who is or will be in San Francisco or Berkeley. I've played a little "team chess" before. Was a lot of fun. My contact info is here.

Having looked through the document again, I feel that a competent technical writer, or anyone with paper-writing experience, can make this report into a paper suitable for submission within a couple of days, maybe a week, assuming MIRI wants it published. A lot would have to be cut, and the rest rearranged and tidied up, but there is definitely enough meat there for a quality paper or two. I am not sure what MIRI's intention is re this report, other than "hope that the open problems posed therein inspire further work by economists or economically literate modelers".

  • Page 4: the sum log(w) + log(log(w)) + ... doesn't converge. Some logarithm will be negative and then the next one will be undefined. Presumably you meant to stop the sum once it becomes negative, but then I'm somewhat confused about this argument because I'm not sure it's dimensionally consistent (I'm not sure what units cognitive work is being measured in).

  • Top of page 18: there's a reference to "this graph" but no graph...?
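The objection to the page 4 sum can be checked directly. The sketch below assumes natural logarithms and a dimensionless `w` (both are my assumptions, since the comment itself notes the units are unclear) and lists the terms log(w), log(log(w)), ... until the next one would be undefined.

```python
import math


def iterated_log_terms(w: float) -> list[float]:
    """Return [log(w), log(log(w)), ...], stopping once a term is <= 0,
    because the logarithm of that term is undefined over the reals."""
    terms = []
    x = w
    while x > 0:
        x = math.log(x)
        terms.append(x)
    return terms


if __name__ == "__main__":
    terms = iterated_log_terms(1e6)  # hypothetical starting value
    for depth, term in enumerate(terms, start=1):
        print(f"log iterated {depth} time(s): {term:.4f}")
    print(f"sum of the positive terms: {sum(t for t in terms if t > 0):.4f}")
```

For any starting value the sequence reaches a non-positive term after a handful of steps, so the series is a short finite sum and not an infinite one, and truncating at the last positive term is the only reading under which it is well defined.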

General comment 1: who's the intended audience here? Most of the paper reads like a blog post, which I imagine could be disconcerting for newcomers trying to evaluate whether they should be paying attention to MIRI and expecting a typical research paper from a fancy-looking .pdf.

General comment 2: I still think this discussion needs more computational complexity. I brought this up to you earlier and I didn't really digest your reply. The question of what you can and can't do with a given amount of computational resources seems highly relevant to understanding what the intelligence explosion could look like; in particular I would be surprised if questions like P vs. NP didn't have a strong bearing on the distribution over timelines (I expect that the faster it is possible to solve NP-complete problems, which apparently includes protein folding, the faster AI could go foom). But then I'm not a domain expert here and I could be off-base for various reasons.

IIRC, Daniel Dewey said at the April workshop that this is, in fact, his current project.
The reference to "this graph" is a hyperlink. There are many such hyperlinks in the document. They feel rather weird, and are easy to miss, given the generally print-like typesetting. It might be worth writing them all out in something like the form one would use in an ordinary printed document, while preserving their hyperlinkiness.
Eliezer Yudkowsky
Protein folding cannot be NP-hard. The physical universe is not known to be able to solve NP-hard problems, and protein folding will not involve new physics.

The physical universe doesn't need to "solve" protein folding in the sense of having a worst-case polynomial-time algorithm. It just needs to fold proteins. Many NP-complete problems are "mostly easy" with a few hard instances that rarely come up. (In fact, it's hard to find an NP-complete problem for which random instances are hard: if we could do this, we would use it for cryptography.) It's reasonable to suppose protein folding is like this.

Of course, if this is the case, maybe the AI doesn't care about the rare hard instances of protein folding, either.

If we have an NP-complete problem for which random instances are hard, but we can't generate them with solutions, that doesn't help cryptography.
Proteins are finite in length. Why would nature care if it can't do something in polynomial time? Edit: It would be interesting to turn this around: suppose protein folding IS NP-hard; can we put an upper bound on the length of proteins using evolutionary time scales?

can we put an upper bound on the length of proteins using evolutionary time scales?

Not really; most big proteins consist of 'domains' which fold up pretty independently of each other (smaller proteins can vary quite a bit though). Titin is a ~30,000 amino acid protein in human muscle with ~500 repeats of the same 3 basic modules all laid out in a line... over evolutionary time you can shuffle these functional units around and make all kinds of interesting combinations.

Actually, the lab I'm working in recently had a problem with this. We optimized a gene to be read extremely fast by a ribosome while still producing exactly the same protein sequence (manipulating synonymous codons). But it turned out that when you have the actual protein molecule being extruded from the ribosome as rapidly as we had induced it to be, the normal independent folding of successive domains was disrupted - one domain didn't fully fold before the next domain started being extruded, they interacted, and the protein folded all wrong and didn't work despite having exactly the same sequence as the wild protein.

I think the more important point is that Nature doesn't care about the worst case (if a protein takes forever to fold correctly then it's not going to be of any use). But an AI trying to design arbitrary proteins plausibly might.

Eliezer Yudkowsky
Why? Nature designed ribosomes without solving any hard cases of NP-hard problems. Why would Ribosomes 2.0, as used for constructing the next generation of molecular machinery, require any NP-hard problems?
Is that so? Pretty much every problem of interest (chess, engineering, etc) is NP-hard, or something on that level of difficulty. The thing with such problems is that you don't solve them exactly for optimal, you weaken the problem and solve them heuristically for good. Nature did just this: produced a solution to a weakened NP-hard design problem. I think at this point, taboo "solved a hard problem". Nature produced a better replicator, not necessarily the optimal superbest replicator, which it didn't have to. Obviously an intelligently designed automated design system could be just as good as dumbass evolution, given sufficient computing power, so I agree that swiftly designing nanomachinery is quite plausible. (The advantage that evolution has over an engineer in a leaky box with a high discount rate is that there's a lot going on in molecular dynamics, solutions are dense in the space, as shown by that evolution got some, but that's no guarantee that a given solution is predictably a solution. So it might be cost prohibitive. (Not that I'm up to date on protein folding.) In the worst case, the protein folding could be a secure hash, that takes a very expensive high-fidelity simulation to compute. Evolution would do just fine at cracking hashes by brute force, because it bruteforces everything, but an intelligent engineer wouldn't be able to do much better in this case. It may end up taking too much computation to run such sims. (I should calculate the expected time to nano in this case, it would be a very interesting fermi estimate). However, it is highly unlikely for lower-fidelity simulations and models to give you literally no information, which would be what is required for the "secure hash" scenario. Even things designed to be secure hashes often lose huge parts of their theoretical complexity after some merely human attacks (eg md5). The mainline case is that there exist reasonable heuristics discoverable by reasonable amounts of intelligent computat

In the worst case, the protein folding could be a secure hash

Then it would be harder, in fact impossible, to end up with slightly better proteins via point mutations. A point mutation in a string gives you a completely different secure hash of that string.

This isn't a minor quibble, it's a major reason to go "Whaa?" at the idea that protein folding and protein design have intractable search spaces in practice. They have highly regular search spaces in practice or evolution couldn't traverse that space at all.
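The point about secure hashes can be seen with a real one. The sketch below applies a single-letter "point mutation" to a short string and counts how many of the 256 output bits of SHA-256 change. The peptide-like strings are made up for illustration and the analogy to folding is only that: an analogy.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Number of differing bits between the SHA-256 digests of a and b."""
    digest_a = hashlib.sha256(a).digest()
    digest_b = hashlib.sha256(b).digest()
    return sum(bin(x ^ y).count("1") for x, y in zip(digest_a, digest_b))


# Hypothetical sequence and a one-letter mutant of it.
original = b"MKTAYIAKQR"
mutant = b"MKTAYIAKQK"

if __name__ == "__main__":
    changed = bit_difference(original, mutant)
    print(f"{changed} of 256 digest bits changed after one point mutation")
```

A good hash flips roughly half of its output bits for any one-character change, so the "fitness" of a mutant would tell you nothing about the fitness of its parent. Incremental improvement by point mutation, which is what evolution visibly does, requires the opposite property: nearby inputs usually giving nearby outputs.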


This isn't a minor quibble, it's a major reason to go "Whaa?" at the idea that protein folding and protein design have intractable search spaces in practice. They have highly regular search spaces in practice or evolution couldn't traverse that space at all.

Yep, it's highly implausible that a natural non-designed process would happen to be a secure hash (as you know from arguing with cryonics skeptics). And that's before we look at how evolution works.

Good point. The search space is at least smooth once in a few thousand tries. (While doing the nearby Fermi estimate, I saw a result that 12% (!!!) of point mutations in some bacteria were beneficial).

That said, the "worst possible case" is usually interesting.

Isn't that in large part a selection effect? After decades having computers, most of the low hanging fruit has been picked, and so many unsolved problems are NP-hard. But many equally important problems have been solved because they weren't.
Let's go! From Wik: Skimmed the paper; looks like they used the rosetta@home network (~9 TFLOPS) to design a rudimentary enzyme. So that suggests that a small amount of computation (bearable time by human research standards, allowing for fuckups and restarts) can do protein design. Let's call it a week of computation total. There's roughly 1e6 seconds in a week, flopping at a rate of 1e13 flops, giving us 1e19 flops. They claimed to have tested 1e18 somethings, so our number is plausible, but we should go to at least 1e22 flops to include 1e4 flops per whatever (which would take a thousand weeks?). Something doesn't add up. Whatever, call it 1e20 (ten weeks) and put some fat error bounds on that. Don't know how to deal with the exponential complexity. A proper nanothing could require 1e40 flops (square the exponent for double complexity), or it may factor nicely, requiring only 1e21 flops. Let's call it 1e25 flops with current techniques to design nanotech. If AI is in 20 years, that's 13 Moore's doublings or 1e4, then let's say the AI can seize a network of as much computational power as they used, plus Moore's scaling. So 1e21 todayflops, 1e20 of which is doable in a standard research project amount of time with a large distributed network. So anywhere from days to 20 years, with my numbers giving 2 years, to brute force nanotech on 20-years-in-future computational power with today's algorithms. Factor of 1e6 speedups are reasonable in chess (another problem with similar properties) with a bunch of years of human research, so that puts my middle at 10 minutes. The AI will probably do better than that, but that would be good enough to fuck us. This was somewhat conservative, even. (Nanotech involves 100000 times more computation than these guys used.) Let's get this thing right the first time.... EDIT: an interesting property of exponential processes is that things go from "totally impossible" to "trivial" very quickly.
Note that by these estimates, humans should be able to have nano around 2030. Scary stuff.
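The arithmetic in the estimate above is easy to re-run. The sketch below takes the comment's own inputs (1e13 operations per second today, 13 Moore's-law doublings, a guessed 1e25 operations for the design task, and a guessed 1e6 algorithmic speedup) and simply divides. Every number is the commenter's guess or my rounding of it, not data, and the function and variable names are mine.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def compute_time_seconds(total_ops: float, ops_per_second: float,
                         hardware_factor: float = 1.0,
                         algorithm_factor: float = 1.0) -> float:
    """Wall-clock seconds to perform total_ops at the given base rate,
    after multiplying in hardware and algorithmic speedups."""
    return total_ops / (ops_per_second * hardware_factor * algorithm_factor)


today_rate = 1e13        # ~9 TFLOPS network, rounded up as in the comment
moore_factor = 2 ** 13   # 13 doublings over 20 years (about 1e4)
target_ops = 1e25        # the comment's guess for a nanotech design task

brute = compute_time_seconds(target_ops, today_rate, moore_factor)
tuned = compute_time_seconds(target_ops, today_rate, moore_factor, 1e6)

if __name__ == "__main__":
    print(f"today's algorithms on future hardware: {brute / SECONDS_PER_YEAR:.1f} years")
    print(f"with a 1e6 algorithmic speedup: {tuned / 60:.1f} minutes")
```

Straight division lands within an order of magnitude of the figures quoted in the comment (a few years, then a few minutes). The result is far more sensitive to the guessed 1e25 than to any of the rounding, which is the caveat the comment itself raises about exponential complexity.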
I can feel some inferential distance here that isn't successfully being bridged. It's far from clear to me that the default assumption here should be that no NP-hard problems need to be solved and that the burden of proof is on those who claim otherwise.
Eliezer Yudkowsky
I guess to me the notion of "solve an NP-hard problem" (for large N and hard cases, i.e., the problem really is NP hard) seems extremely exotic - all known intelligence, all known protein folding, and all known physical phenomena must be proceeding without it - so I feel a bit at a loss to relate to the question. It's like bringing up PSPACE completeness - I feel a sense of 'where did that come from?' and find it hard to think of what to say, except for "Nothing that's happened so far could've been PSPACE complete."
Agreed if you mean "Nothing that's happened so far could've been [computationally hard] to predict given the initial conditions." But the reverse problem -- finding initial conditions that produce a desired output -- could be very hard. Nature doesn't care about this, but an AI plausibly might. I'm not sure how protein folding fits into this picture, to be honest. (Are people just trying to figure out what happens to a given protein in physics, or trying to find a protein that will make something good happen?) But more generally, the statement "P=NP" is more or less equivalent to "The reverse problem I mention above is always easy." Things become very different if this is true.
Here's a paper claiming that the most widely studied model of protein folding in 1998 is NP-Complete. I don't know enough about modern research into protein folding to comment how applicable that result still is. My guess is you're referring to Aaronson's paper, which doesn't seem relevant here. The universe doesn't solve NP-hard problems in P time, but the universe took NP time to build the first useful proteins, didn't it?

The solution space is large enough that even proteins sampling its points at a rate of trillions per second couldn't really fold if they were just searching randomly through all possible configurations; that would be NP-complete. They don't actually do this, of course. Instead they fold piece by piece as they are produced, with local interactions forming domains which tend to retain their approximate structure once they come together to form a whole protein. They therefore don't enter the lowest possible energy state. Prion diseases are an example of what can happen when proteins enter a normally inaccessible local energy minimum, which in that case happens to have a snowballing effect on other proteins.

The result is that they follow grooves in the energy landscape towards an energy well which is robust enough to withstand all sorts of variation, including the horrific inaccuracies of our attempts at modeling. Our energy functions are just very crude approximations to the real one, which is dependent on quantum level effects and therefore intractable. Another issue is that proteins don't fold in isolation - they interact with chaperone proteins and all sorts of other crap. So simu...

Right, this gets called Levinthal's Paradox.
Right. I'm actually not sure how relevant it all is to discussions of an A.I. trying to get arbitrary things done with proteins. Folding an existing protein may be a great deal easier than finding a protein which folds into an arbitrary shape. Probably not all shapes are allowed by the physics of the problem. Evolution can't really be said to solve that problem either. It just produces small increments in fitness. Otherwise organisms' proteomes would be a lot more efficient. Although on second thought, an A.I. would probably just be able to design a protein with a massively low energy funnel, so that even if it couldn't simulate folding perfectly, it could still get things done. Regardless, an imperfect solution would probably suffice for world domination...
Eliezer Yudkowsky
I believe I've seen that discussed before and the answer is just that in real life, proteins don't fold into the lowest-energy conformation. It's like saying that the global minimum energy for soap bubbles is NP-complete. Finding new useful proteins tends to occur via point mutations so that can't be NP-complete either.
So, I can think of several different things that could all be the 'protein folding problem.'

  1. Figure out the trajectory an unfolded protein takes towards a folded protein, with a known starting state. (P)
  2. Given a known folded protein, find local minima that unfolded proteins starting with random start states might get stuck in. (NP, probably)
  3. Given a desired chemical reaction, find a folded protein that will catalyze the reaction. (Not sure, probably NP.)
  4. Given a desired chemical reaction, find a folded protein that will catalyze the reaction that is the local minimum reached by most arbitrary unfolded positions for that protein. (Optimal is definitely NP, but I suspect feasible is too.)
  5. Others. (Here, the closest I get to molecular nanotech is 'catalyze reactions,' but I imagine the space for 'build a protein that looks like X' might actually be smaller.)

It looks to me like the problems here that have significant returns are NP.

It's not at all clear to me what you mean by this. I mean, take the traveling salesman problem. It's NP-Hard*, but you can get decent solutions by using genetic algorithms to breed solutions given feasible initial solutions. Most improvements to the route will be introduced by mutations, and yet the problem is still NP-hard. That is, it's not clear to me that you're differentiating between the problem of finding an optimal solution being NP hard, it taking NP time to find a 'decent' solution, and an algorithm requiring NP time to finish running. (The second is rarely true for things like the traveling salesman problem, but is often true for practical problems where you throw in tons of constraints.)

* A variant is NP-Complete, which is what I originally wrote.
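The traveling-salesman remark can be illustrated with a few lines of code. To be plain about the substitution: this is not a genetic algorithm with a population and crossover, as the comment mentions, but the simpler mutation-only hill climb (reverse a random segment, keep the change if the tour gets shorter). The city coordinates, step count, and seeds are arbitrary assumptions.

```python
import math
import random


def tour_length(points: list[tuple[float, float]], order: list[int]) -> float:
    """Length of the closed tour that visits points in the given order."""
    return sum(math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
               for i in range(len(order)))


def mutate(order: list[int], rng: random.Random) -> list[int]:
    """Return a copy of the tour with one random segment reversed."""
    i, j = sorted(rng.sample(range(len(order)), 2))
    return order[:i] + order[i:j + 1][::-1] + order[j + 1:]


def improve_by_mutation(points: list[tuple[float, float]],
                        steps: int = 2000, seed: int = 0):
    """Hill climb: accept a mutation only if it shortens the tour.
    Returns (final order, starting length, final length)."""
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)
    start = best = tour_length(points, order)
    for _ in range(steps):
        candidate = mutate(order, rng)
        length = tour_length(points, candidate)
        if length < best:
            order, best = candidate, length
    return order, start, best


# 30 hypothetical cities scattered in the unit square.
_city_rng = random.Random(42)
points = [(_city_rng.random(), _city_rng.random()) for _ in range(30)]
order, start, best = improve_by_mutation(points)

if __name__ == "__main__":
    print(f"random tour: {start:.2f}, after mutations: {best:.2f}")
```

The procedure gives no guarantee of optimality, which is where the NP-hardness lives, yet single mutations reliably find shorter tours. That is the distinction the comment draws between an optimal solution being hard to find and a decent one being cheap.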
Eliezer Yudkowsky
Nothing that has physically happened on Earth in real life, such as proteins folding inside a cell, or the evolution of new enzymes, or hominid brains solving problems, or whatever, can have been NP-hard. Period. It could be a physical event that you choose to regard as a P-approximation to a theoretical problem whose optimal solution would be NP-hard, but so what, that wouldn't have anything to do with what physically happened. It would take unknown, exotic physics to have anything NP-hard physically happen. Anything that could not plausibly have involved black holes rotating at half the speed of light to produce closed timelike curves, or whatever, cannot have plausibly involved NP-hard problems. NP-hard = "did not physically happen". "Physically happened" = not NP-hard.

Nothing that has physically happened on Earth in real life, such as proteins folding inside a cell, or the evolution of new enzymes, or hominid brains solving problems, or whatever, can have been NP-hard. Period.

I've seen you say this a couple of times, and your interlocutors seem to understand you, even when they dispute your conclusion. But my brain keeps returning an error when I try to parse your claim.

Read literally, "NP-hard" is not a predicate that can be meaningfully applied to individual events. So, in that sense, trivially, nothing that happens (physically or otherwise, if "physically" is doing any work here) can be NP-hard. But you are evidently not making such a trivial claim.

So, what would it look like if the physical universe "solved an NP-hard problem"? Presumably it wouldn't just mean that some actual salesman found a way to use existing airline routes to visit a bunch of pre-specified cities without revisiting any one of them. Presumably it wouldn't just mean that someone built a computer that implements a brute-force exhaustive search for a solution to the traveling salesman problem given an arbitrary graph (a search that the computer will never finish before the heat death of the universe if the example is large). But I can't think of any other interpretation to give to your claim.

ETA: this is a side point. Here's Scott Aaronson describing people (university professors in computer science and cognitive science at RPI) who claim that the physical universe efficiently solves NP-hard problems: In other news, Bringsjord also claims to show by a modal argument, similar to the theistic modal argument (which he also endorses), that human brains are capable of hypercomputation: "it's possible humans are capable of hypercomputation, so they are capable of hypercomputation." For this reason he argues that superhumanly intelligent Turing machines/Von Neumann computers are impossible and belief in their possibility is fideistic.
This doesn't refute what you are responding to. Saying the universe can't solve a general NP problem in polynomial time is not the same thing as saying the universe cannot possibly solve specific instances of generally NP-complete problems, which is Tyrrell_McAllister's point, as far as I can parse. In general, the traveling salesman problem is NP-complete; however, there are lots of cases where heuristics get the job done in polynomial time, even if those heuristics would run away if they were given the wrong case. To use Aaronson's soap bubbles, sometimes the soap bubble finds a Steiner tree, sometimes it doesn't. When it DOES, it has solved one instance of an NP-complete problem fairly quickly.
I agree with your parse error. It looks like EY has moved away from the claim made in the grandparent, though.

That seems a little strongly put - NP-hard scales very poorly, so no real process can take N up to large numbers. I can solve the traveling salesman problem in my head with only modest effort if there are only 4 stops. And it's trivial if there are 2 or 3 stops.
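As a small illustration of the 4-stop case, the sketch below solves a traveling salesman instance by trying every ordering. The coordinates (corners of a unit square) and the function name are hypothetical. With the start fixed, four stops leave only 3! = 6 orderings to check.

```python
import itertools
import math

def brute_force_tsp(cities):
    """Return (length, tour) of the shortest closed tour, by trying every order."""
    first, rest = 0, range(1, len(cities))
    best = None
    for order in itertools.permutations(rest):
        tour = (first,) + order
        length = sum(
            math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
            for i in range(len(tour))
        )
        if best is None or length < best[0]:
            best = (length, tour)
    return best

# Four stops on the corners of a unit square (hypothetical coordinates).
stops = [(0, 0), (1, 1), (1, 0), (0, 1)]
length, tour = brute_force_tsp(stops)
print(length, tour)
```

The same function is correct for any number of stops; it just becomes impractical quickly, because the number of orderings grows factorially.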

Um... doesn't it take exponential time in order to simulate quantum mechanics on a classical computer?
Yes (At least that's the general consensus among complexity theorists, though it hasn't been proved.) This doesn't contradict anything Eliezer said in the grandparent. The following are all consensus-but-not-proved:

P⊂BQP⊂EXP
P⊂NP⊂EXP
BQP≠NP (Neither is confidently predicted to be a subset of the other, though BQP⊂NP is at least plausible, while NP⊆BQP is not.)

If you don't measure any distinctions finer than P vs EXP, then you're using a ridiculously coarse scale. There are lots of complexity classes strictly between P and EXP, defined by limiting resources other than time-on-a-classical-computer. Some of them are tractable under our physics and some aren't.
Is that just that we don’t know any better algorithms, or is there a proof that exptime is needed?
I really don't know; some Wikipedia browsing suggests that there's a proof, but I'd rather have a statement from someone who actually knows.
I don't understand why you think new physics is required to solve hard instances of NP-complete problems. We routinely solve the hard instances of NP-hard problems in practice on computers -- just not on large instances of the problem. New physics might be required to solve those problems quickly, but if you are willing to wait exponentially long, you can solve the problems just fine. If you want to argue that actual practical biological folding of proteins isn't NP-hard, the argument can't start from "it happens quickly" -- you need to say something about how the time to fold scales with the length of the amino acid strings, and in particular in the limit for very large strings. Similarly, I don't see why biological optimization couldn't have solved hard cases of NP-complete problems. If you wait long enough for evolution to do its thing, the result could be equivalent to an exhaustive search. No new physics required.
Eliezer already conceded that trivial instances of such problems can be solved. (We can assume that before he made that concession he thought it went without saying.) The physics and engineering required to last sufficiently long may be challenging. I hear it gets harder to power computers once the stars have long since burned out. As far as I know the physics isn't settled yet. (In other words, I am suggesting that "just fine" is something of an overstatement when it comes to solving seriously difficult problems by brute force.)
That counterargument is a bit too general, since it applies not only to NP problems, but even to P problems (such as deciding whether a number is the GCD of two other numbers), or even any arbitrary algorithm modified by a few lines of code such that its result is unaffected, merely delayed until after the stars have burned out, or whatever limit we postulate. For NP problems and e.g. P problems both, given how we understand the universe, there is only a finite number of inputs in both cases which are tractable, and an infinite number of inputs which aren't. Though the finite number is well different for both, as a fraction of all "possible", or rather well-defined (let's avoid that ambiguity cliff) inputs, it would be the same. Cue "We all live in a Finite State Machine, Finite State Machine, Finite State Machine ..."
The point can't be confined to "trivial instances". For any NP-complete problem on some reasonable computing platform that can solve small instances quickly, there will be instance sizes that are non-trivial (take appreciable time to solve) but do not require eons to solve. There is absolutely no mathematical reason for assuming that for "natural" NP-complete problems, interesting-sized instances can't be solved on a timescale of months/years/centuries by natural processes. The dichotomy between "trivial" and "impossible to solve in a useful time-frame" is a false one.
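The middle ground between "trivial" and "impossible" can be seen with a back-of-the-envelope estimate. The sketch below assumes naive enumeration of all distinct closed tours and an assumed checking rate of 10^9 tours per second; both are illustrative choices, not claims about any real solver (smarter exact algorithms do much better than enumeration).

```python
import math

TOURS_PER_SECOND = 1e9          # assumed checking rate, purely illustrative
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def brute_force_seconds(n_stops, rate=TOURS_PER_SECOND):
    """Seconds to enumerate all distinct closed tours of n_stops cities."""
    # Fix the start city and ignore direction: (n - 1)! / 2 distinct tours.
    distinct_tours = math.factorial(n_stops - 1) // 2
    return distinct_tours / rate

for n in (10, 15, 20, 25):
    secs = brute_force_seconds(n)
    print(f"{n:>2} stops: {secs:.3g} s  ({secs / SECONDS_PER_YEAR:.3g} years)")
```

Under these assumptions 15 stops take under a minute and 20 stops take roughly two years, while 25 stops take millions of years. So there is a band of instance sizes that are neither instantaneous nor hopeless.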
Presumably quantum suicide is a part of "whatever".
It occurs to me that we can't really say, since we only have access to the time of the program, which may or may not reflect the actual computational resources expended. Imagine you were living in a game, and trying to judge the game's hardware requirements. If you did that by looking at a clock in the game, you'd need to assume that the clock is synchronized to the actual system time. If you had a counter you increased, you wouldn't be able to say from inside the program every which step you get to that counter++ instruction. The problem being that we don't have access to anything external, we aren't watching the Turing Machine compute, we are inside the Turing Machine "watching" the effects of other parts of the program, such as a folding protein (observing whenever it's our turn to be simulated). We don't, however, see the Turing Machine compute, we only see the output. The raw computing power / requirements "behind the scenes", even if such a behind the scenes is only a non-existent abstraction, is impossible to judge with certainty, similar to a map-territory divide. Since there is no access in principle, we cannot observe anything but the "output", we have no way of verifying any assumptions about a correspondence between "game timer" and "system timer" we may make, or of devising any experiments. Even the recently linked "test the computational limits" doesn't break the barrier, since for all we know the program may stall, and the next "frame" it outputs may still seem consistent, with no stalling, when viewed from inside the program, which we are. We wouldn't subjectively realize the stall. If such an experiment did find something, it would be akin to a bug, not to a measurement of computational resources expended. Back to diapers.
That's a valid point, but it does presuppose exotic new physics to make that substrate, in which "our" time passes arbitrarily slowly compared to the really real time, so that it can solve NP-hard problems between our clock ticks. We would, in effect be in a simulation. Evidence of NP-hard problems actually being solved in P could be taken as evidence that we are in one.
If we assume that protein folding occurs according to the laws of quantum mechanics, then it shouldn't tell us anything about the computational complexity of our universe besides what quantum mechanics tells us, right?
Well, yea that's what I'm leaning towards. The laws of physics themselves need not govern the machine (Turing or otherwise), they are effects we observe, us being other effects. The laws of physics and the observers both are part of the output. Like playing an online roleplaying game and inferring what the program can actually do or what resources it takes, when all you can access is "how high can my character jump" and other in-game rules. The rules regarding the jumping, and any limits the program chose to confer to the jumping behavior are not indicative of the resource requirements and efficiency of the underlying system. Is calculating the jumping easy or hard for the computer? How would you know as a character? The output, again, is a bad judge, take this example: Imagine using an old Intel 386 system which you rigged into running the latest FPS shooter. It may only output one frame every few hours, but as a sentient character inside that game you wouldn't notice. Things would be "smooth" for you because the rules would be unchanged from your point of view. We can only say that given our knowledge of the laws of physics, the TM running the universe doesn't output anything which seems like an efficient NP-problem solver, whether the program contains one, or the correct hardware abstraction running it uses one, is anyone's guess. (The "contains one" probably isn't anyone's guess because of Occam's Razor considerations.) If this is all confused (it may well be, was mostly a stray thought), I'd appreciate a refutation.
If I understand correctly you're saying that what is efficiently computable within a universe is not necessarily the same as what is efficiently computable on a computer simulating that universe. That is a good point.
Exactly. Thanks for succinctly expressing my point better than I could. The question is whether assuming a correspondence as a somewhat default case (implied by the "not necessarily") is even a good default assumption. Why would the rules inherent in what we see inside the universe be any more indicative of the rules of the computer simulating that universe than the rules inside a computer game are reflective of the instruction set of the CPU running it (they are not)? I am aware that the reference class "computer running Super Mario Bros. / Kirby's Dream Land" implies that the rules are different, but on what basis would we choose any reference class which implies a correspondence? Also, I'm not advocating simulationism with this per se; the "outer" computer can be strictly an abstraction.
This does not follow. It may be that finding new useful proteins just takes a very long time and is very inefficient. The rest of your comment seems correct though.
Eliezer Yudkowsky
Then evolution wouldn't happen in real life. Actually, even that understates the argument. If you can take a 20,000 base sequence and get something useful by point-perturbing at least one of the 20,000 bases in the sequence, then whatever just happened was 50 lightyears from being NP-hard - you only had to search through 19 variants on each of 20,000 cases.
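The size gap in that argument can be checked with a few lines of arithmetic. The sketch below takes the comment's own figures (20,000 positions, 19 alternatives at each, i.e. an assumed alphabet of 20 symbols) and compares the single-point-mutation neighborhood with the full sequence space; the function names are made up for illustration.

```python
import math

def single_point_neighbors(length, alphabet_size):
    """Number of sequences differing from a given one at exactly one position."""
    return length * (alphabet_size - 1)

def log10_sequence_space(length, alphabet_size):
    """log10 of the number of all sequences of this length."""
    return length * math.log10(alphabet_size)

L, A = 20_000, 20   # figures from the comment: 20,000 positions, 19 alternatives each
print(single_point_neighbors(L, A))          # 380000
print(round(log10_sequence_space(L, A)))     # 26021
```

A search that only ever inspects the 380,000 one-step neighbors is linear in the sequence length, whereas the full space has about 10^26021 members. Whether the neighborhood contains anything useful is a separate, empirical question, which is what the replies go on to dispute.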
Huh? How does this argument work? That doesn't mean that evolution can't happen in real life, it would be a reason to think that evolution is very slow (which it is!) or that evolution is missing a lot of interesting proteins (which seems plausible). I'm not sure I follow your logic. Are you arguing that because log 20,000 <19? Yes, you can check every possible position in a base sequence this way, but there are still a lot more proteins than those 19. One doesn't get something special from just changing a specific base. Moreover, even if something interesting does happen for changing a specific one, it might not happen if one changes some other one.
Definitely, since evolution keeps introducing new interesting proteins. But it's not slow on a scale of e^n for even modestly large n. If you can produce millions of proteins with hundreds to thousands of amino acids in a few billion years, then approximate search for useful proteins is not inefficient like finding the lowest-energy conformation is (maybe polynomial approximation, or the base is much better, or functional chunking lets you effectively reduce n greatly...).
Wait, the fact that evolution is often introducing interesting new proteins is evidence that evolution is missing a lot of interesting proteins? How does that follow? Switch the scenario around: if evolution never produced interesting new proteins (anymore, after time T), would that be evidence that there are no other interesting proteins than what evolution produced?

Switch the scenario around: if evolution never produced interesting new proteins (anymore, after time T), would that be evidence that there are no other interesting proteins than what evolution produced?


That would be evidence that the supply of interesting proteins had been exhausted, just as computer performance at tic-tac-toe and checkers has stopped improving. I don't see where you're coming from here.

Because evolution can't get stuck in the domain of attraction of a local optimum? It always finds any good points? Edit to add: Intelligent humans can quickly refactor their programs out of poor regions of designspace. Evolution must grope within its neighborhood. 2nd Edit: How about this argument: "Evolution has stopped producing interesting new ways of flying; therefore, there are probably no other interesting ways of accomplishing flight, since after all, if there were a good way of doing it, evolution would find it."
Point mutations aren't the only way for new things to be produced. You can also recombine large chunks and domains together from multiple previous genes. Hell, there are even examples of genes evolving via a frame-shift that knocks the 3-base frame of a gene off by one, producing a gobbledygook protein that selection then acts upon...
Carl wasn't commenting on whether it would be very strong evidence but whether it would be evidence.
Yes, we can be definitely confident that there are more interesting proteins in the vicinity because of continuing production. We have less evidence about more distant extrapolations, although they could exist too.
That makes a lot more sense. It's just that, from the context, you seemed to be making a claim about evolution's ability to find all cool proteins, rather than just the ones within organisms' local search neighborhood (which would thus be within evolution's reach). Hence why you appeared, from my reading, to be making the common mistake of attributing intelligence (and global search capabilities) to evolution, which it definitely does not have. This insinuation was compounded by your comparison to human-intelligence-designed game algorithms, further making it sound like you attributed excessive search capability to evolution. (And I'm a little scared, to be honest, that the linked comment got several upvotes.) If you actually recognize the different search capabilities of evolution versus more intelligent algorithms, I suggest you retract, or significantly revise, the linked comment.
Eliezer Yudkowsky
For various values of "solve". NP-hard problems may not be analytically solvable, but numerical approximations can get pretty darn close, certainly enough for evolutionary advantage.
Is this your complete response? I guess I could expand this to "I expect all the problems an AI needs to solve on the way to an intelligence explosion to be easy in principle but hard in practice," and I guess I could expand your other comments to "the problem sizes an AI will need to deal with are small enough that asymptotic statements about difficulty won't come into play." Both of these claims seem like they require justification.
Eliezer Yudkowsky
It's not meant as a response to everything, just noting that protein structure prediction can't be NP-hard. More generally, I tend to take P!=NP as a background assumption; I can't say I've worried too much about how the universe would look different if P=NP. I never thought superintelligences could solve NP-hard problems to begin with, since they're made out of wavefunction and quantum mechanics can't do that. My model of an intelligence explosion just doesn't include anyone trying to do anything NP-hard at any point, unless it's in the trivial sense of doing it for N=20 or something. Since I already expect things to local FOOM with P!=NP, adding P=NP doesn't seem to change much, even if the polynomial itself is small. Though Scott Aaronson seems to think there'd be long-term fun-theoretic problems because it would make so many challenges uninteresting. :)

Some review notes as I go through it (at a bright dilettante level):

Section 1:

  • I wonder if the chain-reaction model is a good one for recursive self-improvement, or is it just the scariest one? What other models have been investigated? For example, the chain-reaction model of financial investment would result in a single entity with the highest return rate dominating the Earth, this has not happened yet, to my knowledge.

Section 1.3:

  • There was a recent argument here by David Pearce, I think, that an intelligent enough paperclip maximizer will have to self-modify to be more "humane". If I recall correctly, the logic was that in the process of searching the space of optimization options it will necessarily encounter an imperative against suffering or something to that effect, inevitably resulting in modifying its goal system to be more compassionate, the way humanity seems to be evolving. This would restrict the Orthogonality Thesis to the initial takeoff, and result in goal convergence later on. While this seems like wishful thinking, it might be worth addressing in some detail, beyond the footnote 11.

Chapter 2:

  • log(n) + log(log(n)) + ... seems to describe well the current rate of scientific progress, at least in high-energy physics

For example, the chain-reaction model of financial investment would result in a single entity with the highest return rate dominating the Earth, this has not happened yet, to my knowledge.

Like... humans? Or the way that medieval moneylenders aren't around anymore, and a different type of financial organization seems to have taken over the world instead? See also the discussion of China catching up to Australia.

Fair point about human domination. Though I'm not sure how it fits into the chain reaction model. Maybe reinvestment of knowledge into more knowledge does, not intelligence into more intelligence. As for financial investments, I don't know of any organization emerging as a singleton.

If I recall correctly, the logic was that in the process of searching the space of optimization options it will necessarily encounter an imperative against suffering or something to that effect, inevitably resulting in modifying its goal system to be more compassionate, the way humanity seems to be evolving.

I see no reason to suspect the space of optimization options contains value imperatives, assuming the AI is guarded against the equivalent of SQL injection attacks.

Humanity seems to be evolving towards compassion because the causal factors increasing compassion are on average profitable for individual humans with those factors. The easy example of this is stable, strong police forces routinely hanging murderers, instead of those murderers profiting from their actions. If you don't have an analogue of the police, then you shouldn't expect the analogue of the reduction in murders.

(I should remark that I very much like the way this report is focused; I think that trying to discuss causal models explicitly is much better than trying to make surface-level analogies.)

  • empty space for a meditation seems out of place in a more-or-less formal paper

At the very least, using a page break rather than a bunch of ellipses seems better.

I was simply paraphrasing David Pearce, it's not my opinion, so no point arguing with me. That said, your argument seems misdirected in another way: the imperative against suffering applies to people and animals whose welfare is not in any way beneficial and sometimes even detrimental to those exhibiting compassion.
Yeah, but they are losing compassion for other things (unborn babies, gods, etc...). What reason is there to believe there is a net gain in compassion, rather than simply a shift in the things to be compassionate towards? EDIT: This should have been directed towards Vaniver rather than shminux.
An expanding circle of empathetic concern needn't reflect a net gain in compassion. Naively, one might imagine that e.g. vegans are more compassionate than vegetarians. But I know of no evidence this is the case. Tellingly, female vegetarians outnumber male vegetarians by around 2:1, but the ratio of male to female vegans is roughly equal. So an expanding circle may reflect our reduced tolerance of inconsistency / cognitive dissonance. Men are more likely to be utilitarian hyper-systematisers.
Does your source distinguish between motivations for vegetarianism? It's plausible that the male:female vegetarianism rates are instead motivated by (e.g.) culture-linked diet concerns -- women adopt restricted diets of all types significantly more than men -- and that ethically motivated vegetarianism occurs at similar rates, or that self-justifying ethics tend to evolve after the fact.
Nornagest, fair point. See too "The Brain Functional Networks Associated to Human and Animal Suffering Differ among Omnivores, Vegetarians and Vegans" :
Right. What I should have said was:
The growth of science has led to a decline in animism. So in one sense, our sphere of concern has narrowed. But within the sphere of sentience, I think Singer and Pinker are broadly correct. Also, utopian technology makes even the weakest forms of benevolence vastly more effective. Consider, say, vaccination. Even if, pessimistically, one doesn't foresee any net growth in empathetic concern, technology increasingly makes the costs of benevolence trivial. [Once again, I'm not addressing here the prospect of hypothetical paperclippers - just mind-reading humans with a pain-pleasure (dis)value axis.]
Would this be the same Singer who argues that there's nothing wrong with infanticide?
On (indirect) utilitarian grounds, we may make a strong case that enshrining the sanctity of life in law will lead to better consequences than legalising infanticide. So I disagree with Singer here. But I'm not sure Singer's willingness to defend infanticide as (sometimes) the lesser evil is a counterexample to the broad sweep of the generalisation of the expanding circle. We're not talking about some Iron Law of Moral Progress.
If I recall correctly Singer's defense is that it's better to kill infants than have them grow up with disabilities. The logic here relies on excluding infants and to a certain extent people with disabilities from our circle of compassion. You may want to look at gwern's essay on the subject. By the time you finish taking into account all the counterexamples your generalization looks more like a case of cherry-picking examples.
Eugine, are you doing Peter Singer justice? What motivates Singer's position isn't a range of empathetic concern that's stunted in comparison to people who favour the universal sanctity of human life. Rather it's a different conception of the threshold below which a life is not worth living. We find similar debates over the so-called "Logic of the Larder" for factory-farmed non-human animals: Actually, one may agree with Singer - both his utilitarian ethics and bleak diagnosis of some human and nonhuman lives - and still argue against his policy prescriptions on indirect utilitarian grounds. But this would take us far afield.
By this logic most of the people from the past who Singer and Pinker cite as examples of less empathic individuals aren't less empathic either. But seriously, has Singer made any effort to take into account, or even look at, the preferences of any of the people who he claims have lives that aren't worth living?
I disagree with Peter Singer here. So I'm not best placed to argue his position. But Singer is acutely sensitive to the potential risks of any notion of lives not worth living. Recall Singer lost three of his grandparents in the Holocaust. Let's just say it's not obvious that an incurable victim of, say, infantile Tay–Sachs disease, who is going to die around four years old after a chronic pain-ridden existence, is better off alive. We can't ask this question to the victim: the nature of the disorder means s/he is not cognitively competent to understand the question. Either way, the case for the expanding circle doesn't depend on an alleged growth in empathy per se. If, as I think quite likely, we eventually enlarge our sphere of concern to the well-being of all sentience, this outcome may owe as much to the trait of high-AQ hyper-systematising as any widening or deepening compassion. By way of example, consider the work of Bill Gates in cost-effective investments in global health (vaccinations etc) and indeed in: ("the future of meat is vegan"). Not even his greatest admirers would describe Gates as unusually empathetic. But he is unusually rational - and the growth in secular scientific rationalism looks set to continue.
I'm not sure what you mean by "sensitive", it certainly doesn't stop him from being at the cutting edge pushing in that direction. You seem to be confusing expanding the circle of beings we care for and being more efficient in providing that caring.
Cruelty-free in vitro meat can potentially replace the flesh of all sentient beings currently used for food. Yes, it's more efficient; it also makes high-tech Jainism less of a pipedream.
As I understand the common arguments for legalizing infanticide, it involves weighting the preferences of the parents and society more - not a complete discounting of the infant's preferences.
Try replacing "infanticide" (and "infant's") in that sentence with "killing Jews" or "enslaving Blacks". Would you also argue that it's not excluding Jews or Blacks from the circle of compassion?
It seems like a silly question. Practically everyone discounts the preferences of the very young. They can't vote, and below some age, are widely agreed to have practically no human rights, and are generally eligible for death on parental whim.
Well, the same applies even more strongly to animals, but the people arguing for the "expanding circle of compassion" idea like to cite vegetarianism as an example of this phenomenon.
Well, sure, but adult human females have preferences too, and they are quite significant ones. An "expanding circle of compassion" doesn't necessarily imply equal weights for everyone.
So did slave owners. At the point where A's inconvenience justifies B's being killed you've effectively generalized the "expanding circle of compassion" idea into meaninglessness.
Sure. Singer's obviously right about the "expanding circle" - it's a real phenomenon. If A is a human and B is a radish, A killing B doesn't seem too awful. Singer claims newborns are rather like that - in being too young to have much in the way of preferences worthy of respect.
Um, this is precisely the point of disagreement, and given that your next sentence is about the position that babies have the moral worth of radishes I don't see how you can assert that with a straight face.
I didn't know that. I normally take this for granted. Some conventional cites on the topic are: Singer and Dawkins.
I find it really weird that I don't recall having seen that piece of rhetoric before. (ETA: Argh, dangerously close to politics here. Retracting this comment.)
Eliezer Yudkowsky
I wish I could upvote your retraction.

The closest thing I have seen to this sort of idea is this:

Wow, an excellent essay! If I remember correctly, I started thinking along these lines after hearing Robert Garland lecture on ancient Egyptian religion. As a side-note to a discussion about how they had little sympathy for the plight of slaves and those in the lower classes of society (since this was all part of the eternal cosmic order and as it should be), he mentioned that they would likely think that we are the cruel ones, since we don't even bother to feed and clothe the gods, let alone worship them (and the gods, of course, are even more important than mere humans, making our lack of concern all the more horrible).
Any idea where Garland might've written that up? All the books listed in your link sound like they'd be on Greece, not Egypt.
It was definitely a lecture, not a book. Maybe I'll track it down when I get around to Ankifying my Ancient Egypt notes.
It seems beneficial to make sure my understanding of why Pearce's argument fails matches that of others, even if I don't need to convince you that it fails. I interpret imperatives as "you should X," where the operative word is the "should," even if the content is the "X." It is not at all obvious to me why Pearce expects the "should" to be convincing to a paperclipper. That is, I don't think there is a logical argument from arbitrary premises to adopt a preference for not harming beings that can feel pain, even though the paperclipper may imagine a large number of unconvincing logical arguments whose conclusion is "don't harm beings that can feel pain if it is costless to avoid" on the way to accomplishing its goals.
Perhaps it's worth distinguishing the Convergence vs Orthogonality theses for: 1) biological minds with a pain-pleasure (dis)value axis. 2) hypothetical paperclippers. Unless we believe that the expanding circle of compassion is likely to contract, IMO a strong case can be made that rational agents will tend to phase out the biology of suffering in their forward light-cone. I'm assuming, controversially, that superintelligent biological posthumans will not be prey to the egocentric illusion that was fitness-enhancing on the African savannah. Hence the scientific view-from-nowhere, i.e. no arbitrarily privileged reference frames. But what about 2? I confess I still struggle with the notion of a superintelligent paperclipper. But if we grant that such a prospect is feasible and even probable, then I agree the Orthogonality thesis is most likely true.
As mentioned elsewhere in this thread, it's not obvious that the circle is actually expanding right now.
This reads to me as "unless we believe conclusion ~X, a strong case can be made for X," which makes me suspect that I made a parse error. This is a negative statement: "synthetic superintelligences will not have property A, because they did not come from the savanna." I don't think negative statements are as convincing as positive statements: "synthetic superintelligences will have property ~A, because ~A will be rewarded in the future more than A." I suspect that a moral "view from here" will be better at accumulating resources than a moral "view from nowhere," both now and in the future, for reasons I can elaborate on if they aren't obvious.
There is no guarantee that greater perspective-taking capacity will be matched with equivalent action. But presumably greater empathetic concern makes such action more likely. [cf. Steven Pinker's "The Better Angels of Our Nature". Pinker aptly chronicles e.g. the growth in consideration of the interests of nonhuman animals; but this greater concern hasn't (yet) led to an end to the growth of factory-farming. In practice, I suspect in vitro meat will be the game-changer.] The attributes of superintelligence? Well, the growth of scientific knowledge has been paralleled by a growth in awareness - and partial correction - of all sorts of cognitive biases that were fitness-enhancing in the ancestral environment of adaptedness. Extrapolating, I was assuming that full-spectrum superintelligences would be capable of accessing and impartially weighing all possible first-person perspectives and acting accordingly. But I'm making a lot of contestable assumptions here. And see too the perils of:

- log(n) + log(log(n)) + ... seems to describe well the current rate of scientific progress, at least in high-energy physics

I'm going to commit pedantry: nesting enough logarithms eventually gives an undefined term (unless n's complex!). So where Eliezer says "the sequence log(w) + log(log(w)) + log(log(log(w))) will converge very quickly" (p. 4), that seems wrong, although I see what he's getting at.


It really bothers me that he calls it a sequence instead of a series (maybe he means the sequence of partial sums?), and that it's not written correctly.

The series doesn't converge because log(w) doesn't have a fixed point at zero.

It makes sense if you replace log(w) with log^+(w) = max{ log(w), 0 }, which is sometimes written as log(w) in computer science papers where the behavior on (0, 1] is irrelevant.

I suppose that amounts to assuming there's some threshold of cognitive work under which no gains in performance can be made, which seems reasonable.
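To make the pedantry concrete, here is a minimal Python sketch (mine, not from the paper) of both readings. `nested_logs` iterates the natural log until the next term would be undefined; `log_plus_sum` adds up the clamped terms, taking log^+(x) = 0 for every x <= 1, which is an assumed convention that also covers x = 0. The function names are hypothetical.

```python
import math


def nested_logs(w):
    """Return [log(w), log(log(w)), ...], stopping once a term is <= 0,
    because the next logarithm would be undefined over the reals."""
    terms = []
    x = w
    while x > 0:
        x = math.log(x)
        terms.append(x)
    return terms


def log_plus(x):
    """log^+(x) = max(log(x), 0); assumed here to be 0 for all x <= 1."""
    return math.log(x) if x > 1 else 0.0


def log_plus_sum(w):
    """Sum log^+(w) + log^+(log^+(w)) + ...; every term after the first zero is zero."""
    total = 0.0
    x = log_plus(w)
    while x > 0:
        total += x
        x = log_plus(x)
    return total


if __name__ == "__main__":
    print(nested_logs(10.0))   # the last term is negative, so the chain stops there
    print(log_plus_sum(10.0))  # only the positive terms contribute
```

For w = 10 the raw chain has three terms and the third is negative, so a fourth cannot be taken; the clamped sum keeps only the first two.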

Eliezer Yudkowsky
Now fixed, I hope.
Oh yes. That makes far more sense. Thanks for fixing it.
Eliezer Yudkowsky
Since this apparently bothers people, I'll try to fix it at some point. A more faithful statement would be that we start by investing work w, get a return w2 ~ log(w), reinvest it to get a new return log(w + w2) - log(w) = log ((w+w2)/w). Even more faithful to the same spirit of later arguments would be that we have y' ~ log(y) which is going to give you basically the same growth as y' = constant, i.e., whatever rate of work output you had at the beginning, it's not going to increase significantly as a result of reinvesting all that work. I'm not sure how to write either more faithful version so that the concept is immediately clear to the reader who does not pause to do differential equations in their head (even if simple ones).
Well, suppose cognitive power (in the sense of amount of cognitive work per unit time) is a function of total effort invested so far, like P=1-e^(-w). Then it's obvious that while dP/dw= e^(-w) is always positive, it rapidly decreases to basically zero, and total cognitive power converges to some theoretical maximum.
This is in the context of reinvesting dividends of cognitive work, assuming it takes exponentially greater investments to produce linearly greater returns. For example, maybe we get a return of log(X) cognitive work per time with what we have now, and to get returns of log(X+k) per time we need to have invested X+k cognitive work. What does it look like to reinvest all of our dividends? After dt, we have invested X+log(X) and our new return is log(X+log(X)). After 2dt, we have invested X+log(X)+log(X+log(X)), etc. The corrected paragraph would then look like: Except then it's not at all clear that the series converges quickly. Let's check... we could say the capital over time is f(t), with f(0)=w, and the derivative at t is f'(t)=log(f(t)). Then our capital over time is f(t)=li^(-1)(t+li(w)). This makes our capital / log-capital approximately linear, so our capital is superlinear, but not exponential.
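A quick numerical check of the f'(t) = log(f(t)) model, as a sketch rather than anything from the paper: forward-Euler integration in Python, with the step size `dt` and the function names `reinvest` and `baseline` being my own assumptions. It compares the result against the capital you would have if the rate of return stayed fixed at log(w).

```python
import math


def reinvest(w, t_end, dt=0.01):
    """Forward-Euler estimate of f(t_end) for f'(t) = log(f(t)), f(0) = w.

    Assumes w > 1 so that the return log(f) is positive throughout.
    """
    f = w
    for _ in range(int(round(t_end / dt))):
        f += math.log(f) * dt
    return f


def baseline(w, t_end):
    """Capital at t_end if the rate of return stayed fixed at log(w)."""
    return w + t_end * math.log(w)


if __name__ == "__main__":
    for t in (10.0, 100.0, 1000.0):
        print(t, reinvest(100.0, t), baseline(100.0, t))
```

With reinvestment the capital pulls ahead of the fixed-rate line, but only slowly, which is consistent with the "superlinear, but not exponential" conclusion above.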
I wonder if he meant w + (w+log(w)) + (w+log(w)+log(w+log(w))) + ...?

The discussion of the Yudkowsky-Hanson debate feels rather out of place. The points made are mostly highly relevant to the paper; the fact that they were made during an online debate is less so; the particular language used by either side in that debate still less. This discussion is also particularly informal and blog-post-like (random example: footnote 30), which may or may not be a problem depending on the intended audience for the paper.

I'd recommend major reworking of this section, still addressing the same issues but no longer so concerned with what each party said, or thought, or was happy to concede during that particular debate.

I am glad to see this report. I've felt that MIRI was producing less cool stuff than I would've expected, but this looks like it will go a long way towards addressing that. I am revising my opinion of the organization upwards. I look forward to reading this, and commit to having done so by the end of this weekend.


Given human researchers of constant speed, computing speeds double every 18 months.

Human researchers, using top-of-the-line computers as assistants. I get the impression this matters more for chip design than litho-tool design, but it definitely helps for those too.

Humans have around four times the brain volume of chimpanzees, but the difference between us is probably mostly software algorithms.

Is 'software algorithms' the right phrase? I'd characterize the improvements more as firmware or hardware improvements. [edit] Later you use the phrase "cognitive algorithms," which I'm much happier with.

A more concrete example you can use to replace the handwaving: one of the big programming productivity boosters is a second monitor, which seems directly related to low human working memory. It's easy to imagine minds with superior working memory able to handle much more complicated models and tasks. (We indeed seem to see this diversity among humans.)

In particular, your later arguments on serial causal depth seem like they would benefit from explicitly considering working memory as well as speed.

Any lab that shuts down overnight so its researchers can sleep must

...

It's easy to imagine minds with superior working memory able to handle much more complicated models and tasks. [..] In particular, your later arguments on serial causal depth seem like they would benefit from explicitly considering working memory

Strong, albeit anecdotal, agreement.

Working memory capacity was a large part of what my stroke damaged, and in colloquial terms I was just stupid, relatively speaking, until that healed/retrained. I was fine when dealing with simple problems, but add even a second level of indirection and I just wasn't able to track. The effect is at least subjectively highly nonlinear.

Incidentally, I think this is the strongest argument against Egan's General Intelligence Theorem (or, alternatively, Deutsch's "Universal Explainer" argument from The Beginning of Infinity). Yes, humans could in theory come up with arbitrarily complex causal models, and that's sufficient to understand an arbitrarily complex causal system, but in practice, unaided humans are limited to rather simple models. Yes, we're very good at making use of aids (I'm reminded of how much writing helps thinking whenever I try to do a complicated calculation in my head), but those limitations represent a plausible way for meaningful superhuman intelligence to be possible.

I hope never to forget the glorious experience of re-inventing the concept of lists, about two weeks into my recovery. I suddenly became indescribably smarter.

In the same vein, I have been patiently awaiting the development of artificial working-memory cognitive buffers. As you say, for practical purposes this is superhuman intelligence.

Gaaah. I hate brain damage. Congratulations on your discovery, anyway.
Yeah, you and me both, brother.
Indeed. For me, that was the most glaring conceptual problem. That and attempting to predict the course of evolution with minimal reference to evolutionary theory. There is a literature on how cultural systems evolve. For a specific instance see this:

I am unconvinced by the argument that H. sapiens can't be right at a limit on brain size because some people have larger-than-average heads without their mothers being dead.

Presumably head size is partly determined by environmental factors outside genetic control, and presumably having your mother die in childbirth is a really big disadvantage, much worse than being slightly less intelligent. If that's so, then what should it look like if we, as a species, are hard against that wall? (Which I take to mean that any overall increase in head size would be bad even if being cleverer is a big advantage.) I suggest we'd see head sizes that are far enough away from outright disaster that, even given that random environmental variation, death in childbirth is still pretty rare, but not completely unknown. And, of course, that's just what we see; death in childbirth is very rare now, in prosperous advanced countries, but if you go back 100 years or look in less fortunate parts of the world it's not so rare at all.

This could be quantified, at least kinda. We could look at how the frequency of death in childbirth, in places without modern medical care, varies with head size (though controllin...

Alternately, the selection could in reality be applied more strongly to helplessness of the infant or shape of the pelvis rather than head size at birth.
I get the impression that deaths from childbirth don't, usually, come about because the head is too large, but because there are other complications: Breech births, especially.

A very succinct summary appears in Wikipedia (2013):

Citing Wikipedia in any kind of academic context is generally a bad idea, even if it's just for a summary.

On page 15, you write:

the Moore’s-like law for serial processor speeds broke down in 2004

No citation is given, but I found one: Fuller & Millett (2011). The paper includes this handy graph:

And also this one:

The book behind the paper includes lots more detail. Its introduction is quite cheery:
I have an engineer friend who has recently put forward the idea that computing technology is approaching becoming a 'mature' technology, like the automobile in the 1950s. It gets a job done and does it well; every change made after that point is a matter of small incremental tweaks. Yeah, you get twice the gas mileage now as you did then after a load of small changes with diminishing returns, but is it really all that different? Other friends of mine working as programmers have reacted favorably when I relayed this idea. Also, why should slower development of new applications for the computer industry kill the economy?

The discussion of Moore's law, faster engineers, hyperbolic growth, etc., seems to me to come close to an important point but not actually engage with it.

As the paper observes, a substantial part of the work of modern CPU design is already done by computers. So why don't we see hyperbolic rather than merely Moorean growth? One reason would be that as long as some fraction of the work, bounded away from zero, is done by human beings, you don't get the superexponential speedup, for obvious Amdahl-ish reasons. The human beings end up being the rate-limiting factor.
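The Amdahl-ish point can be sketched with the standard Amdahl's law formula. In this Python snippet `human_fraction` is the share of design work still done by people at constant speed and `computer_speedup` is how much faster the automated share has become; both names and the example numbers are illustrative assumptions, not figures from the paper.

```python
def overall_speedup(human_fraction, computer_speedup):
    """Amdahl's law: speedup of the whole task when only the automated
    share (1 - human_fraction) runs computer_speedup times faster."""
    automated_fraction = 1.0 - human_fraction
    return 1.0 / (human_fraction + automated_fraction / computer_speedup)


if __name__ == "__main__":
    for s in (2, 10, 1000, 10**9):
        # With 10% human work the result approaches, but never reaches, 1 / 0.10 = 10.
        print(s, overall_speedup(0.10, s))
```

However fast the computers get, the overall speedup stays below 1 / human_fraction, so a human share bounded away from zero caps the growth rate.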

Now suppose we take human beings out of the loop entirely. Is the whole thing now in the hands of the ever-speeding-up computers? Alas, no. When some new technology is developed that enables denser circuitry, Intel and their rivals have to actually build the fabs before reaping the benefits by making faster CPUs. And that building activity doesn't speed up exponentially, and indeed its cost increases rapidly from one generation to the next.

There are things that are purely a matter of clever design. For instance, some of the increase in speed of computer programs over the years has come not from the CPUs but from the compiler...

Section 5:
The initial effort to get some numerical models going could be overestimated, unless such models have been done already. At the very least, a small-scale effort can pinpoint the hard issues. This reminds me of the core-collapse supernova modeling: it was reasonably easy to get the explosion modeled, except for the ignition by the initial shock wave. We still don't know what exactly makes them go FOOM. Most models predict a fizzle instead of an explosion. This is likely just a surface analogy, but it might well be that a few months of summer stud...

Two quick notes on the current text: Kasparov was apparently reading the forum of the opposing players during his chess victory in Kasparov vs. The World, which doesn't quite invalidate the outcome as evidence but does weaken the strength of evidence for cognitive (non-)scaling with population. Also Garett Jones made some relevant remarks here which I should've cited in the discussion of how science scales with scientific population and invested money (or rather, how it doesn't scale).

Anyone know the details of Karpov vs. The World? (Here are more GM vs. World games; most involve both sides using stronger-than-human chess engines.)

Here is my takeaway from the report. It is not a summary, and some of the implications are mine.

The 4 Theses (conjectures, really):

  1. Inevitability of Intelligence (defined as cross-domain optimization power) Explosion due to recursive self-improvement
  2. Orthogonality (intelligence/rationality is preference-independent)
  3. Instrumental Convergence (most optimizers compete for the same resources)
  4. Complexity of Value (values are not easily formalizable, no 3 laws of robotics)
    if true, imply that AGI is an x-risk, because an AGI emerging in an ad hoc fashion wi
...

I'm also wondering about the estimated FOOM date of 2035 (presumably give or take a decade). Is there an explicit calculation of it, and hopefully the confidence intervals as well?

Eliezer Yudkowsky
Where does it say 2035 in the text? How did you get the impression that this was an estimation?
Maybe I misunderstood this passage:
Eliezer Yudkowsky
Hm. Thanks for pointing that out. Maybe I should remove the specific dates from there and just say we were 45 years apart. I think in a lot of ways trying to time the intelligence explosion is a huge distraction. An important probability distribution, but still a huge distraction.

Well, you mentioned on occasion that this date affects your resource allocation between CFAR and MIRI, so it might be a worthwhile exercise to make the calculation explicit and subject to scrutiny, if not in the report, then some place else.

Eliezer Yudkowsky
Fair point. We're still struggling to express things verbally, but yeah.

Thanks for adopting my suggestion to publish more on paperclip-production-relevant topics.

While I'm amused by your existence, the "novelty account" meme is quite virulent and has the potential to lower the signal-to-noise ratio in the comments if everyone starts doing this...
There are only so many contextually appropriate novelty names available. It wouldn't make much sense for someone to swoop in and begin posting in character as any random character. So far as I know, we've got Clippy, Voldemort and Quirinus Quirrel, and... one other, maybe? If people showed up and began noticeably posting in character as random vaguely (anti-)rationality-related characters (Spock, Kamina, Pinkie Pie), we'd have a problem. Fortunately, they don't.
Death. A few others only wrote one or two comments each.
While I'm amused by your account name, the "novelty account" meme is quite virulent and has the potential to lower the signal-to-noise ratio in the comments if everyone starts doing this...
That's what we said when Clippy first started being ridiculous years ago. Luckily, it constrained itself to a few vanity posts, and the phenomenon in general didn't really take off. Admittedly we're just mostly annoyed that people think we're such an account pretending to be an actual made-of-paper kind of machine, but so it goes.
What is the story behind the account name, then?
It's a favorite book of mine.


The first four or five paragraphs were just bloviation, and I stopped there.

I know you think you can get away with it in "popular education", but if you want to be taken seriously in technical discourse, then you need to rein in the pontification.

I don't disagree, but I also don't think this is correct. There are plenty of verbose mathematicians out there who spend too much time expounding on the philosophical merits of their approach, and they're taken seriously. You'll excuse me if I don't name names, though.

Most of those are people who have already earned it a bit by having major results to their credit.

Can you provide a few examples of sentences that you consider to be particularly empty of meaning?
The first three paragraphs. I think the fourth and fifth paragraphs (contra grandparent) are reasonable, but the biography of I. J. Good doesn't add much support to the thesis.

For lack of any huge problems discovered so far, moving this to Main.


It's 40,000 words, you say. How fast exactly do you expect any huge problems to be found?

If it's a huge problem like "I can't download" or "all these pages are blank" then relatively fast.


Hominid brain size has not been increasing for at least the past 100,000 years. In fact, the range is tighter and the median is lower for Homo sapiens vs. Homo neanderthalensis.

Given that information, how does this change your explanation of your data?

The most important brain developments in the genus have come during the time when brain size was not increasing. This means that size cannot be an explanatory variable.

Cheers, ZHD

Footnote 44 discusses Neanderthals having larger brains, so it's not new data.
Thank you Carl. I am having some difficulty navigating to that discussion. Can you provide a direct link?
It's the link at the top of the OP. Look on page 38 of the document (page numbers are at the bottom) to find footnote 44.
Thanks for the help! So this is the footnote: That appears to be circular reasoning. It only implies that "marginal fitness return on cognition" has leveled off if we define fitness as a function of brain size—we have no fitness measurement otherwise. My previous suggestion, that the most important brain developments in our genus are independent of brain size, needs an explanation with a much different anchor.
There is speculation that brain size decreased due to loss of olfactory and maybe other sensory parts of the brain after dogs took over those functions. See here.

The whole counter-, counter-counter- thing is very difficult to follow. I've seen both you and Dennett use conversations between imagined participants to present such arguments, which I find vastly more readable.

I'm going to nitpick on Section 3.8:

If there are several “hard steps” in the evolution of intelligence, then planets on which intelligent life does evolve should expect to see the hard steps spaced about equally across their history, regardless of each step’s relative difficulty. [...]

[...] [...]

[...] the time interval from Australopithecus to Homo sapiens is too short to be a plausible hard step.

I don't think this argument is valid. Assuming there's a last hard step, you'd expect intelligence to show up soon after it was made (because there's no more ...

I may have found a minor problem on page 50:

Better algorithms could decrease the serial burden of a thought and allow more thoughts to occur in serial rather than parallel

Shouldn't that be "allow more thoughts to occur in parallel rather than serial"? Turning a thought from multiple parallel sub-tasks to one serial task increases the serial burden of that thought, rather than decreasing it.

Eliezer Yudkowsky
The idea here is that you use parallelism to implement operations like caching which can decrease the number of serial steps required for a thought, so that more of them can occur one after another. In the simplest case, if you were already using a serial processor to emulate parallel computers, adding parallel power increases serial depth because you need no longer burn serialism to emulate parallelism.
Oh, so diverting serial processing cycles to get serial depth instead of getting half the depth over two independent tasks. I thought the sentence was saying something else entirely: that a better algorithm does the same thing except with higher serial depth over fewer processes.

Suppose I already believe that, because of computer science, neuroscience, etc, there will in the future be agents or coalitions of agents capable of outwitting human beings for control of the world, and that we can hope to shape the character of these future agents by working on topics like friendly AI. If I already believe that, do I need to read this?

It was fairly interesting, but for me more of an entertainment-budgeted item than a learning-budgeted item.

I have some questions about the math in the first couple pages, specifically the introduction of k. I'm not totally sure I follow exactly what's going on.

So, my assumption is that we're trying to model AI capacity as a function of investment, and I assume that we're modeling this as the integral of an exponential function of base k such that


with k held constant. The integral is necessary, I believe, to ensure that the derivative of C is positive in both the k < 1 and k > 1 scenarios. This I believe matches the example of the nuclear c...

The lock problem from 3.8: Suppose there were 2 locks, one with a uniform solving distribution 5 hours long, and one with a uniform solving distribution 10 hours long. Now suppose we make a new probability distribution where first we solve lock one, then lock two, in times X and Y. The probability is now (up to the time limits) X/10*Y/5. Hey look, symmetry!

Now suppose we condition on the total time being 1 hour. So X+Y=1. But there's still symmetry between X and Y. So yeah.
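A Monte Carlo sketch of the symmetry claim, under assumptions of my own: solve times are independent and uniform on [0, 5] and [0, 10] hours, and I condition on the total being at most 1 hour (rather than exactly 1) via rejection sampling. The function name, sample count, and seed are arbitrary.

```python
import random


def conditional_means(n_samples=400_000, limit=1.0, seed=0):
    """Mean solve time of each lock among trials where both locks
    together took at most `limit` hours."""
    rng = random.Random(seed)
    sum_x = sum_y = 0.0
    kept = 0
    for _ in range(n_samples):
        x = rng.uniform(0.0, 5.0)   # first lock, uniform over 5 hours
        y = rng.uniform(0.0, 10.0)  # second lock, uniform over 10 hours
        if x + y <= limit:
            sum_x += x
            sum_y += y
            kept += 1
    return sum_x / kept, sum_y / kept, kept


if __name__ == "__main__":
    mean_x, mean_y, kept = conditional_means()
    print(kept, mean_x, mean_y)
```

Both conditional means should come out near 1/3 of an hour, even though the second lock is nominally twice as hard, because the accepted (x, y) pairs are uniform over a triangle that is symmetric in x and y.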

When I read this segment, I was compelled to comment:

A key component of the debate between Robin Hanson and myself was the question of locality. Consider: If there are increasing returns on knowledge given constant human brains—this being the main assumption that many non-intelligence-explosion, general technological-hypergrowth models rely on, with said assumption seemingly well-supported by exponential technology-driven productivity growth—then why isn’t the leading human nation vastly ahead of the runner-up economy? Shouldn’t the economy with the mo

...

One point you don't address: While you justify the claim that intelligence is a real thing and can be compared, you don't explain why it would be measurable on a numerical scale. In particular, I don't see what "linear increase in intelligence" and "exponential increase in intelligence" mean and how they can be compared.

Stylistically, I agree with many of the other comments and I think this paper is unsuitable for academic publication. You should keep out discussion of side issues like speculation on the bottlenecks in academic research, ...


Superficial stylistic remark: The paper repeatedly uses the word "agency" where "agent" would seem more appropriate.

[This comment is no longer endorsed by its author]

What are the measurement units of "optimization power" and "complex order created per unit time"? What are the typical values for a human?

[This comment is no longer endorsed by its author]

China is experiencing very fast knowledge-driven growth as it catches up to already-produced knowledge that it can cheaply import.

To the extent that AIs other than the most advanced project can generate self-improvements at all, they generate modifications of idiosyncratic code that can’t be cheaply shared with any other AIs.

I say it's at least as expensive for China to import knowledge. A fair amount is trade secrets that are more carefully guarded than AI content. China copies on the order of $1 trillion in value. What's the value of uncopied AI conte...


There are some places in the text that appear to be originally hyperlinked, but whose hyperlinks are not present in the .pdf. For example, footnote 21.

In general, the paper needs a technical editor.

EDIT: The lack of hyperlinks is clearly something on my end. I apologize for jumping to conclusions.

Tangent: Why isn't editing of academic papers done on github or another revision control platform?
MIRI uses Git to track edits for all documents it publishes with its official template.
Do you mean editing done by the publishers? I don't know anything about that domain. As far as the writing of academic papers goes, I know a few groups that maintain a CVS, but some portion of mathematicians wouldn't be technical enough to run one. Of the examples I know, two groups only use a CVS because their PI told them to, over much groaning.
I was just struck by how this is a perfect example where the efficient flow would go 'see problem->fix problem->submit pull request', rather than having to post here and hoping someone sees it and acts on it. It occurs to me that the person who makes revision control easy in the same way that social websites made having your own website easy will also make a billion dollars. Dropbox is both a good example of success and a good example of how much more could be done (trying to use Dropbox for proper revision control is something I've seen attempted and it is not pretty).
Hell yeah. I have successfully explained to a bunch of not-very-technical managers why they would want version control: "Have you ever spent three days editing the wrong version of a document?" Wide eyes, all slowly nod. Once they understood what it was for, they would have crawled across broken glass for version control. (And since we were using ClearCase, they did pretty much that!) "Track changes" in Word solves a little of the problem from the other end.
I would anticipate that a github-like approach to editing would get you decent coverage of "local" editing issues (e.g., this hyperlink business) while not obtaining decent coverage of "global" editing issues (e.g., the use of "counter^3-argument", "counter^4-argument" in some places and "counter-counter-counter-argument" in another). A lot of this just boils down to having an acceptable style guide that an editor can enforce without worrying too much about taking every issue to the author for approval.
I like the way you're approaching the problem. However, I think the temptation for a familiar conclusion is too great and that you might be missing some possibilities. See: The solution you're putting forth suggests that there needs to be a single person in charge of coalescing the many suggestions and edits. But the great thing about version control is the ability to branch and tag. There could be an arbitrary number of editors who each have their own branch and set of improvements that they are working on—where non-editor contributors could switch branches and commit changes specific to that branch's needs. In the end, all branches would need to merge into the trunk. This process doesn't necessarily need a single editor either. Cheers
One person's "familiar conclusion" is another's "best practices", I suppose. Not really. Many suggestions and edits put forth by random people, e.g., here, aren't edits that I think an editor should really make. Nor do I really think a single person is necessary; again, a well-defined style guide would go a long way. I understand how CVSes work, and I have no problem with collaborative editing. But papers are not coding projects. There are a lot of global things going on that need to happen correctly. Even open source projects tend to have lead developers, no?
The hyperlink in footnote 21 works for me. It goes here. What happens when you click on "this online post" in footnote 21? We did use a couple editors on the paper, like we do with all our papers.
Nothing. Adobe Reader 11.0.2 on Windows 7. Yeah, I saw the percent signs were interpreted correctly. It's a work in progress.
MIRI's LaTeX document template uses the \href command to hyperlink text and styles links (both internal and external) using the pdfborderstyle specification from Adobe. We aren't doing anything unusual. Links are working (and styled) for me in OS X Preview and Adobe Reader 10.1.6, on OS X 10.8.3. They even work in Chrome's pdf viewer which currently doesn't support pdfborderstyle, i.e., the text is linked even though there is no underline or box to indicate that it is. I suspect something fishy is going on with your Reader install . . . Also, to clarify Luke's comments, we have a dedicated technical editor (who I have been very impressed with so far), and the papers are reviewed by a couple other people (once they have been typeset) before they are published. I'd be interested to hear about (possibly more appropriate through PM or email) other things in this document that made you think we didn't have a technical editor. EDIT: I should clarify that the editing and proofreading I'm talking about is done once the content has been finalized. See a definition of technical editing here.
PM sent. EDIT: I'm no longer sure that sending all of that over a PM (which I unwisely forgot to retain) was such a great idea. Your edit makes it sound like my objections weren't really under the aegis of "technical editing", but I don't recall objecting to anything that doesn't fall under that objection. Anyone who doubts my sincerity, please feel free to PM me.
Sorry, I didn't mean to imply that at all. I just reread my message and it occurred to me that it might not be clear to everyone what technical editing was. Your PM was indeed about technical edits. BTW you can see all PMs you sent by visiting,
Ohh! Thanks. I hadn't noticed that feature.