You are unlikely to see me posting here again, after today. There is a saying here that politics is the mind-killer. My heretical realization lately is that philosophy, as generally practiced, can also be mind-killing.

As many of you know, I am, or was, running a twice-monthly Rationality: From AI to Zombies reading group. One of the things I wanted to include in each reading group post was a collection of contrasting views. To research such views I've found myself listening during my commute to talks given by other thinkers in the field, e.g. Nick Bostrom, Anders Sandberg, and Ray Kurzweil, and people I feel are doing "ideologically aligned" work, like Aubrey de Grey, Christine Peterson, and Robert Freitas. Some of these were talks I had seen before, or views I had generally been exposed to in the past. But looking through the lens of learning and applying rationality, I came to a surprising (to me) conclusion: it was the philosophical thinkers who demonstrated the largest and most costly mistakes. On the other hand, de Grey and others who are primarily working on the scientific and engineering challenges of singularity and transhumanist technologies were far less likely to make epistemic mistakes of significant consequence.

Philosophy as the anti-science...

What sort of mistakes? Most often, reasoning by analogy. To cite a specific example, one of the core underlying assumptions of the singularity interpretation of super-intelligence is that, just as a chimpanzee would be unable to predict what a human intelligence would do or how we would make decisions (aside: how would we know? were any chimps consulted?), we would be equally inept in the face of a super-intelligence. This argument is, however, nonsense. The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect of the natural world which has remained beyond the reach of human understanding, once a sufficient amount of evidence is available. The wave-particle duality of quantum physics and the 11-dimensional space of string theory may defy human intuition, i.e. our built-in intelligence. But we have proven ourselves perfectly capable of understanding the logical implications of models which employ them. We may not be able to build intuition for how a super-intelligence thinks. Maybe—that's not proven either. But even if that is so, we will be able to reason about its intelligent behaviour in advance, just as string theorists are able to reason about 11-dimensional space-time without using their evolutionarily derived intuitions at all.

This post is not about the nature of super-intelligence—that was merely my choice of an illustrative example of a category of mistakes too often made by those with a background in philosophy rather than the empirical sciences: reasoning by analogy instead of building and analyzing predictive models. The fundamental mistake is that an analogy is not in itself a sufficient explanation for a natural phenomenon: it says nothing about the context sensitivity or insensitivity of the original example, or about the conditions under which it may or may not hold true in a different situation.

A successful physicist, biologist, or computer engineer would have approached the problem differently. A core part of being successful in these fields is knowing when you have insufficient information to draw conclusions. If you don't know what you don't know, then you can't know when you might be wrong. For an effective rationalist, the important first question is often not "what is the calculated probability of that outcome?" but rather "what is the uncertainty in my calculated probability of that outcome?" If the uncertainty is too high, the data supports no conclusions. And the way you reduce uncertainty is to build models of the domain in question and empirically test them.

The lens that sees its own flaws...

Coming back to LessWrong and the sequences. In the preface to Rationality, Eliezer Yudkowsky says his biggest regret is that he did not make the material in the sequences more practical. The problem is in fact deeper than that. The art of rationality is the art of truth seeking, and empiricism is essential to truth seeking. Lip service is paid to empiricism throughout, but in all the "applied" sequences relating to quantum physics and artificial intelligence it appears to be forgotten. We get instead definitive conclusions drawn from thought experiments alone. It is perhaps not surprising that these sequences seem the most controversial.

I have long been concerned that those sequences in particular promote some ungrounded conclusions. I had thought that, while annoying, this was perhaps a one-off, fixable mistake. Recently I have realized that the underlying cause runs much deeper: what the sequences teach is a flawed form of truth-seeking (thought experiments favored over real-world experiments) which inevitably results in errors, and the errors I take issue with in the sequences are merely instances of this phenomenon.

And these errors have consequences. Every single day, 100,000 people die of preventable causes, and every day we continue to risk the extinction of the human race at unacceptably high odds. There is work that could be done now to alleviate both of these issues. But within the LessWrong community there is outright hostility to work that has a reasonable chance of alleviating suffering (e.g. artificial general intelligence applied to molecular manufacturing and life-science research), due to concerns arrived at by flawed reasoning.

I now regard the sequences as a memetic hazard, one which may at the end of the day be doing more harm than good. One should work to develop one's own rationality, but I now fear that the approach the LessWrong community has taken in continuing the sequences will compound that harm. The anti-humanitarian behaviors I observe in this community are not the result of initial conditions but of the process itself.

What next?

How do we fix this? I don't know. On a personal level, I am no longer sure engagement with such a community is a net benefit. I expect this to be my last post to LessWrong. It may happen that I check back in from time to time, but for the most part I intend to try not to. I wish you all the best.

A note about effective altruism…

One shining light of goodness in this community is the focus on effective altruism: doing the most good for the most people, as measured by some objective means. This is a noble goal, and the correct one for a rationalist who wants to contribute to charity. Unfortunately it too has been poisoned by incorrect modes of thought.

Existential risk reduction, the argument goes, trumps all other forms of charitable work, because reducing the chance of extinction by even a small amount has far more expected utility than accomplishing all other charitable works combined. The problem lies in estimating the likelihood of extinction, and in the actions selected to reduce existential risk. There is so much uncertainty regarding what we know, and so much uncertainty regarding what we don't know, that it is impossible to determine with any accuracy the expected risk of, say, unfriendly artificial intelligence creating perpetual suboptimal outcomes, or what effect charitable work in the area (e.g. MIRI) is having in reducing that risk, if any.

This is best explored by an example of existential risk done right. Asteroid and cometary impacts are perhaps the category of external (non-human-caused) existential risk which we know the most about and have done the most to mitigate. When it was recognized that impactors were a risk to be taken seriously, we identified what we did not know about the phenomenon: What were the orbits and masses of Earth-crossing asteroids? We built telescopes to find out. What is the material composition of these objects? We built space probes and collected meteorite samples to find out. How damaging would an impact be, for various material properties, speeds, and incidence angles? We built high-speed projectile test ranges to find out. What could be done to change the course of an asteroid found to be on a collision course? We have executed at least one impactor probe and monitored the effect on the comet's orbit, and we have on the drawing board probes that will use gravitational mechanisms to move their targets. In short, we identified what we don't know and sought to resolve those uncertainties.

How then might one approach an existential risk like unfriendly artificial intelligence? By identifying what it is we don't know about the phenomenon, and seeking to experimentally resolve that uncertainty. What relevant facts do we not know about (unfriendly) artificial intelligence? Well, much of our uncertainty about the actions of an unfriendly AI could be resolved if we knew more about how such agents construct their thought models, and relatedly what languages are used to construct their goal systems. We would also benefit from more practical information (experimental data) about the ways in which AI boxing works and the ways in which it does not, and how much that depends on the structure of the AI itself. Thankfully there is an institution doing that kind of work: the Future of Life Institute (not MIRI).

Where should I send my charitable donations?

Aubrey de Grey's SENS Research Foundation.

100% of my charitable donations are going to SENS. Why they do not get more play in the effective altruism community is beyond me.

If you feel you want to spread your money around, here are some non-profits which I have vetted for doing reliable, evidence-based work on singularity technologies and existential risk:

  • Robert Freitas and Ralph Merkle's Institute for Molecular Manufacturing does research on molecular nanotechnology. They are the only group working on the long-term Drexlerian vision of molecular machines, and they publish their research online.
  • Future of Life Institute is the only existential-risk AI organization which is actually doing meaningful evidence-based research into artificial intelligence.
  • B612 Foundation is a non-profit seeking to launch a spacecraft with the capability to detect, to the extent possible, all Earth-crossing asteroids.

I wish I could recommend an institute promoting skepticism, empiricism, and rationality. Unfortunately I am not aware of an organization which does not suffer from the flaws identified above.

Addendum regarding unfinished business

I will no longer be running the Rationality: From AI to Zombies reading group, as I am no longer able or willing, in good conscience, to host it or to participate in this site, even from my typically contrarian point of view. Nevertheless, I am enough of a libertarian that I feel it is not my role to put up roadblocks for others who wish to delve into the material as it is presented. So if someone wants to take over the role of organizing these reading groups, I would be happy to hand over the reins to that person. If you think that person should be you, please leave a reply in another thread, not here.

EDIT: Obviously I'll stick around long enough to answer questions below :)

273 comments

Thanks for sharing your contrarian views, both with this post and with your previous posts. Part of me is disappointed that you didn't write more... it feels like you have several posts' worth of objections to Less Wrong here, and at times you are just vaguely gesturing towards a larger body of objections you have towards some popular LW position. I wouldn't mind seeing those objections fleshed out into long, well-researched posts. Of course you aren't obliged to put in the time & effort to write more posts, but it might be worth your time to fix specific flaws you see in the LW community, given that it consists of many smart people interested in maximizing their positive impact on the far future.

I'll preface this by stating some points of general agreement:

  • I haven't bothered to read the quantum physics sequence (I figure if I want to take the time to learn that topic, I'll learn from someone who researches it full-time).

  • I'm annoyed by the fact that the sequences in practice seem to constitute a relatively static document that doesn't get updated in response to critiques people have written up. I think it's worth reading them with a grain of salt for that reason. (I'm

... (read more)
It seems I should have picked a different phrase to convey my intended target of ire. The problem isn't concept formation by means of comparing similar reference classes, but rather using thought experiments as evidence and updating on them.

To be sure, thought experiments are useful for noticing when you are confused. They can also be a semi-dark art in providing intuition pumps. Einstein did well in introducing special relativity by means of a series of thought experiments: getting the reader to notice their confusion over classical electromagnetism in moving reference frames, then providing an intuition pump for how his own relativity worked in contrast. It makes his paper one of the most beautiful works in all of physics. However, it was the experimental evidence which proved Einstein right, not the Gedankenexperimente.

If a thought experiment shows something to not feel right, that should raise your uncertainty about whether your model of what is going on is correct or not (notice your confusion), to wit the correct response should be "how can I test my beliefs here?" Do NOT update on thought experiments, as thought experiments are not evidence. The thought experiment triggers an actual experiment—even if that experiment is simply looking up data that is already collected—and the actual experimental results are what update beliefs.

MIRI has not to my knowledge released any review of existing AGI architectures. If that is their belief, the onus is on them to support it.

He invented the AI box game. If it's an experiment, I don't know what it is testing. It is a setup totally divorced from any sane reality of how AGI might actually develop and what sort of controls might be in place, with built-in rules that favor the AI. Yet nevertheless, time and time again people such as yourself point me to the AI box games as if they demonstrated anything of note, anything which should cause me to update my beliefs. It is, I think, the examples of the sequences and th

If a thought experiment shows something to not feel right, that should raise your uncertainty about whether your model of what is going on is correct or not (notice your confusion), to wit the correct response should be "how can I test my beliefs here?"

I have such very strong agreement with you here.

The problem isn't concept formation by means of comparing similar reference classes, but rather using thought experiments as evidence and updating on them.

…but I disagree with you here.

Thought experiments and reasoning by analogy and the like are ways to explore hypothesis space. Elevating hypotheses for consideration is updating. Someone with excellent Bayesian calibration would update much much less on thought experiments etc. than on empirical tests, but you run into really serious problems of reasoning if you pretend that the type of updating is fundamentally different in the two cases.

I want to emphasize that I think you're highlighting a strength this community would do well to honor and internalize. I strongly agree with a core point I see you making.

But I think you might be condemning screwdrivers because you've noticed that hammers are really super-important.

Selecting a likely hypothesis for consideration does not alter that hypothesis' likelihood. Do we agree on that?
Hmm. Maybe. It depends on what you mean by "likelihood", and by "selecting".

Trivially, noticing a hypothesis and that it's likely enough to justify being tested absolutely makes it subjectively more likely than it was before. I consider that tautological. If someone is looking at n hypotheses and decides to pick the kth one to test (maybe at random, or maybe because they all need to be tested at some point, so why not start with the kth one), then I quite agree: that doesn't change the likelihood of hypothesis #k. But in my mind it's vividly clear that the process of plucking a likely hypothesis out of hypothesis space depends critically on moving probability mass around in said space. Any process that doesn't do that is literally picking a hypothesis at random. (Frankly, I'm not sure a human mind even can do that.)

The core problem here is that most default human ways of moving probability mass around in hypothesis space (e.g. clever arguments) violate the laws of probability, whereas empirical tests aren't nearly as prone to that.

So, if you mean to suggest that figuring out which hypothesis is worthy of testing does not involve altering our subjective likelihood that said hypothesis will turn out to be true, then I quite strongly disagree. But if you mean that clever arguments can't change what's true even by a little bit, then of course I agree with you.

Perhaps you're using a Frequentist definition of "likelihood" whereas I'm using a Bayesian one?
There's a difference? Probability is probability.

If you go about selecting a hypothesis by evaluating a space of hypotheses to see how they rate against your model of the world (whether you think they are true) and against each other (how much you stand to learn by testing them), you are essentially coming to reflective equilibrium regarding these hypotheses and your current beliefs. What I'm saying is that this shouldn't change your actual beliefs -- it will flush out some stale caching, or at best identify an inconsistent belief, including empirical data that you haven't fully updated on. But it does not, by itself, constitute evidence.

So a clever argument might reveal an inconsistency in your priors, which in turn might make you want to seek out new evidence. But the argument itself is insufficient for drawing conclusions, even if the hypothesis is itself hard to test.

Perhaps you're using a Frequentist definition of "likelihood" whereas I'm using a Bayesian one?

There's a difference? Probability is probability.

There very much is a difference.

Probability is a mathematical construct. Specifically, it's a special kind of measure p on a measure space M such that p(M) = 1 and p obeys a set of axioms that we refer to as the axioms of probability (where an "event" from the Wikipedia page is to be taken as any measurable subset of M).

This is a bit like highlighting that Euclidean geometry is a mathematical construct based on following thus-and-such axioms for relating thus-and-such undefined terms. Of course, in normal ways of thinking we point at lines and dots and so on, pretend those are the things that the undefined terms refer to, and proceed to show pictures of what the axioms imply. Formally, mathematicians refer to this as building a model of an axiomatic system. (Another example of this is elliptic geometry, which is a type of non-Euclidean geometry, which you can model as doing geometry on a sphere.)

The Frequentist and Bayesian models of probability theory are relevantly different. They both think of M as the space of ... (read more)

Those are not different models. They are different interpretations of the utility of probability in different classes of applications. You do it exactly the same as in your Bayesian example.

I'm sorry, but this Bayesian vs. Frequentist conflict is for the most part non-existent. If you use probability to model the outcome of an inherently random event, people have called that "frequentist." If instead you model the event as deterministic, but your knowledge of the outcome as uncertain, then people have applied the label "Bayesian." It's the same probability, just used differently. It's like how applying your knowledge of mechanics to bridge and road building is called civil engineering, while applying it to buildings is architecture: the underlying mechanics is the same either way, just applied differently.

One of the failings of the sequences is the amount of emphasis placed on "Frequentist" vs. "Bayesian" interpretations. The conflict between the two exists mostly in Yudkowsky's mind. Actual statisticians use probability to model events and knowledge of events simultaneously.

Regarding the other points, every single example you gave involves using empirical data that had not sufficiently propagated, which is exactly the sort of use I am in favor of. So I don't know what it is that you disagree with.

Those are not different models. They are different interpretations of the utility of probability in different classes of applications.

That's what a model is in this case.

I'm sorry, but this Bayesian vs Frequentist conflict is for the most part non-existent.


One of the failings of the sequences is the amount of emphasis that is placed on “Frequentist” vs “Bayesian” interpretations. The conflict between the two exists mostly in Yudkowsky's mind. Actual statisticians use probability to model events and knowledge of events simultaneously.

How sure are you of that?

I know a fellow who has a Ph.D. in statistics and works for the Department of Defense on cryptography. I think he largely agrees with your point: professional statisticians need to use both methods fluidly in order to do useful work. But he also doesn't claim that they're both secretly the same thing. He says that strong Bayesianism is useless in some cases that Frequentism gets right, and vice versa, though his sympathies lie more with the Frequentist position on pragmatic grounds (i.e. that methods that are easier to understand in a Frequentist framing tend to be more useful in a wider range of circumstances in his e... (read more)

This seems like it would be true only if you'd already propagated all logical consequences of all observations you've made. But an argument can help me to propagate. Which means it can make me update my beliefs. For example, is 3339799 a prime number? One ought to assign some prior probability to it being a prime. A naive estimate might say, well, there are two options, so let's assign it 50% probability. You could also make a more sophisticated argument about the distribution of prime numbers spreading out as you go towards infinity, and given that only 25 of the first 100 numbers are prime, the chance that a randomly selected number in the millions should be prime is less than 25% and probably much lower. I claim that in a case like this it is totally valid to update your beliefs on the basis of an argument. No additional empirical test required before updating. Do you agree?
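The density argument above can be sketched numerically. This is a rough sketch of my own, not anything from the thread: `is_prime` is an illustrative helper, and the prior comes from the prime number theorem's estimate that primes near n have density roughly 1/ln(n), which is a far better starting point than the naive 50%.

```python
import math

def is_prime(n: int) -> bool:
    """Deterministic trial division; plenty fast for seven-digit numbers."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

n = 3339799

# Prime number theorem: the density of primes near n is about 1/ln(n).
# For n in the millions that is roughly 6-7%, far below the naive 50%
# prior, and below the 25% density observed among the first 100 numbers.
prior = 1 / math.log(n)
print(f"prior that a number near {n:,} is prime: {prior:.1%}")

# The argument only sets the prior; the trial division settles the question:
print(f"{n} is prime: {is_prime(n)}")
```

Whether the refined prior or the trial division counts as the "experiment" here is exactly the distinction the reply below draws.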
I think the definition of 'experiment' gets tricky and confusing when you are talking about math specifically. When you talk about finding the distribution of prime numbers and using that to arrive at a more accurate model for your prior probability of 3339799 being prime, that is an experiment. Math is unique in that regard though. For questions about the real world we must seek evidence that is outside of our heads.
Is that a conclusion or a hypothesis? I don't believe there is a fundamental distinction between "actual beliefs", "conclusions" and "hypotheses". What should it take to change my beliefs about this?
I'll think about how this can be phrased differently such that it might sway you. Given that you are not Valentine, is there a difference of opinion between his posts above and your views?

That part you pulled out and quoted is essentially what I was writing about in the OP. There is a philosophy-over-hard-subjects which is pursued here, in the sequences, and at FHI, and which is exemplified in the conclusions drawn by Bostrom in Superintelligence and by Yudkowsky in the later sequences. Sometimes it works: e.g. the argument in the sequences about the compatibility of determinism and free will works because it essentially shows how non-determinism and free will are incompatible--it exposes a cached thought that free will == non-deterministic choice which was never grounded in the first place. But on new subjects where you are not confused in the first place -- e.g. the nature and risk of superintelligence -- people seem to be using thought experiments alone to reach ungrounded conclusions, and not following up with empirical studies. That is dangerous.

If you allow yourself to reason from thought experiments alone, I can get you to believe almost anything. I can't get you to believe the sky is green--unless you've never seen the sky--but on anything for which you have no experimental evidence available, for or against, I can sway you either way. E.g. that consciousness is in the information being computed and not the computational process itself. That an AI takeoff would be hard, not soft, and basically uncontrollable. That boxing techniques are foredoomed to failure regardless of circumstances. That intelligence and values are orthogonal under all circumstances. That cryonics is an open-and-shut case. On these sorts of questions we need more, not less, experimentation.

When you hear a clever thought experiment that seems to demonstrate the truth of something you previously thought to have low probability, then (1) check if your priors here are inconsistent with each other;
My point is that the very statements you are making, that we are all making all the time, are also very theory-loaded, "not followed up with empirical studies". This includes the statements about the need to follow things up with empirical studies. You can't escape the need for experimentally unverified theoretical judgement, and it does seem to work, even though I can't give you a well-designed experimental verification of that. Some well-designed studies even prove that ghosts exist.

The degree to which discussion of familiar topics is closer to observations than discussion of more theoretical topics is unclear, and the distinction should be cashed out as uncertainty on a case-by-case basis. Some very theoretical things are crystal-clear math, more certain than the measurement of the charge of an electron. Being wrong is dangerous. Not taking theoretical arguments into account can result in error. This statement probably wouldn't be much affected by further experimental verification. What specifically should be concluded depends on the problem, not on a vague outside measure of the problem like the degree to which it's removed from empirical study.

Before considering the truth of a statement, we should first establish its meaning, which describes the conditions for judging its truth. For a vague idea, there are many alternative formulations of its meaning, and it may be unclear which one is interesting, but that's separate from the issue of thinking about any specific formulation clearly.
I'm not aware of ghost studies; Scott talks about telepathy and precognition studies.
Ghosts specifically seem like too complicated a hypothesis to extract from any experimental results I'm aware of. If we didn't already have a concept of ghosts, I doubt any parapsychology experiments that have taken place would have caused us to develop one.
People select hypotheses for testing because they have previously weakly updated in the direction of them being true. Seeing empirical data produces a later, stronger update.
Rob Bensinger:
Except that when the hypothesis space is large, people test hypotheses because they strongly updated in the direction of them being true, and seeing empirical data produces a later, weaker update. Where an example of 'strongly updating' could be going from 9,999,999:1 odds against a hypothesis to 99:1 odds, and an example of 'weakly updating' could be going from 99:1 odds against the hypothesis to 1:99. The former update requires about 17 bits of evidence, while the latter update requires about 13 bits of evidence.
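As a sanity check on those bit counts (a sketch of my own; `bits_of_evidence` is an illustrative helper, not anything from the thread): the evidence needed to move between two odds is the base-2 logarithm of the Bayes factor connecting them.

```python
import math

def bits_of_evidence(prior_odds: float, posterior_odds: float) -> float:
    """Bits of evidence needed to move from prior odds (for the hypothesis)
    to posterior odds: the base-2 log of the required Bayes factor."""
    return math.log2(posterior_odds / prior_odds)

# "Strong" update: 9,999,999:1 against (odds for = 1/9,999,999) to 99:1 against (1/99).
strong = bits_of_evidence(1 / 9_999_999, 1 / 99)

# "Weak" update: 99:1 against (1/99) to 1:99 against, i.e. 99:1 in favor (99).
weak = bits_of_evidence(1 / 99, 99.0)

print(f"first update:  {strong:.1f} bits")   # ~16.6 bits
print(f"second update: {weak:.1f} bits")     # ~13.3 bits
```

The qualitative point survives the arithmetic: elevating a hypothesis out of a large space takes more evidence than the later empirical confirmation does.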
Interesting point. I guess my intuitive notion of a "strong update" has to do with absolute probability mass allocation rather than bits of evidence (probability mass is what affects behavior?), but that's probably not a disagreement worth hashing out.
I like your way of saying it. It's much more efficient than mine!
Thanks! Paul Graham is my hero when it comes to writing and I try to pack ideas as tightly as possible. (I recently reread this essay of his and got amazed by how many ideas it contains; I think it has more intellectual content than most published nonfiction books, in just 10 pages or so. I guess the downside of this style is that readers may not go slow enough to fully absorb all the ideas. Anyway, I'm convinced that Paul Graham is the Ben Franklin of our era.)
Thanks for the response. I have a feeling your true rejection runs deeper than you're describing. You cite a thought experiment of Einstein's as being useful and correct. You explain that Less Wrong relies on thought experiments too heavily. You suggest that Less Wrong should lean more heavily on data from the real world. But the single data point you cite on the question of thought experiments indicates that they are useful and correct. It seems like your argument fails by its own standard.

I think the reliability of thought experiments is a tricky question to resolve. We might as well expand the category of thought experiments to "any reasoning about the world that isn't reasoning directly from data". When I think about the reliability of this reasoning, my immediate thought is that I expect some people to be much better at it than others. In fact, I think being good at this sort of reasoning is almost exactly the same as being intelligent. Reasoning directly from data is like looking up the answers in the back of the book.

This leaves us with two broad positions: the "humans are dumb/the world is tricky" position that the only way we can ever get anywhere is through constant experimentation, and the "humans are smart/the world is understandable" position that we can usefully make predictions based on limited data. I think these positions are too broad to be useful. It depends a lot on the humans, and it depends a lot on the aspect of the world being studied. Reasoning from first principles works better in physics than in medicine; in that sense, medicine is a trickier subject to study.

If the tricky-world hypothesis is true for the questions MIRI is investigating, or the MIRI team is too dumb, I could see the sort of empirical investigations you propose as being the right approach: they don't really answer the most important questions we want answered, but there probably isn't a way for MIRI to answer those questions anyway, so might as well answer the qu
I cite a thought experiment of Einstein's as being useful but insufficient. It was not correct until observation matched anticipation. I called out Einstein's thought experiment as being a useful pedagogical technique, but not an example of how to arrive at truth. Do you see the difference?

No, this is not obvious to me. Other than the ability of two humans to outwit each other within the confines of strict enforcement of arbitrarily selected rules, what is it testing, exactly? And what does that thing being tested have to do with realistic AIs and boxes anyway?

I called out Einstein's thought experiment as being a useful pedagogical technique, but not an example of how to arrive at truth.

What's your model of how Einstein in fact arrived at truth, if not via a method that is "an example of how to arrive at truth"? It's obvious the method has to work to some extent, because Einstein couldn't have arrived at a correct view by chance. Is your view that Einstein should have updated less from whatever reasoning process he used to pick out that hypothesis from the space of hypotheses, than from the earliest empirical tests of that hypothesis, contra Einstein's Arrogance?

Or is your view that, while Einstein may technically have gone through a process like that, no one should assume they are in fact Einstein -- i.e., Einstein's capabilities are so rare, or his methods are so unreliable (not literally at the level of chance, but, say, at the level of 1000-to-1 odds of working), that by default you should harshly discount any felt sense that your untested hypothesis is already extremely well-supported?

Or perhaps you should harshly discount it until you have meta-evidence, in the form of a track record of successfully predicting which un...

You can't work backwards from the fact that someone arrived at truth in one case to the premise that they must have been working from a reliable method for arriving at truth. It's the "one case" that's the problem. They might have gotten lucky. Einstein's thought experiments inspired his formal theories, which were then confirmed by observation. Nobody thought the thought experiments provided confirmation by themselves.
Rob Bensinger:
I mentioned that possibility above. But Einstein couldn't have been merely lucky -- even setting aside the fact that he succeeded repeatedly, his very first success was too improbable for him to have just been plucking random physical theories out of a hat. Einstein was not a random number generator, so there was some kind of useful cognitive work going on. That leaves open the possibility that it was only useful enough to give Einstein a 1% chance of actually being right; but still, I'm curious about whether you do think he only had a 1% chance of being right, or (if not) what rough order of magnitude you'd estimate. And I'd likewise like to know what method he used to even reach a 1% probability of success (or 10%, or 0.1%), and why we should or shouldn't think this method could be useful elsewhere. Can you define "confirmation" for me, in terms of probability theory?
Big Al may well have had some intuitive mojo that enabled him to pick the right thought experiments, but that still doesn't make thought experiments a substitute for real empiricism. And intuitive mojo isn't a method in the sense of being reproducible. Why not derive probability theory in terms of confirmation?
Rob Bensinger:
Thought experiments aren't a replacement for real empiricism. They're a prerequisite for real empiricism. "Intuitive mojo" is just calling a methodology you don't understand a mean name. However Einstein managed to repeatedly hit success in his lifetime, presupposing that it is an ineffable mystery or a grand coincidence won't tell us much. I already understand probability theory, and why it's important. I don't understand what you mean by "confirmation," how your earlier statement can be made sense of in quantitative terms, or why this notion should be treated as important here. So I'm asking you to explain the less clear term in terms of the more clear one.
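Since the thread keeps circling the word "confirmation," it may help to note that Bayesian confirmation theory gives one standard probability-theoretic answer: evidence E confirms hypothesis H exactly when P(H|E) > P(H). A minimal sketch (the prior and likelihoods are my own illustrative assumptions, not anything stated in the thread):

```python
# Bayesian confirmation: E "confirms" H iff P(H|E) > P(H).
# The prior and likelihoods below are illustrative assumptions.

def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' rule: P(H|E) = P(E|H) P(H) / P(E)."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

# A theory strongly predicts an observation that background theories
# make unlikely: P(E|H) = 0.99, P(E|~H) = 0.05.
prior = 0.1                          # P(H) before the test
post = posterior(prior, 0.99, 0.05)  # P(H|E) after the observation
print(post > prior)                  # True: the observation confirms H
```

On these numbers the posterior is about 0.69. On this reading, a thought experiment that merely generates H feeds into the prior; observation is what moves P(H|E) above it.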
Actually he did not. He got lucky early in his career, and pretty much coasted on that into irrelevance. His intuition allowed him to solve problems related to relativity, the photoelectric effect, Brownian motion, and a few other significant contributions within the span of a decade, early in his career. And then he went off the deep end, following his intuition down a number of dead-end rabbit holes for the rest of his life. He died in Princeton in 1955 having made no further significant contributions to physics after his 1916 invention of general relativity. Within the physics community (I am a trained physicist), Einstein's story is retold more often as a cautionary tale than as a model to emulate.
There are worse fates than not being able to top your own discovery of general relativity.
...huh? Correct me if I'm wrong here, but Einstein was a great physicist who made lots of great discoveries, right? The right cautionary tale would be to cite physicists who attempted to follow the same strategy Einstein did and see how it mostly only worked for Einstein. But if Einstein was indeed a great physicist, it seems like at worst his strategy is one that doesn't usually produce results but sometimes produces spectacular results... which doesn't seem like a terrible strategy. I have a very strong (empirical!) heuristic that the first thing people should do if they're trying to be good at something is copy winners. Yes there are issues like regression to the mean and stuff, but it provides a good alternative perspective vs thinking things through from first principles (which seems to be my default cognitive strategy).
The thing is, Einstein was popular, but his batting average was lower than his peers'. In terms of advancing the state of the art, the 20th century is full of theoretical physicists with a better track record of pushing the field forward than Einstein, most of whom did not spend the majority of their careers chasing rabbits down holes. They may not be common household names, but honestly Einstein's fame might have more to do with the hair than the physics.
I should point out that I heard this cautionary tale as "don't set your sights too high," not "don't employ the methods Einstein employed." The methods were fine, the trouble was that he was at IAS and looking for something bigger than his previous work, rather than planting acorns that would grow into mighty oaks (as Hamming puts it).
OK, good to know.
The AI box experiment only serves even as that if you assume it sufficiently replicates the conditions that would actually be faced by someone with an AI in a box. It also only serves as such if it is otherwise a good experiment; since we are not permitted to see the session transcripts for ourselves, we can't tell whether it is.
Rob Bensinger:
Again, the AI box experiment is a response to the claim "superintelligences are easy to box, because no level of competence at social engineering would suffice for letting an agent talk its way out of a box". If you have some other reason to think that superintelligences are hard to box -- one that depends on a relevant difference between the experiment and a realistic AI scenario -- then feel free to bring that idea up. But this constitutes a change of topic, not an objection to the experiment. I mean, the experiment's been replicated multiple times. And you already know the reasons the transcripts were left private. I understand assigning a bit less weight to the evidence because you can't examine it in detail, but the hypothesis that there's a conspiracy to fake all of these experiments isn't likely.
Not all relevant differences between an experiment and an actual AI scenario can be accurately characterized as "reason to think that superintelligences are hard to box". For instance, imagine an experiment with no gatekeeper or AI party at all, where the result of the experiment depends on flipping a coin to decide whether the AI gets out. That experiment is very different from a realistic AI scenario, but one need not have a reason to believe that intelligences are hard to box--or even hold any opinion at all on whether intelligences are hard to box--to object to the experimental design.

For the AI box experiment as stated, one of the biggest flaws is that the gatekeeper is required to stay engaged with the AI and can't ignore it. This allows the AI to win by either verbally abusing the gatekeeper to the extent that he doesn't want to stay around any more, or by overwhelming the gatekeeper with lengthy arguments that take time or outside assistance to analyze. These situations would not be a win for an actual AI in a box.

Refusing to release the transcripts causes other problems than just hiding fakery. If the experiment is flawed in some way, for instance, it could hide that--and it would be foolish to demand that everyone name possible flaws one by one and ask you "does this have flaw A?", "does this have flaw B?", etc. in order to determine whether the experiment has any flaws. There are also cases where whether something is a flaw is an opinion that can be argued, and it might be that someone else would consider a flaw something that the experimenter doesn't.

Besides, in a real boxed AI situation, it's likely that gatekeepers will be tested on AI-box experiments and will be given transcripts of experiment sessions to better prepare them for the real AI. An experiment that simulates AI boxing should likewise have participants be able to read other sessions.
BTW, I realized there's something else I agree with you on that's probably worth mentioning: Eliezer in particular, I think, is indeed overconfident in his ability to reason things out from first principles. For example, I think he was overconfident in AI foom (see especially the bit at the end of that essay). And even if he's calibrated his ability correctly, it's totally possible that others who don't have the intelligence/rationality he does could pick up the "confident reasoning from first principles" meme and it would be detrimental to them. That said, he's definitely a smart guy and I'd want to do more thinking and research before making a confident judgement. What I said is just my current estimate. Insofar as I object to your post, I'm objecting to the idea that empiricism is the be-all and end-all of rationality tools. I'm inclined to think that philosophy (as described in Paul Graham's essay) is useful and worth learning about and developing.
For a start.... there's also a lack of discernible point in a lot of places. But too much good stuff to justify rejecting the whole thing.
Yes, I do. Intuitively, this seems correct. But I'd still like to see you expound on the idea.
BTW, this discussion has some interesting parallels to mine & Mark's.
This example actually proves the opposite. Bitcoin was described in a white paper that wasn't very impressive by academic crypto standards -- few if any became interested in Bitcoin from first reading the paper in the early days. Its success was proven by experimentation, not pure theoretical investigation. It's hard to investigate safety if one doesn't know the general shape that AGI will finally take. MIRI has focused on a narrow subset of AGI space -- namely transparent math/logic based AGI. Unfortunately it is becoming increasingly clear that the Connectionists were more or less right in just about every respect. AGI will likely take the form of massive brain-like general purpose ANNs. Most of MIRI's research thus doesn't even apply to the most likely AGI candidate architecture.
In this essay I wrote: I'm guessing this is likely to be true of general-purpose ANNs, meaning recursive self-improvement would be more difficult for a brain-like ANN than it might be for some other sort of AI? (This would be somewhat reassuring if it was true.)
It's not clear that there is any other route to AGI -- all routes lead to "brain-like ANNs", regardless of what linguistic label we use (graphical models, etc). General purpose RL -- in ideal/optimal theoretical form -- already implements recursive self-improvement in the ideal way. If you have an ideal/optimal general RL system running, then there are no remaining insights you could possibly have which could further improve its learning ability. The evidence is accumulating that general Bayesian RL can be efficiently approximated, that real brains implement something like this, and that very powerful general purpose AI/AGI can be built on the same principles.

Now, I do realize that by "recursive self-improvement" you probably mean a human-level AGI consciously improving its own 'software design', using the slow rule-based/logical thinking of the type suitable for linguistic communication. But there is no reason to suspect that the optimal computational form of self-improvement should actually be subject to those constraints.

The other, perhaps more charitable, view of "recursive self-improvement" is the more general idea of the point in time where AGIs take over most of the future AGI engineering/research work from human engineers/researchers. Coming up with new learning algorithms will probably be only a small part of the improvement work at that point. Implementations, however, can always be improved, and there is essentially an infinite space of better hardware designs. Coming up with new model architectures and training environments will also have scope for improvement. Also, it doesn't really appear to matter much how many modules the AGI has, because improvement doesn't rely much on human insights into how each module works. Even with zero new theoretical insights, you can just run the AGI on better hardware and it will be able to think faster or split into more copies. Either way, it will be able to speed up the rate at which it soaks up knowledge and automatically rewires itself.
By experimentation, do you mean people running randomized controlled trials on Bitcoin or otherwise empirically testing hypotheses on the software? Just because your approach is collaborative and incremental doesn't mean that it's empirical.
Not really - by experimentation I meant proving a concept by implementing it and then observing whether the implementation works or not, as contrasted to the pure math/theory approach where you attempt to prove something abstractly on paper. For context, I was responding to your statement: Bitcoin is an example of typical technological development, which is driven largely by experimentation/engineering rather than math/theory. Theory is important mainly as a means to generate ideas for experimentation.

Despite Yudkowsky's obvious leanings, the Sequences are ... first and foremost about how to not end up an idiot

My basic thesis is that even if that was not the intent, the result has been the production of idiots. Specifically, a type of idiotic madness that causes otherwise good people, self-proclaimed humanitarians, to disparage the only sort of progress which has the potential to alleviate all human suffering, forever, on accelerated timescales. And they do so for reasons that are not grounded in empirical evidence, because they were taught, through demonstration, modes of non-empirical thinking from the Sequences, and conditioned to think this was okay through social engagement on LW.

When you find yourself digging a hole, the sensible and correct thing to do is stop digging. I think we can do better, but I'm burned out on trying to reform from the inside. Or perhaps I'm no longer convinced that reform can work given the nature of the medium (social pressures of blog posts and forums work counter to the type of rationality that should be advocated for).

I don't care about Many Worlds, FAI, Fun theory and Jeffreyssai stuff, but LW was the thing that stopped me from being a complete idiot.

I was actually making a specific allusion to the hostility towards practical, near-term artificial general intelligence work. I have at times in the past advocated for working on AGI technology now, not later, and been given robotic responses that I'm offering reckless and dangerous proposals, and helpfully directed to go read the sequences. I once joined #lesswrong on IRC and introduced myself as someone interested in making progress in AGI in the near-term, and received two separate death threats (no joke). Maybe that's just IRC—but I left and haven't gone back.
Things have changed, believe me.
I don't know exactly what process generates the featured articles, but I don't think it has much to do with the community's current preoccupations.
My point was that it has become a lot more tolerant.
Maybe, but the core beliefs and cultural biases haven't changed, in the years that I've been here.
But you didn't get karmassinated or called an idiot.
This is true. I did not expect the overwhelmingly positive response I got...
If I understand correctly, you think that LW, MIRI and other closely related people might have a net negative impact, because they distract some people from contributing to the more productive subareas/approaches of AI research and existential risk prevention, directing them to subareas which you estimate to be much less productive. For the sake of argument, let's assume that is correct, and that if all people who follow MIRI's approach to AGI turned to those subareas of AI that are more productive, it would be a net benefit to the world.

But you should consider the other side of the coin: don't blogs like LessWrong, or books like N. Bostrom's, actually attract some students to consider working on AI, including the areas you consider beneficial, who would otherwise be working in areas that are unrelated to AI? Wouldn't the number of people who have even heard of the concept of existential risks be smaller without people like Yudkowsky and Bostrom? I don't have numbers, but since you are concerned about brain drain in other subareas of AGI and existential risk research, do you think it is unlikely that the popularization work done by these people attracts enough young people to AGI in general and existential risks in general to compensate for the loss of a few individuals, even in subareas of these fields that are unrelated to FAI? But do people here actually fight progress? Has anyone actually retired from (or been dissuaded from pursuing) AI research after reading Bostrom or Yudkowsky?

If I understand you correctly, you fear that concern about AI safety, being a thing that might invoke various emotions in a listener's mind, is sooner or later bound to be picked up by some populist politicians and activists, who would sow and exploit these fears in the minds of the general population in order to win elections/popularity/prestige among their peers/etc., thus leading to various regulations and restrictions on funding, because th...
I'm not sure how someone standing on a soapbox and yelling "AI is going to kill us all!" (Bostrom, admittedly not a quote) can be interpreted as actually helping get more people into practical AI research and development. You seem to be presenting a false choice: is there more awareness of AI in a world with Bostrom et al, or the same world without? But it doesn't have to be that way. Ray Kurzweil has done quite a bit to keep interest in AI alive without fear mongering. Maybe we need more Kurzweils and fewer Bostroms.
Data point: a feeling that I ought to do something about AI risk is the only reason why I submitted an FLI grant proposal that involves some practical AI work, rather than just figuring that the field isn't for me and doing something completely different.
I don't know how many copies of Bostrom's book were sold, but it was on the New York Times Best Selling Science Books list. Some of those books were read by high school students. Since very few people leave practical AI research for FAI research, even if only a tiny fraction of those young readers read the book and think "This AI thing is really exciting and interesting. Instead of majoring in X (which is unrelated to AI), I should major in computer science and focus on AI", it would probably result in a net gain for practical AI research.

I argued against this statement: when people say that an action leads to a negative outcome, they usually mean that taking that action is worse than not taking it, i.e. they compare the result to zero. If you add another option, then the word "suboptimal" should be used instead. Since I argued against "negativity", and not "suboptimality", I don't think the existence of other options is relevant here.
Interesting, I seem to buck the herd in nearly exactly the opposite manner as you.

You buck the herd by saying their obsession with AI safety is preventing them from participating in the complete transformation of civilization.

I buck the herd by saying that the whole singulatarian complex is a chimera that has almost nothing to do with how reality will actually play out and its existence as a memeplex is explained primarily by sociological factors rather than having much to do with actual science and technology and history.

Oh, well I mostly agree with you there. Really ending aging will have a transformative effect on society, but the invention of AI is not going to radically alter power structures in the way that singulatarians imagine.

See, I include the whole 'immanent radical life extension' and 'Drexlerian molecular manufacturing' idea sets in the singulatarian complex...

The craziest person in the world can still believe the sky is blue.
Ah, but in this case as near as i can tell it is actually orange.
"The medical revolution that began with the beginning of the twentieth century had warped all human society for five hundred years. America had adjusted to Eli Whitney's cotton gin in less than half that time. As with the gin, the effects would never quite die out. But already society was swinging back to what had once been normal. Slowly; but there was motion. In Brazil a small but growing alliance agitated for the removal of the death penalty for habitual traffic offenders. They would be opposed, but they would win." -- Larry Niven, A Gift From Earth

Well there are some serious ramifications that are without historical precedent. For example, without menopause it may perhaps become the norm for women to wait until retirement to have kids. It may in fact be the case that couples will work for 40 years, have a 25-30 year retirement where they raise a cohort of children, and then re-enter the work force for a new career. Certainly families are going to start representing smaller and smaller percentages of the population as birth rates decline while people get older and older without dying. The social ramifications alone will be huge, which was more along the lines of what I was talking about.

This just seems stupid to me. Ending aging is fundamentally SLOW change. In 100 or 200 or 300 years from now, as more and more people gain access to anti-aging (since it will start off very expensive), we can worry about that. But conscious AI will be a force in the world in under 50 years. And it doesn't even have to be SUPER intelligent to cause insane amounts of social upheaval. Duplicability means that even 1 human level AI can be world-wide or mass produced in a very short time!
"Will"? You guarantee that?
Can you link to a longer analysis of yours regarding this? I simply feel overwhelmed when people discuss AI. To me intelligence is a deeply anthropomorphic category, including subcategories like having a good sense of humor. Reducing it to optimization, without even sentience or conversational ability with self-consciousness... my brain throws out the stop sign already at this point, and it is not even AI: it is the pre-studies of human intelligence that already dehumanize, deanthropomorphize the idea of intelligence and make it sound more like a simple and brute-force algorithm. Like Solomonoff Induction, another thing that my brain completely freezes over: how can you have truth and clever solutions without even really thinking, just throwing a huge number of random ideas in and seeing what survives testing? Would it all be so quantitative? Can you reduce the wonderful qualities of the human mind to quantities?

Intelligence to what purpose?

Nobody's saying AI will be human without humor, joy, etc. The point is AI will be dangerous, because it'll have those aspects of intelligence that make us powerful, without those that make us nice. Like, that's basically the point of worrying about UFAI.

But is it possible to have power without all the rest?
Certainly. Why not? Computers can already outperform you in a wide variety of tasks. Moreover, today, with the rise of machine learning, we can train computers to do pretty high-level things, like object recognition or sentiment analysis (and sometimes outperform humans in these tasks). Isn't that power?

As for Solomonoff induction... What do you think your brain is doing when you are thinking? Some kind of optimized search in hypothesis space, where you consider only a very, very small set of hypotheses (compared to the entire space), hopefully good enough ones. Solomonoff induction, by contrast, checks all of them, every single hypothesis, and finds the best. Solomonoff induction is so much thinking that it is incomputable. Since we don't have that much raw computing power (and never will have), the hypothesis search must be heavily optimized: throwing off unpromising directions of search, searching in regions with high probability of success, using prior knowledge to narrow the search. That's what your brain is doing, and that's what machines will do. That's not "simple and brute-force", because simple and brute-force algorithms are either impractically slow, or incomputable at all.
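To make that contrast concrete, here is a toy sketch, entirely my own invention: hypotheses are small integer-coefficient polynomials, scored by fit to the data plus a complexity penalty (a crude, computable echo of Solomonoff's simplicity prior). A brute-force enumerator checks the whole space; a practical learner would instead narrow the search as described above.

```python
import itertools

# Toy "search in hypothesis space" (illustrative example, not a real system).
# Data secretly generated by y = 2x + 1; a hypothesis is a polynomial
# coeffs[0] + coeffs[1]*x + ... with small integer coefficients.
data = [(0, 1), (1, 3), (2, 5)]

def score(coeffs):
    # Fit error dominates; the complexity term prefers simpler hypotheses.
    fit_error = sum((y - sum(c * x**i for i, c in enumerate(coeffs))) ** 2
                    for x, y in data)
    complexity = sum(abs(c) for c in coeffs) + len(coeffs)
    return fit_error * 100 + complexity  # lower is better

# Brute force: enumerate every hypothesis up to a size bound.
space = [c for n in (1, 2, 3)
         for c in itertools.product(range(-3, 4), repeat=n)]
best = min(space, key=score)
print(best)  # (1, 2), i.e. y = 1 + 2x
```

Even this toy space holds 7 + 49 + 343 hypotheses; the combinatorial blow-up is exactly why brains and practical learners must prune the search rather than enumerate it.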
Eagles, too: they can fly and I cannot. The question is whether the currently foreseeable computerizable tasks are closer to flying or to intelligence. Which in turn depends on how high and how "magic" we consider intelligence to be. Ugh, using Aristotelian logic? So it is not random hypotheses but causality- and logic-based. I think, using your terminology, thinking is not the searching; it is the finding of logical relationships, so that not much space needs to be searched. OK, that makes sense. Perhaps we can agree that logic and causality and actual reasoning are all about narrowing the hypothesis space to search. This is intelligence, not the search.
I'm starting to suspect that we're arguing on definitions. By search I mean the entire algorithm of finding the best hypothesis; both random hypothesis checking and Aristotelian logic (and any combination of these methods) fit. What do you mean? Narrowing the hypothesis space is search. Once you narrowed the hypotheses space to a single point, you have found an answer. As for eagles: if we build a drone that can fly as well as an eagle can, I'd say that the drone has an eagle-level flying ability; if a computer can solve all intellectual tasks that a human can solve, I'd say that the computer has a human-level intelligence.
Yes. Absolutely. When that happens inside a human being's head, we generally call them 'mass murderers'. Even I only cooperate with society because there is a net long term gain in doing so; if that were no longer the case, I honestly don't know what I would do. Awesome, that's something new to think about. Thanks.
That's probably irrelevant, because mass murderers don't have power without all the rest. They are likely to have sentience and conversational ability with self-consciousness, at least.
Not sure. Suspect nobody knows, but seems possible? I think the most instructive post on this is actually Three Worlds Collide, for making a strong case for the arbitrary nature of our own "universal" values.

Despite Yudkowsky's obvious leanings, the Sequences are not about FAI, nor [etc]...they are first and foremost about how to not end up an idiot. They are about how to not become immune to criticism, they are about Human's Guide to Words, they are about System 1 and System 2.

I've always had the impression that Eliezer intended them to lead a person from zero to FAI. So I'm not sure you're correct here.

...but that being said, the big Less Wrong takeaways for me were all from Politics is the Mind-Killer and the Human's Guide to Words -- in that those are the ones that have actually changed my behavior and thought processes in everyday life. They've changed the way I think to such an extent that I actually find it difficult to have substantive discussions with people who don't (for example) distinguish between truth and tribal identifiers, distinguish between politics and policy, avoid arguments over definitions, and invoke ADBOC when necessary. Being able to have discussions without running over such roadblocks is a large part of why I'm still here, even though my favorite posters all seem to have moved on. Threads like this one basically don't happen anywhere else that I'm aware of.

Someone recently had a blog post summarizing the most useful bits of LW's lore, but I can't for the life of me find the link right now.

Eliezer states explicitly on numerous occasions that his reason for writing the blog posts was to motivate people to work with him on FAI. I'm having trouble coming up with exact citations, however, since it's not very googleable.

My prior perception of the Sequences was that EY started from a firm base of generally good advice about thinking. Sequences like A Human's Guide to Words and How to Actually Change Your Mind stand on their own. He then, however, went off the deep end trying to extend and apply these concepts to questions in the philosophy of mind, ethics, and decision theory in order to motivate an interest in friendly AI theory. I thought that perhaps the mistakes made in those sequences were correctable one-off errors. Now I am of the opinion that the way in which that philosophical inquiry was carried out doomed the project to failure from the start, even if the details of the failure are subject to Yudkowsky's own biases.

Reasoning by thought experiment, over questions that are not subject to experimental validation, basically does nothing more than expose one's priors. And either you agree with the priors, or you don't. For example, does quantum physics support the assertion that identity is the instance of computation, or the information being computed? Neither. But you could construct a thought experiment which validates either view based on the priors you bring to the discussion, and I wasted much time countering his thought experiments with those of my own creation before I understood the Sisyphean task I was undertaking :\
I'm not sure if this is what you were thinking of (seeing as how it's about a year old now), but "blog post summarizing the most useful bits of LW's lore" makes me think of Yvain's Five Years and One Week of Less Wrong.
As another person who thinks that the Sequences and FAI are nonsense (more accurately, the novel elements in the Sequences are nonsense; most of them are not novel), I have my own theory: LW is working by accidentally being counterproductive. You have people with questionable beliefs, who think that any rational person would just have to believe them. So they try to get everyone to become rational, thinking it would increase belief in those things. Unfortunately for them, when they try this, they succeed too well--people listen to them and actually become more rational, and actually becoming rational doesn't lead to belief in those things at all. Sometimes it even provides more reasons to oppose those things--I hadn't heard of Pascal's Mugging before I came here, and it certainly wasn't intended to be used as an argument against cryonics or AI risk, but it's pretty useful for that purpose anyway.
Clarification: I don't think they're nonsense, even though I don't agree with all of them. Most of them just haven't had the impact of PMK and HGW.
How is Pascal's Mugging an argument against cryonics?
It's an argument against "even if you think the chance of cryonics working is low, you should do it because if it works, it's a very big benefit".
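For concreteness, the argument in question is a plain expected-value calculation; Pascal's Mugging attacks the step where a tiny probability is multiplied by a huge payoff. All numbers below are illustrative assumptions of mine, not actual cryonics figures:

```python
# Expected-value form of the argument (all figures are made-up assumptions).
p_works = 0.01           # assumed probability that cryonics works
benefit = 100_000_000    # assumed value of revival, in arbitrary units
cost = 100_000           # assumed lifetime cost (membership + insurance)

expected_value = p_works * benefit - cost
print(expected_value)    # 900000.0: positive EV despite the 1% odds
```

The mugging-style objection is that for a large enough `benefit` this stays positive no matter how small `p_works` gets, which suggests the decision procedure, not the probability estimate, is what's broken.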
Ok, it's an argument against a specific argument for cryonics. I'm ok with that (it was a bad argument for cryonics to start with). Cryonics does have a lot of problems, not least of which is cost. The money spent annually on life insurance premiums for cryopreservation of a ridiculously tiny segment of the population is comparable to the research budget for SENS which would benefit everybody. What is up with that. That said, I'm still signing up for Alcor. But I'm aware of the issues :\

On the other hand, de Grey and others who are primarily working on the scientific and/or engineering challenges of singularity and transhumanist technologies were far less likely to subject themselves to epistematic mistakes of significant consequences.

This part isn't clear to me. The researcher who goes into generic anti-cancer work, instead of SENS-style anti-aging work, probably has made an epistemic mistake with moderate consequences, because of basic replaceability arguments.

But to say that MIRI's approach to AGI safety is due to a philosophical mistake, and one with significant consequences, seems like it requires much stronger knowledge. Shooting very high instead of high is riskier, but not necessarily wronger.

Thankfully there is an institution that is doing that kind of work: the Future of Life Institute (not MIRI).

I think you underestimate how much MIRI agrees with FLI.

Why they do not get more play in the effective altruism community is beyond me.

SENS is the second largest part of my charity budget, and I recommend it to my friends every year (on the obvious day to do so). My speculations on why EAs don't favor them more highly mostly have to do with the difficulty of measuring progress in medical research vs. fighting illnesses, and possibly also the specter of selfishness.

I think you underestimate how much MIRI agrees with FLI.

Agreed - or, at least, he underestimates how much FLI agrees with MIRI. This is pretty obvious e.g. in the references section of the technical agenda that was attached to FLI's open letter. Out of a total of 95 references:

  • Six are MIRI's technical reports that've only been published on their website: Vingean Reflection, Realistic World-Models, Value Learning, Aligning Superintelligence, Reasoning Under Logical Uncertainty, Toward Idealized Decision Theory
  • Five are written by MIRI's staff or Research Associates: Avoiding Unintended AI Behaviors, Ethical Artificial Intelligence, Self-Modeling Agents and Reward Generator Corruption, Program Equilibrium in the Prisoner's Dilemma, Corrigibility
  • Eight are ones that tend to agree with MIRI's stances and which have been cited in MIRI's work: Superintelligence, Superintelligent Will, Singularity A Philosophical Analysis, Speculations concerning the first ultraintelligent machine, The nature of self-improving AI, Space-Time Embedded Intelligence, FAI: the Physics Challenge, The Coming Technological Singularity

That's 19/95 (20%) references produced either directly by MIRI or people closely associated with them, or that have MIRI-compatible premises.

I think you and Vaniver both misunderstood my endorsement of FLI. I endorse them not because of their views on AI risk, which are in line with MIRI's and, in my opinion, entirely misguided. But the important question is not what you believe, but what you do about it. Despite those views they are still willing to fund practical, evidence-based research into artificial intelligence, engaging with the existing community rather than needlessly trying to reinvent the field.
To be clear, this is not an intended implication. I'm aware that Yudkowsky supports SENS, and indeed my memory is fuzzy, but it might have been through exactly the letter you quote that I first heard about SENS.
Just out of curiosity, what day is that? Both Christmas and April 15th came to mind.

My birthday. It is both when one is supposed to be celebrating aging / one's continued survival, and when one receives extra attention from others.


Oh, that's a great idea. I'm going to start suggesting that people who ask donate to one of my favorite charities on my birthday. It beats saying I don't need anything, which is what I currently do.

Consider also doing an explicit birthday fundraiser. I did one on my most recent birthday and raised $500 for charitable causes.

Recently I have realized that the underlying cause runs much deeper: what is taught by the sequences is a form of flawed truth-seeking (thought experiments favored over real world experiments) which inevitably results in errors, and the errors I take issue with in the sequences are merely examples of this phenomenon.

I guess I'm not sure how these concerns could possibly be addressed by any platform meant for promoting ideas. You cannot run a lab in your pocket. You can have citations to evidence found by people who do run labs...but that's really all you can do. Everything else must necessarily be a thought experiment.

So my question is: can you envision a better version, and what would be some of the ways that it would be different? (Because if you can, it ought to be created.)


On a personal level, I am no longer sure engagement with such a community is a net benefit.

I am simply treating LW as a 120+ IQ version of Reddit. Just generic discussion with mainly bright folks. The point is, I don't know any other. I used to know Digg and MetaFilter and they are not much better either. If we could make a list of cerebral discussion boards, forums, and suchlike, that would be a good idea I guess. Where do you expect to hang out in the future?


I'll probably spend less time hanging out in online communities, honestly.

When you mention IQ, Mensa seems like an obvious answer. They are already filtering people for IQ, so it could be worth: a) finding out whether they have an online forum for members, and if not, then b) creating such a forum. Obviously, the important thing is not merely filtering of the users, but also moderation mechanisms; otherwise you can get a few "high IQ idiots" spamming the whole system with conspiracy theories or flame wars. Unfortunately, you would probably realize that IQ is not the most important thing.

Unfortunately, you would probably realize that IQ is not the most important thing.

Precisely. The Mensa members I know are too interested in rather pointless brain-teasers.

I generally agree with your position on the Sequences, but it seems to me that it is possible to hang around this website and have meaningful discussions without worshiping the Sequences or Eliezer Yudkowsky. At least it works for me.
As for being a highly involved/high status member of the community, especially the offline one, I don't know.

Anyway, regarding the point about super-intelligence that you raised, I charitably interpret the position of the AI-risk advocates not as the claim that super-intelligence would be in principle outside the scope of human scientific inquiry, but as the claim that a super-intelligent agent would be more efficient at understanding humans than humans would be at understanding it, giving the super-intelligent agent an edge over humans.

I think that the AI-risk advocates tend to exaggerate various elements of their analysis: they probably underestimate time to human-level AI and time to super-human AI, they may overestimate the speed and upper bounds to recursive self-improvement (their core arguments based on exponential growth seem, at best, unsupported).

Moreover, it seems that they tend to conflate super-intelligence with a sort of near-omniscience...

I think that the AI-risk advocates tend to exaggerate various elements of their analysis: they probably underestimate time to human-level AI and time to super-human AI

It's worth keeping in mind that AI-risk advocates tend to be less confident that AGI is nigh than the top-cited scientists within AI are. People I know at MIRI and FHI are worried about AGI because it looks like a technology that's many decades away, but one whose associated safety technologies are even further away.

That's consistent with the possibility that your criticism could turn out to be right. It could be that we're less wrong than others on this metric and yet still very badly wrong in absolute terms. To make a strong prediction in this area is to claim to already have a pretty good computational understanding of how general intelligence works.

Moreover, it seems that they tend to conflate super-intelligence with a sort of near-omniscience: They seem to assume that a super-intelligent agent will be a near-optimal Bayesian reasoner

Can you give an example of a statement by a MIRI researcher that is better predicted by 'X is speaking of the AI as a near-optimal Bayesian' than by 'X is speaking of the...

Cite? I think I remember Eliezer Yudkowsky and Luke Muehlhauser going for the usual "20 years from now" (in 2009) time to AGI prediction. By contrast Andrew Ng says "Maybe hundreds of years from now, maybe thousands of years from now". Maybe they are not explicitly saying "near-optimal", but it seems to me that they are using models like Solomonoff Induction and AIXI as intuition pumps, and they are getting these beliefs of extreme intelligence from there. Anyway, do you disagree that MIRI in general expects the kind of low-data, low-experimentation, prior-driven learning that I talked about to be practically possible?

Maybe they are not explicitly saying "near-optimal", but it seems to me that they are using models like Solomonoff Induction and AIXI as intuition pumps, and they are getting these beliefs of extreme intelligence from there.

I don't think anyone at MIRI arrived at worries like 'AI might be able to deceive their programmers' or 'AI might be able to design powerful pathogens' by staring at the equation for AIXI or AIXItl. AIXI is a useful idea because it's well-specified enough to let us have conversations that are more than just 'here are my vague intuitions vs. your vague-intuitions'; it's math that isn't quite the right math to directly answer our questions, but at least gets us outside of our own heads, in much the same way that an empirical study can be useful even if it can't directly answer our questions.

Investigating mathematical and scientific problems that are near to the philosophical problems we care about is a good idea, when we still don't understand the philosophical problem well enough to directly formalize or test it, because it serves as a point of contact with a domain that isn't just 'more vague human intuitions'. Historically this has often been a goo...

In his quantum physics sequence, where he constantly talks (rants, actually) about Solomonoff Induction, Yudkowsky writes: Anna Salamon also mentions AIXI when discussing the feasibility of super-intelligence. Mind you, I'm not saying that AIXI is not an interesting and possibly theoretically useful model; my objection is that MIRI people seem to have used it to set a reference class for their intuitions about super-intelligence.

Extrapolation is always an epistemically questionable endeavor. Intelligence is intrinsically limited by how predictable the world is. Efficiency (time complexity/space complexity/energy complexity/etc.) of algorithms for any computational task is bounded. Hardware resources also have physical limits. This doesn't mean that given our current understanding we can claim that human-level intelligence is an upper bound. That would most likely be false. But there is no particular reason to assume that the physically attainable bound will be enormously higher than human-level. The more extreme the scenario, the less probability we should assign to it, reasonably according to a light-tailed distribution.

Ok, but my point is that it has not been established that progress in mathematics will automatically grant an AI "superpowers" in the physical world. And I'd even say that even superpowers by raw cognitive power alone are questionable. Theorem proving can be sped up, but there is more to math than theorem proving.
Rob Bensinger (9y):
I think Eliezer mostly just used "Bayesian superintelligence" as a synonym for "superintelligence." The "Bayesian" is there to emphasize the fact that he has Bayes-optimality as a background idea in his model of what-makes-cognition-work and what-makes-some-cognition-work-better-than-other-kinds, but Eliezer thought AI could take over the world long before he knew about AIXI or thought Bayesian models of cognition were important.

I don't know as much about Anna's views. Maybe she does assign more weight to AIXI as a source of data; the example you cited supports that. Though since she immediately follows up her AIXI example with "AIXI is a theoretical toy. How plausible are smarter systems in the real world?" and proceeds to cite some of the examples I mentioned above, I'm going to guess she isn't getting most of her intuitions about superintelligence from AIXI either.

I think our disagreement is about what counts as "extreme" or "extraordinary", in the "extraordinary claims require extraordinary evidence" sense. If I'm understanding your perspective, you think we should assume at the outset that humans are about halfway between 'minimal intelligence' and 'maximal intelligence' -- a very uninformed prior -- and we should then update only very weakly in the direction of 'humans are closer to the minimum than to the maximum'. Claiming that there's plenty of room above us is an 'extreme' claim relative to that uninformed prior, so the epistemically modest thing to do is to stick pretty close to the assumption that humans are 'average', that the range of intelligence exhibited in humans with different cognitive abilities and disabilities represents a non-tiny portion of the range of physically possible intelligence.

My view is that we should already have updated strongly away from the 'humans are average' prior as soon as we acquired information about how humans arose -- through evolution, a process that has computational resources and perseverance but no engineering i
This is actually completely untrue, and is an example of a typical misconception about programming - which is far closer to engineering than math. Every time you compile a program, you are physically testing a theory, exactly equivalent to building and testing a physical machine. Every single time you compile and run a program. If you speed up an AI - by speeding up its mental algorithms or giving it more hardware - you actually slow down the subjective speed of the world and all other software systems in exact proportion. This has enormous consequences - some of which I explored here and here. Human brains operate at 1000 Hz or less, which suggests that a near-optimal (in terms of raw speed) human-level AGI could run at 1 million X time dilation. However, that would effectively mean that the computers the AGI had access to would be subjectively slower by 1 million times - so if it's compiling code for 10 GHz CPUs, those subjectively run at 10 kilohertz.
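The ratio in that last step can be sketched directly; the figures are the comment's own order-of-magnitude guesses, not measured values:

```python
# Sketch of the subjective-speed argument: if a human-level AGI's cognition
# is sped up by some factor, every external system appears slower to it by
# that same factor. All figures are rough guesses from the comment above.

BRAIN_RATE_HZ = 1_000      # rough ceiling on neural firing rate (~1 kHz)
SPEEDUP = 1_000_000        # hypothesized "near optimal" speedup factor

def subjective_rate_hz(actual_hz, speedup=SPEEDUP):
    """Clock rate of an external system as experienced by the sped-up AGI."""
    return actual_hz / speedup

cpu_hz = 10e9              # a 10 GHz CPU
print(subjective_rate_hz(cpu_hz))  # 10 GHz / 1e6 -> 10000.0 Hz, i.e. 10 kHz
```

The point of the sketch is only that the speedup and the slowdown are the same ratio, so a sped-up AGI cannot escape the subjective sluggishness of its own tools.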


Müller and Bostrom's 2014 'Future progress in artificial intelligence: A survey of expert opinion' surveyed the 100 top-cited living authors in Microsoft Academic Search's "artificial intelligence" category, asking the question:

Define a "high-level machine intelligence" (HLMI) as one that can carry out most human professions at least as well as a typical human. [...] For the purposes of this question, assume that human scientific activity continues without major negative disruption. By what year would you see a (10% / 50% / 90%) probability for such HLMI to exist?

29 of the authors responded. Their median answer was a 10% probability of HLMI by 2024, a 50% probability of HLMI by 2050, and a 90% probability by 2070.

(This excludes those who said "never"; I can't find info on whether any of the authors gave that answer, but in pooled results that also include 141 people from surveys of a "Philosophy and Theory of AI" conference, an "Artificial General Intelligence" conference, an "Impacts and Risks of Artificial General Intelligence" conference, and members of the Greek Association for Artificial Intelligence, 1.2%...
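For a rough sense of what those three quantiles imply for intermediate years, one can linearly interpolate between the reported (year, probability) points. The survey reports only the three quantiles, so everything between them is an illustrative assumption, not the survey's own claim:

```python
# Linear interpolation between the survey's reported median quantiles:
# 10% by 2024, 50% by 2050, 90% by 2070 (median answers of 29 respondents).
# The straight lines between these points are purely an assumption.

points = [(2024, 0.10), (2050, 0.50), (2070, 0.90)]

def implied_probability(year):
    """Crude interpolated probability of HLMI arriving by a given year."""
    if year <= points[0][0]:
        return points[0][1]
    if year >= points[-1][0]:
        return points[-1][1]
    for (y0, p0), (y1, p1) in zip(points, points[1:]):
        if y0 <= year <= y1:
            return p0 + (p1 - p0) * (year - y0) / (y1 - y0)

print(implied_probability(2037))  # halfway from 2024 to 2050 -> 0.30
```

A real elicitation would fit a distribution rather than straight lines, but the sketch is enough to read off roughly what the pooled medians imply for any given year.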

Of course, there is a huge problem with expert surveys - at the meta-level they have a very poor predictive track record. There is the famous example that Stuart Russell likes to cite, where Rutherford said "anyone who looked for a source of power in the transformation of the atoms was talking moonshine" - the day before Leo Szilard conceived of the nuclear chain reaction. There is also the similar example of the Wright brothers - some unknown guys without credentials claim to have cracked aviation just when recognized experts like Langley have failed in a major way and respected scientists such as Lord Kelvin claim the whole thing is impossible. The Wright brothers then report their first successful manned flight and no newspaper will even publish it.

Maybe this is the community bias that you were talking about: the over-reliance on abstract thought rather than evidence, projected onto a hypothetical future AI.

You nailed it. (Your other points too.)

The claim [is] that a super-intelligent agent would be more efficient at understanding humans than humans would be at understanding it, giving the super-intelligent agent an edge over humans.

The problem here is that intelligence is not some linear scale, even general intelligence. We human beings are insanely optimized for social intelligence in a way that is not easy for a machine to learn to replicate, especially without detection. It is possible for a general AI to be powerful enough to provide meaningful acceleration of molecular nanotechnology and medical science research whilst being utterly befuddled by social conventions and generally how humans think, simply because it was not programmed for social intelligence.

Anyway, as much as they exaggerate the magnitude and urgency of the issue, I think that the AI-risk advocates have a point when they claim that keeping a system much more intelligent than ourselves under control would be a non-trivial problem.

There is however a subs...

Agree. I think that since many AI risk advocates have little or no experience in computer science and specifically AI research, they tend to anthropomorphize AI to some extent. They get that an AI could have goals different than human goals, but they seem to think that its intelligence would be more or less like human intelligence, only faster and with more memory. In particular they assume that an AI will easily develop a theory of mind and social intelligence from little human interaction. I think they used to claim that safe AGI was pretty much an impossibility unless they were the ones who built it, so gib monies plox! Anyway, it seems that in recent times they have taken a somewhat less heavy handed approach.

Thanks for taking the time to explain your reasoning, Mark. I'm sorry to hear you won't be continuing the discussion group! Is anyone else here interested in leading that project, out of curiosity? I was getting a lot out of seeing people's reactions.

I think John Maxwell's response to your core argument is a good one. Since we're talking about the Sequences, I'll note that this dilemma is the topic of the Science and Rationality sequence:

In any case, right now you've got people dismissing cryonics out of hand as "not scientific", like it was some kind of pharmaceutical you could easily administer to 1000 patients and see what happened. "Call me when cryonicists actually revive someone," they say; which, as Mike Li observes, is like saying "I refuse to get into this ambulance; call me when it's actually at the hospital". Maybe Martin Gardner warned them against believing in strange things without experimental evidence. So they wait for the definite unmistakable verdict of Science, while their family and friends and 150,000 people per day are dying right now, and might or might not be savable—

—a calculated bet you could only make rationally [i.e., u

...
No, MWI is not unusually difficult to test. It is untestable.
Rob Bensinger (9y):
That's not true. (Or, at best, it's misleading for present purposes.)

First, it's important to keep in mind that if MWI is "untestable" relative to non-MWI, then non-MWI is also "untestable" relative to MWI. To use this as an argument against MWI, you'd need to talk specifically about which hypothesis MWI is untestable relative to; and you would then need to cite some other reason to reject MWI (e.g., its complexity relative to the other hypothesis, or its failures relative to some third hypothesis that it is testable relative to). With that in mind:

  • 1 - MWI is testable insofar as QM itself is testable. We normally ignore this fact because we're presupposing QM, but it's important to keep in mind if we're trying to make a general claim like 'MWI is unscientific because it's untestable and lacks evidential support'. MWI is at least as testable as QM, and has at least as much supporting evidence.

  • 2 - What I think people really mean to say (or what a steel-manned version of them would say) is that multiverse-style interpretations of QM are untestable relative to each other. This looks likely to be true, for practical purposes, when we're comparing non-collapse interpretations: Bohmian Mechanics doesn't look testable relative to Many Threads, for example. (And therefore Many Threads isn't testable relative to Bohmian Mechanics, either.) (Of course, many of the things we call "Many Worlds" are not fully fleshed out interpretations, so it's a bit risky to make a strong statement right now about what will turn out to be testable in the real world. But this is at least a commonly accepted bit of guesswork on the part of theoretical physicists and philosophers of physics.)

  • 3 - But, importantly, collapse interpretations generally are empirically distinguishable from non-collapse interpretations. So even though non-collapse interpretations are generally thought to be 'untestable' relative to each other, they are testable relative to collapse interpretations. (An
I don't have an argument against MWI specifically, no.

No, that is not how it works: I don't need to either accept or reject MWI. I can also treat it as a causal story lacking empirical content. Nothing wrong with such stories, they are quite helpful for understanding systems. But not a part of science.

By that logic, if I invent any crazy hypothesis in addition to an empirically testable theory, then it inherits testability just on those grounds. You can do that with the word "testability" if you want, but that seems to be not how people use words. If some smart Catholic says that evolution is how God unfolds creation when it comes to living systems, then any specific claims we can empirically check pertaining to evolution (including those that did not pan out, and required repairs of evolutionary theory) also somehow are relevant to the Catholic's larger hypothesis? I suppose that is literally true, but silly. There is no empirical content to what this hypothetical Catholic is saying, over and above the actual empirical stuff he is latching his baggage onto. I am not super interested in having Catholic theologians read about minimum descriptive complexity, and then weaving a yarn about their favorite hypotheses based on that.

I like money! I am happy to discuss bet terms on this.

Yes, if you have an interpretation that gives different predictions than QM, then yes that will render that interpretation falsifiable of course (and indeed some were). That is super boring, though, and not what this argument is about. But also, I don't see what falsifiability of X has to do with falsifiability of Y, if X and Y are different. Newtonian mechanics is both falsifiable and falsified, but that has little to do with falsifiability of any story fully consistent with QM predictions.

----------------------------------------

My personal take on MWI is I want to waste as little energy as possible on it and arguments about it, and actually go read Feynman instead. (This is not
Rob Bensinger (9y):
To say that MWI lacks empirical content is also to say that the negation of MWI lacks empirical content. So this doesn't tell us, for example, whether to assign higher probability to MWI or to the disjunction of all non-MWI interpretations.

Suppose your ancestors sent out a spaceship eons ago, and by your calculations it recently traveled so far away that no physical process could ever cause you and the spaceship to interact again. If you then want to say that 'the claim the spaceship still exists lacks empirical content,' then OK. But you will also have to say 'the claim the spaceship blipped out of existence when it traveled far enough away lacks empirical content'. And there will still be some probability, given the evidence, that the spaceship did vs. didn't blip out of existence; and just saying 'it lacks empirical content!' will not tell you whether to design future spaceships so that their life support systems keep operating past the point of no return.

There's no ambiguity if you clarify whether you're talking about the additional crazy hypothesis, vs. talking about the conjunction 'additional crazy hypothesis + empirically testable theory'. Presumably you're imagining a scenario where the conjunction taken as a whole is testable, though one of the conjuncts is not. So just say that.

Sean Carroll summarizes collapse-flavored QM as the conjunction of these five claims: Many-worlds-flavored QM, on the other hand, is the conjunction of 1 and 2, plus the negation of 5 -- i.e., it's an affirmation of wave functions and their dynamics (which effectively all physicists agree about), plus a rejection of the 'collapses' some theorists add to keep the world small and probabilistic. (If you'd like, you could supplement 'not 5' with 'not Bohmian mechanics'; but for present purposes we can mostly lump Bohm in with multiverse interpretations, because Eliezer's blog series is mostly about rejecting collapse rather than about affirming a particular non-collapse view.)
Yes. Right. But I think it's a waste of energy to assign probabilities to assertions lacking empirical content, because you will not be updating anyways, and a prior without possibility of data is just a slightly mathier way to formulate "your taste." I don't argue about taste.

One can assume a reasonable model here (e.g. leave a copy of the spaceship in earth orbit, or have it travel in a circle in the solar system, and assume similar degradation of modules). Yes, you will have indirect evidence only applicable due to your model. But I think the model here would have teeth. Or we are thinking about the problem incorrectly and those are not exhaustive/mutually exclusive. Compare: "logically the electron must either be a particle or must not be a particle."

I am not explicitly attacking MWI, as I think I said multiple times. I am not even attacking having interpretations or preferring one over another for reasons such as "taste" or "having an easier time thinking about QM." I am attacking the notion that there is anything more to the preference for MWI than this.

----------------------------------------

To summarize my view: "testability" is about "empirical claims," not about "narratives." MWI is, by its very nature, a narrative about empirical claims. The list of empirical claims it is a narrative about can certainly differ from another list of empirical claims with another narrative. For example, we can imagine some sort of "billiard ball Universe" narrative around Newtonian physics. But I would not say "MWI is testable relative to the Newtonian narrative", I would say "the list of empirical claims 'QM' is testable relative to the list of empirical claims 'Newtonian physics.'" The problem with the former statement is first it is a "type error," and second, there are infinitely many narratives around any list of empirical claims. You may prefer MWI for [reasons] over [infinitely long list of other narratives], but it seems like "argument about taste." What's
Rob Bensinger (8y):
Could you restate your response to the spaceship example? This seems to me to be an entirely adequate response to

Favoring simpler hypotheses matters, because if you're indifferent to added complexity when it makes no difference to your observations (e.g., 'nothing outside the observable universe exists') you may make bad decisions that impact agents that you could never observe, but that might still live better or worse lives based on what you do. This matters when you're making predictions about agents far from you in space and/or time. MWI is a special case of the same general principle, so it's a useful illustrative example even if it isn't as important as those other belief-in-the-implied-invisible scenarios.

Collapse and non-collapse interpretations are empirically distinguishable from each other. I've been defining 'QM' in a way that leaves it indifferent between collapse and non-collapse -- in which case you can't say that the distinction between bare QM and MWI is just a matter of taste, because MWI adds the testable claim that collapse doesn't occur. If you prefer to define 'QM' so that it explicitly rejects collapse, then yes, MWI (or some versions of MWI) is just a particular way of talking about QM, not a distinct theory. But in that case collapse interpretations of QM are incompatible with QM itself, which seems like a less fair-minded way of framing a foundations-of-physics discussion.
You are not engaging with my claim that testability is a property of empirical claims, not narratives. Not sure there is a point to continue until we resolve the disagreement about the possible category error here.

----------------------------------------

There is another weird thing where you think we test claims against other claims, but actually we test against Nature. If Nature says your claim is wrong, it's falsified. If there is a possibility of Nature saying that, it's falsifiable. You don't need a pair of claims here. Testability is not a binary relation between claims. But that's not central to the disagreement.
Rob Bensinger (8y):
Why do you think collapse interpretations are 'narratives', and why do you think they aren't empirical claims? Regarding testability: if you treat testability as an intrinsic feature of hypotheses, you risk making the mistake of thinking that if there is no test that would distinguish hypothesis A from hypothesis B, then there must be no test that could distinguish hypothesis A from hypothesis C. It's true that you can just speak of a test that's better predicted by hypothesis 'not-A' than by hypothesis A, but the general lesson that testability can vary based on which possibilities you're comparing is an important one, and directly relevant to the case we're considering.
There are two issues: what I view as non-standard language use, and what I view as a category error.

You can use the word 'testability' to signify a binary relation, but that's not what people typically mean when they use that word. They typically mean "possibility Nature can tell you that you are wrong." So when you responded many posts back with the claim "MWI is hard to test" you were using the word "test" in a way probably no one else in the thread is using. You are not wrong, but you will probably miscommunicate.

----------------------------------------

An empirical claim has this form: "if we do experiment A, we will get result B." Nature will sometimes agree, and sometimes not, and give you result C instead. If you have a list of such claims, you can construct a "story" about them, like MWI, or something else. But adding the "story" is an extra step, and what Nature is responding to is not the story but the experiment. The mapping from stories to lists of claims is always always always many to one.

If you have [story1] about [list1] and [story2] about [list2], and Nature agrees with [list1], and disagrees with [list2], then you will say: "story2 was falsified, story1 was falsifiable but not falsified." I will say: "list2 was falsified, list1 was falsifiable but not falsified." What's relevant here isn't the details of story1 or story2, but what's in the lists.

When I say "MWI is untestable" what I mean is: "There is a list of empirical claims called 'quantum mechanics.' There is a set of stories about this list, one of which is MWI. There is no way to tell these stories apart empirically, so you pick the one you like best for non-empirical reasons."

When you say "MWI is testable" what I think you mean is: "There are two lists of empirical claims, called 'quantum mechanics' and 'quantum mechanics prime,' a story 'story 1' about the former, and a story 'story 2' about the latter. Nature will agree with the list 'quantum mechanics' and disagree with th
At one point I started developing a religious RPG character who applied theoretical computer science to his faith. I forget the details, but among other things he believed that although the Bible prescribed the best way to live, the world is far too complex for any finite set of written rules to cover every situation. The same limitation applies to human reason: cognitive science and computational complexity theory have shown all the ways in which we are bounded reasoners, and can only ever hope to comprehend a small part of the whole world. Reason works best when it can be applied to constrained problems where a clear objective answer can be found, but it easily fails once the number of variables grows. Thus, because science has shown that both the written word of the Bible and human reason are fallible and easily lead us astray (though the word of the Bible is less likely to do so), the rational course of action for one who believes in science is to pray to God for guidance and trust the Holy Spirit to lead us to the right choices.
Plus 6: There is a preferred basis.
In so far as I understand what the "preferred basis problem" is actually supposed to be, the existence of a preferred basis seems to me to be not an assumption necessary for Everettian QM to work but an empirical fact about the world; if it were false then the world would not, as it does, appear broadly classical when one doesn't look too closely. Without a preferred basis, you could still say "the wavefunction just evolves smoothly and there is no collapse"; it would no longer be a useful approximation to describe what happens in terms of "worlds", but for the same reason you could not e.g. adopt a "collapse" interpretation in which everything looks kinda-classical on a human scale apart from random jumps when "observations" or "measurements" happen. The world would look different in the absence of a preferred basis. But I am not very expert on this stuff. Do you think the above is wrong, and if so how?
I think it's being used as an argument against beliefs paying rent. Since there is more than one interpretation of QM, empirically testing QM does not prove any one interpretation over the others. Whatever extra arguments are used to support a particular interpretation over the others are not going to be, and have not been, empirical. No, they are not, by the very meaning of the word "interpretation"; but collapse theories, such as GRW, might be.
Which is one of the ways in which beliefs that don't pay rent do pay rent.
Yes, I'm familiar with the technical agenda. What do you mean by "forecasting work" -- AI Impacts? That seems to be of near-zero utility to me. What MIRI should be doing, what I've advocated MIRI do from the start, and on which I can't get a straight answer as to why they are not doing it that doesn't ultimately terminate in references to the more speculative sections of the sequences I take issue with, is this: build artificial general intelligence and study it. Not a provably-safe-from-first-principles-before-we-touch-a-single-line-of-code AGI. Just a regular, run-of-the-mill AGI using any one of the architectures presently being researched in the artificial intelligence community. Build it and study it.
Rob Bensinger
A few quick concerns: * The closer we get to AGI, the more profitable further improvements in AI capabilities become. This means that the more we move the clock toward AGI, the more likely we are to engender an AI arms race between different nations or institutions, and the more (apparent) incentives there are to cut corners on safety and security. At the same time, AGI is an unusual technology in that it can potentially be used to autonomously improve on our AI designs -- so that the more advanced and autonomous AI becomes, the likelier it is to undergo a speed-up in rates of improvement (and the likelier these improvements are to be opaque to human inspection). Both of these facts could make it difficult to put the brakes on AI progress. * Both of these facts also make it difficult to safely 'box' an AI. First, different groups in an arms race may simply refuse to stop reaping the economic or military/strategic benefits of employing their best AI systems. If there are many different projects that are near or at AGI-level when your own team suddenly stops deploying your AI algorithms and boxes them, it's not clear there is any force on earth that can compel all other projects to freeze their work too, and to observe proper safety protocols. We are terrible at stopping the flow of information, and we have no effective mechanisms in place to internationally halt technological progress on a certain front. It's possible we could get better at this over time, but the sooner we get AGI, the less intervening time we'll have to reform our institutions and scientific protocols. * A second reason speed-ups make it difficult to safely box an AGI is that we may not arrest its self-improvement in the (narrow?) window between 'too dumb to radically improve on our understanding of AGI' and 'too smart to keep in a box'. We can try to measure capability levels, but only using imperfect proxies; there is no actual way to test how hard it would be for an AGI to escape a box bey

Thank you for this.

I see you as highlighting a virtue that the current Art gestures toward but doesn't yet embody. And I agree with you, a mature version of the Art definitely would.

In his Lectures on Physics, Feynman provides a clever argument to show that when the only energy being considered in a system is gravitational potential energy, then the energy is conserved. At the end of that, he adds the following:

It is a very beautiful line of reasoning. The only problem is that perhaps it is not true. (After all, nature does not have to go along with our reasoning.) For example, perhaps perpetual motion is, in fact, possible. Some of the assumptions may be wrong, or we may have made a mistake in reasoning, so it is always necessary to check. It turns out experimentally, in fact, to be true.

This is such a lovely mental movement. Feynman deeply cared about knowing how the world really actually works, and it looks like this led him to a mental reflex where even in cases of enormous cultural confidence he still responds to clever arguments by asking "What does nature have to say?"

In my opinion, people in this community update too much on clever arguments. I include myself...

This argument is, however, nonsense. The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect of the natural world which has remained beyond the reach of human understanding, once a sufficient amount of evidence is available. The wave-particle duality of quantum physics, or the 11-dimensional space of string theory may defy human intuition, i.e. our built-in intelligence. But we have proven ourselves perfectly capable of understanding the logical implications of models which employ them. We may not be able to build intuition for how a super-intelligence thinks. Maybe—that's not proven either. But even if that is so, we will be able to reason about its intelligent behaviour in advance, just like string theorists are able to reason about 11-dimensional space-time without using their evolutionarily derived intuitions at all.

This may be retreating to the motte, so to speak, but I don't think anyone seriously thinks that a superintelligence would be literally impossible to understand. The worry is that there will be such a huge gulf between how superin...

Edit: I should add that this is already a problem for, ironically, computer-assisted theorem proving. If a computer produces a 10,000,000 page "proof" of a mathematical theorem (i.e., something far longer than any human could check by hand), you're putting a huge amount of trust in the correctness of the theorem-proving-software itself.

No, you just need to trust a proof-checking program, which can be quite small and simple, in contrast with the theorem proving program, which can be arbitrarily complex and obscure.
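To make the asymmetry concrete, here is a minimal sketch of a proof checker for Hilbert-style proofs, in which each line must be an axiom or follow from two earlier lines by modus ponens. The names and toy formula encoding are hypothetical, for illustration only; the point is that the prover that *finds* a proof can be arbitrarily complex, while the checker we actually have to trust fits in a few lines:

```python
# Formulas: atoms are strings; an implication A -> B is the tuple ("->", A, B).
def implies(a, b):
    return ("->", a, b)

def check_proof(axioms, proof, goal):
    """Check a Hilbert-style proof: every line must be an axiom, or follow
    from two earlier lines by modus ponens (from A and A -> B, conclude B).
    The final line must be the goal formula."""
    derived = []
    for formula in proof:
        ok = formula in axioms or any(
            premise in derived and implies(premise, formula) in derived
            for premise in derived
        )
        if not ok:
            return False
        derived.append(formula)
    return bool(derived) and derived[-1] == goal

# A two-step proof: from axioms p and p -> q, derive q by modus ponens.
axioms = ["p", implies("p", "q")]
proof = ["p", implies("p", "q"), "q"]
print(check_proof(axioms, proof, "q"))  # True
```

However many millions of lines a machine-generated proof has, the checker loop above is the entire trusted base; real proof assistants follow the same "small trusted kernel" design.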

Isn't using a laptop as a metaphor exactly an example of reasoning by analogy? I think one of the points trying to be made was that because we have this uncertainty about how a superintelligence would work, we can't accurately predict anything without more data. So maybe the next step in AI should be to create an "Aquarium," a self-contained network with no actuators and no way to access the internet, but enough processing power to support a superintelligence. We then observe what that superintelligence does in the aquarium before deciding how to resolve further uncertainties.
There is a difference between argument by analogy and using an example. The relevant difference here is that examples illustrate arguments that are made separately, like how calef spent paragraphs 4 and 5 restating the arguments sans laptop. If anything, the argument from analogy here is in the comparison between human working memory and computer RAM and a nebulous "size in mindspace," because it is used as an important part of the argument but is not supported separately. But don't fall for the fallacy fallacy - just because something isn't modus ponens doesn't mean it can't be Bayesian evidence.
The sentence could have stopped there. If someone makes a claim like "∀ x, p(x)", it is entirely valid to disprove it via "~p(y)", and it is not valid to complain that the first proposition is general but the second is specific. Moving from the general to the specific myself, that laptop example is perfect. It is utterly baffling to me that people can insist we will be able to safely reason about the safety of AGI when we have yet to do so much as produce a consumer operating system that is safe from remote exploits or crashes. Are Microsoft employees uniquely incapable of "fully general intelligent behavior"? Are the OpenSSL developers especially imperfectly "capable of understanding the logical implications of models"? If you argue that it is "nonsense" to believe that humans won't naturally understand the complex things they devise, then that argument fails to predict the present, much less the future. If you argue that it is "nonsense" to believe that humans can't eventually understand the complex things they devise after sufficient time and effort, then that's more defensible, but that argument is pro-FAI-research, not anti-.
Problems with computer operating systems do not cause arbitrary behavior in the absence of someone consciously using an exploit to produce it. If Windows were a metaphor for unfriendly AI, then it would be possible for AIs to halt in situations where they were intended to work, but they would only turn hostile if someone intentionally programmed them to become hostile. Unfriendly AI as discussed here is not someone intentionally programming the AI to become hostile.
Precisely correct, thank you for catching that. Also a correct reading of my intent. The "aquarium" idea is basically what I have advocated and would continue to advocate for: continue developing AGI technology within the confines of a safe experimental setup. By learning more about the types of programs which can perform limited general-intelligence tasks in sandbox environments, we learn more about their various strengths and limitations in context, and from that experience we can construct suitable safeguards for larger deployments.
That may be a valid concern, but it requires evidence, as it is not the default conclusion. Note that quantum physics is sufficiently different that human intuitions do not apply, but it does not take a physicist a "prohibitively long" time to understand quantum mechanical problems and their solutions. As to your laptop example, I'm not sure what you are attempting to prove. Even if no single engineer understands how every component of a laptop works, we are nevertheless very much able to reason about the systems-level operation of laptops, or the development trajectory of the global laptop market. When there are issues, we are able to debug them and fix them in context. If anything, the example shows how humanity as a whole is able to complete complex projects like the creation of a modern computational machine without being constrained to any one individual understanding the whole. Edit: gaaaah. Thanks, Sable. I fell for the very trap of reasoning by analogy I opined against. Habitual modes of thought are hard to break.
As far as I can tell, you're responding to the claim, "A group of humans can't figure out complicated ideas given enough time." But this isn't my claim at all. My claim is, "One or many superintelligences would be difficult to predict/model/understand because they have a fundamentally more powerful way to reason about reality." This is trivially true once the number of machines which are "smarter" than humans exceeds the total number of humans. The extent to which it is difficult to predict/model the "smarter" machines is a matter of contention. The precise number of "smarter" machines and how much "smarter" they need be before we should be "worried" is also a matter of contention. (How "worried" we should be is a matter of contention!) But all of these points of contention are exactly the sorts of things that people at MIRI like to think about.
Whatever reasoning technique is available to a super-intelligence is available to humans as well. No one is mandating that humans who build an AGI check their work with pencil and paper.
I mean, sure, but this observation (i.e., "We have tools that allow us to study the AI") is only helpful if your reasoning techniques allow you to keep the AI in the box. Which is, like, the entire point of contention, here (i.e., whether or not this can be done safely a priori). I think that you think MIRI's claim is "This cannot be done safely." And I think your claim is "This obviously can be done safely" or perhaps "The onus is on MIRI to prove that this cannot be done safely." But, again, MIRI's whole mission is to figure out the extent to which this can be done safely.
It's completely divorced from reality: How? By what mechanism? An artificial intelligence is not a magical oracle. It arrives at its own plan of action by some deterministic algorithm running on the data available to it. An intelligence that is not programmed for social awareness will not suddenly be able to outsmart, outthink, and outmaneuver its human caretakers the moment it crosses some takeoff boundary. Without being programmed to have such capability from the start, and without doing something stupid like connecting it directly to the Internet, how is an AI supposed to develop that capability on its own without a detectable process of data collection by trial and error? Nobody gets a free card to say “the AGI will simply outsmart people in every way.” You have to explain precisely how such capability would exist. So far, all that I've seen is unclear, hand-wavy arguments by analogy that are completely unsatisfactory in that regard. "Because super-intelligence!" is not an answer. We could, I don't know, pull the plug. How, unless it is given effectors in the real world? Why would we be stupid enough to do that? If we started with an ability to control it, how did we lose that ability? Turn it off. Take as long as you want to evaluate the data and make your decision. Then turn it back on again. Or not.

"Why would we be stupid enough to do that?" For the same reason we give automatic trading software "effectors" to make trades on real-world markets. For the same reason we have robot arms in factories assembling cars. For the same reason Google connects its machine learning algorithms directly to the internet. BECAUSE IT IS PROFITABLE.

People don't want to build an AI just to keep it in a box. People want AI to do stuff for them, and in order for it to be profitable, they will want AI to do stuff faster and more effectively than a human. If it's not worrying to you because you think people will be cautious, and not give the AI any ability to affect the world, and be instantly ready to turn off their multi-billion dollar research project, you are ASSUMING a level of caution that MIRI is trying to promote! You've already bought their argument, and think everyone else has!

What type of evidence would make you think it's more likely that a self-modifying AGI could "break out of the box"? I want to understand your resistance to thought experiments. Are all thought experiments detached from reality in your book? Are all analogies detached from reality? Would you ever feel like you understood something better, and thus change your views, because of an analogy or story? How could something like this be different in a way that you would theoretically find persuasive? Perhaps you're saying that people's confidence is too strong when based only on analogy? I was going to try and address your comment directly, but thought it'd be a good idea to sort that out first, because of course there are no studies of how AGIs behave.
What lesson am I supposed to learn from "That Alien Message"? It's a work of fiction. You do not generalize from fictional evidence. Maybe I should write a story about how slow a takeoff would be given the massive inefficiencies of present technology, all the trivial and mundane ways an AI in the midst of a takeoff would get tripped up and caught, and all the different ways redundant detection mechanisms, honey pots, and fail safe contraptions would prevent existential risk scenarios? But such a work of fiction would be just as invalid as evidence.
Ok, I'm still confused as to many of my questions, but let me see if this bit sounds right: the only parameter via which something like "That Alien Message" could become more persuasive to you is by being less fictional. Fictional accounts of anything will NEVER cause you to update your beliefs. Does that sound right? If that's right, then I want to suggest why such things should sometimes be persuasive. A perfect Bayesian reasoner with finite computational ability operates not just with uncertainty about the outside world, but also with logical uncertainty as to the consequences of their beliefs. So as humans, we operate with at least that difficulty when dealing with our own beliefs. In practice we deal with much, much worse. I believe the correct form of the deduction you're trying to make is "don't add a fictional story to the reference class of a real analogue for purposes of figuring your beliefs", and I agree. However, there are other ways a fictional story can be persuasive and should (in my view) cause you to update your beliefs:

* It illustrates a new correct deduction which you weren't aware of before, whose consequences you then begin working out.
* It reminds you of a real experience you've had, which was not present in your mind before, whose existence then figures into your reference classes.
* It changes your emotional attitude toward something, indirectly changing your beliefs by causing you to reflect on that thing differently in the future.

Some of these are subject to biases which would need correcting to move toward better reasoning, but I perceive you as claiming that these should have no impact, ever. Am I interpreting that correctly (I'm going to guess that I'm not, somewhere), and if so, why do you think that?
I think it's a pretty big assumption to assume that fictional stories typically do those things correctly. Fictional stories are, after all, produced by people with agendas. If the proportion of fictional stories with plausible but incorrect deductions, reminders, or reflections is big enough, even your ability to figure out which ones are correct might not make it worthwhile to use fiction this way. (Consider an extreme case where you can correctly assess 95% of the time whether a fictional deduction, reminder, or reflection is correct, but they are incorrect at a 99% rate. You'd have about a 4/5 chance of being wrong if you update based on fiction.)
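The arithmetic in that parenthetical can be checked directly with Bayes' rule, using the numbers from the example (95% assessment accuracy, 99% base rate of incorrect lessons):

```python
# Bayes' rule on the fiction example: even a 95%-accurate filter, applied
# to a pool that is 99% incorrect, mostly passes incorrect lessons.
p_incorrect = 0.99           # base rate: 99% of fictional "lessons" are incorrect
p_correct = 1 - p_incorrect  # 0.01

p_pass_given_correct = 0.95    # a correct lesson is (rightly) accepted 95% of the time
p_pass_given_incorrect = 0.05  # an incorrect lesson is (wrongly) accepted 5% of the time

# Total probability that a lesson passes the filter:
p_pass = (p_pass_given_correct * p_correct
          + p_pass_given_incorrect * p_incorrect)

# Posterior probability that a lesson you accepted is still incorrect:
p_incorrect_given_pass = p_pass_given_incorrect * p_incorrect / p_pass
print(round(p_incorrect_given_pass, 2))  # 0.84, i.e. roughly the "4/5" quoted above
```

So the "about 4/5 chance of being wrong" figure checks out (the exact posterior is about 0.84).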
Agreed; you'd have to figure all of that out separately. For what it's worth, given the selection of fictional stories I'm usually exposed to and decide to read, I think they're generally positive value (though probably not the best in terms of opportunity cost.)
If a story or thought experiment prompts you to think of some existing data you hadn't paid attention to, and realize that data was not anticipated by your present beliefs, then that data acts as evidence for updating beliefs. The story or thought experiment was merely a reference used to call attention to that data. "Changing your emotional attitude," as far as I can tell, is actually cache-flushing. It does not change your underlying beliefs; it just trains your emotional response to align with those beliefs, eliminating inconsistencies in thought. I'm not sure where "That Alien Message" is supposed to lie in either of those two categories. It makes no reference to actual experimental data which I may not have been paying attention to, nor do I detect any inconsistency it is unraveling. Rather, it makes a ton of assumptions and then runs with those assumptions, when in fact those assumptions were not valid in the first place. It's a cathedral built on sand.
Basically agreed on paragraph 1, but I do want to suggest that we then not say "I will never update on fictional stories." Taken naively, you might then avoid fictional stories because they're useless ("I never update on them!"), when of course they might be super useful if they cause you to pull up relevant experiences quite often. I'll give an example of how "That Alien Message" could do for me what I illustrated in my first bullet point. I think, "Oh, it seems very unlikely that an AI could break out of a box, you just have this shutoff switch and watch it closely and ...". Then "That Alien Message" suggests the thought experiment of "instead of generalizing over all AI, imagine just a highly specific type of AI that may or may not reasonably come to exist: a bunch of really sped-up, smart humans." Then it sparks my thinking about what a bunch of really sped-up, smart humans could accomplish with even narrow channels of communication. Then I think, "actually, though I've seen no new instances of the AI reference class in reality, I now reason differently about how a possible AI could behave, since that class (of possibilities) includes the thing I just thought of. Until I get a lot more information about how that class behaves in reality, I'm going to be a lot more cautious." By picking out a specific possible example, it illustrates that my thinking around the possible-AI reference class wasn't expansive enough. This could help break through, for example, an availability heuristic: when I think of a random AI, I think of my very concrete vision of how such an AI would behave, instead of really trying to think about what could lie in that huge space. Perhaps you are already appropriately cautious, and this story sparks/sparked no new thoughts in you, or you have a good reason to believe that communities of sped-up humans or anything at least as powerful are excluded from the reference space, or the reference space you care about is narrower, but it seemed
Creative / clever thinking is good. It's where new ideas come from. Practicing creative thinking by reading interesting stories is not a waste of time. Updating based on creative / clever thoughts, on the other hand, is a mistake. The one almost-exception I can think of is "X is impossible!" where a clever plan for doing X, even if not actually implemented, suffices as weak evidence that "X is impossible!" is false. Or rather, it should propagate up your belief hierarchy and make you revisit why you thought X was impossible in the first place. Because the two remaining options are: (1) you were mistaken about the plausibility of X, or (2) this clever new hypothetical is not so clever--it rests on hidden assumptions that turn out to be false. Either way you are stuck testing your own assumptions and/or the hypothesis' assumptions before making that ruling. The trouble is, most people tend to just assume (1). I don't know if there is a name for this heuristic, but it does lead to bias.
Your argument rests on trying to be clever, which Mark rejected as a means of gathering knowledge. Do you have empirical evidence that there are cases where people did well by updating after reading fictional stories? Are there any studies that suggest that people who update through fictional stories do better?
This seems promising! Studies, no. I can't imagine studies existing today that resolve this, which of course is a huge failure of imagination: that's a really good thing to think about. For anything high-level enough, I expect to run into problems with "do better", such as "do better at predicting the behavior of AGI" not being an accessible category. I would be very excited if there were nearby categories that we could get our hands on, though; I expect this is similar to the problem of developing and testing a notion of "rationality quotient" and proving its effectiveness. I'm not sure what you're referring to with Mark rejecting cleverness as a way of gathering knowledge, but I think we may be arguing about what the human equivalent of logical uncertainty looks like. What's the difference in this case between "cleverness" and "thinking"? (Also, could you point me to the place you were talking about?) I guess I usually think of cleverness, with the negative connotation, as "thinking in too much detail with a blind spot". So could you say which part of my response you think is bad thinking, or what you instead mean by cleverness?
It's detached from empirical observation. It rests on the assumption that one can gather knowledge by reasoning itself (i.e. being clever).
I see. I do think you can update based on thinking; the human analogue of the logical uncertainty I was talking about. As an aspiring mathematician, this is what I think the practice of mathematics looks like, for instance. I understand the objection that this process may fail in real life, or lead to worse outcomes, since our models aren't purely formal and our reasoning isn't either purely deductive or optimally Bayesian. It looks like some others have made some great comments on this article also discussing that. I'm just confused about which thinking you're considering bad. I'm sure I'm not understanding, because it sounds to me like "the thinking which is thinking, and not direct empirical observation." There's got to be some level above direct empirical observation, or you're just an observational rock. The internal process you have which at any level approximates Bayesian reasoning is a combination of your unconscious processing and your conscious thinking. I'm used to people picking apart arguments. I'm used to heuristics that say, "hey, you've gone too far with abstract thinking here, and here's an empirical way to settle it," or "here's an argument for why your abstract thinking has gone too far, and you should wait for empirical evidence or do X to seek some out." But I'm not used to "your mistake was abstract thinking at all; you can do nothing but empirically observe to gain a new state of understanding," at least with regard to things like this. I feel like I'm caricaturing, but there's a big blank when I try to figure out what else is being said.
There are two ways you can do reasoning. 1) You build a theory about Bayesian updating and how it should work. 2) You run studies of how humans reason and when they reason successfully; you identify when and how humans reason correctly. If I argued that taking a specific drug helps with illness X, the only argument you would likely accept is an empirical study. That's independent of whether or not you can find a flaw in the causal reasoning behind why I think drug X should help with an illness. At least if you believe in evidence-based medicine. The reason is that in the past, theory-based arguments often turned out to be wrong in the field of medicine. We don't live in a time where nobody does decision science. Whether people are simply blinded by fiction or whether it helps reasoning is an empirical question.
Ok, I think see the core of what you're talking about, especially "Whether or not people are simply blinded by fiction or whether it helps reasoning is an empirical question." This sounds like an outside view versus inside view distinction: I've been focused on "What should my inside view look like" and using outside view tools to modify that when possible (such as knowledge of a bias from decision science.) I think you and maybe Mark are trying to say "the inside view is useless or counter-productive here; only the outside view will be of any use" so that in the absence of outside view evidence, we should simply not attempt to reason further unless it's a super-clear case, like Mark illustrates in his other comment. My intuition is that this is incorrect, but it reminds me of the Hanson-Yudkowsky debates on outside vs. weak inside view, and I think I don't have a strong enough grasp to clarify my intuition sufficiently right now. I'm going to try and pay serious attention to this issue in the future though, and would appreciate if you have any references that you think might clarify.
It's not only outside vs. inside view. It's that knowing things is really hard. Humans are by nature overconfident. Life isn't fair. The fact that empirical evidence is hard to get doesn't make theoretical reasoning about the issue any more likely to be correct. I'd rather trust a doctor with medical experience (who has an inside view) to translate empirical studies in a way that applies directly to me than someone who reasons simply from reading the study and has no medical experience. I do sin from time to time and act overconfident. But that doesn't mean it's right. Skepticism is a virtue. I like von Foerster's book "Truth is the invention of a liar" (unfortunately that book is in German, and I haven't read other writing by him). It doesn't really give answers, but it makes the unknowing more graspable.
It'd be invalid as evidence, but it might still give a felt sense of your ideas that helps appreciate them. Discussions of AI, like discussions of aliens, have always been drawing on fiction at least for illustration. I for one would love to see that story.
It occurs to me that an AI could be smart enough to win without being smarter in every way or winning every conflict. Admittedly, this is a less dramatic claim.
Or it could just not care to win in the first place.
It's true by definition that a superintelligent AI will be able to outsmart humans at some things, so I guess you are objecting to the "every way"... "Please don't unplug me, I am about to find a cure for cancer." MIRI has a selection of arguments for how an AI could unbox itself, and they are based on the AI's knowledge of human language, values, and psychology. Whether it could outsmart us in every way isn't relevant... what is relevant is whether it has those kinds of knowledge. There are ways in which an AI outside of a box could get hold of those kinds of knowledge... but only if it is already unboxed. Otherwise it has a chicken-and-egg problem: the problem of getting enough social-engineering knowledge while inside the box to talk its way out... and it is knowledge, not the sort of thing you can figure out from first principles. MIRI seems to think it is likely that a super AI would be preloaded with knowledge of human values, because we would want it to agentively make the world a better place. In other words, the worst case scenario is very close to the best case scenario: a near miss from it. And the whole problem is easily sidestepped by aiming a bit lower, e.g. for tool AI. Pure information can be dangerous. Consider an AI that generates a formula for a vaccine which is supposed to protect against cancer, but actually makes everyone sterile...
If that happened I would eat my hat. Rush to push the big red STOP button, pull the hard-cutoff electrical lever, break the glass on the case containing firearms & explosives, and then sit down and eat my hat. Maybe I'm off here, but common sense tells me that if you are worried about AI takeoffs, and if you are tasking an AI with obscure technical problems like designing construction processes for first-generation nanomachines, large-scale data mining in support of the SENS research objectives, or plain old long-term financial projections, you don't build in skills or knowledge that are not required. Such a machine does not need a theory of other minds. It may need to parse and understand scientific literature, but it has no need to understand the social cues of persuasive language, our ethical value systems, or psychology. It certainly doesn't need a camera pointed at me, as would be required to even know I'm about to pull the plug. Of course I'm saying the same thing you are in different words. I know you and I basically see eye-to-eye on this. This could happen by accident. Any change to the human body has side effects. The quest for finding a clinical treatment is locating an intervention whose side effects are a net benefit, which requires at least some understanding of quality of life. It could even be a desirable outcome, vs. the null result. I would gladly take a vaccine that protects against cancer but makes the patient sterile. Just freeze their eggs, or give it electively to people over 45 with a family history. What's crazy is the notion that the machine lacking any knowledge about humans mentioned above could purposefully engineer such an elaborate deception to achieve a hidden purpose, all while its programmers are overseeing its goal system looking for the patterns of deceptive goal states. It'd have to be not just deceptive, but meta-deceptive. At some point the problem is just so over-constrained as to be on the level of Boltzmann-brain improbable.
And then be reviled as the man who prevented a cure for cancer? Remember that the you in the story doesn't have the same information as the you outside the story -- he doesn't know that the AI isn't sincere. "Please don't unplug me, I am about to find a cure for cancer" is a placeholder for a class of exploits on the part of the AI where it holds a carrot in front of us. It's not going to literally come out with the cure-for-cancer thing under circumstances where it's not tasked with working on something like it, because that would be dumb, and it's supposed to be superintelligent. But superintelligence is really difficult to imagine: you have to imagine exploits, then imagine versions of them that are much better. The hypothetical MIRI is putting forward is that if you task a super AI with agentively solving the whole of human happiness, then it will have to have the kind of social, psychological and linguistic knowledge necessary to talk its way out of the box. A more specialised AGI seems safer... and likelier... but then another danger kicks in: its creators might be too relaxed about boxing it, perhaps allowing it internet access... but the internet contains a wealth of information to bootstrap linguistic and psychological knowledge with. There's an important difference between rejecting MIRI's hypotheticals because the conclusions don't follow from the antecedents, as opposed to doing so because the antecedents are unlikely in the first place. Dangers arising from non-AI scenarios don't prove AI safety. My point was that an AI doesn't need effectors to be dangerous... information plus sloppy oversight is enough. However the MIRI scenario seems to require a kind of perfect storm of fast takeoff, overambition, poor oversight, etc. A superintelligence can be meta-deceptive. Direct inspection of code is a terrible method of oversight, since even simple AIs can work in ways that baffle human programmers. ETA on the whole, I object to the antecedents/pri

If you'll permit a restatement... it sounds like you surveyed the verbal output of the big names in the transhumanist/singularity space and classified them in terms of seeming basically "correct" or "mistaken".

Two distinguishing features seemed to you to be associated with being mistaken: (1) a reliance on philosophy-like thought experiments rather than empiricism and (2) relatedness to the LW/MIRI cultural subspace.

Then you inferred the existence of an essential tendency to "thought-experiments over empiricism" as a difficult to change hidden variable which accounted for many intellectual surface traits.

Then you inferred that this essence was (1) culturally transmissible, (2) sourced in the texts of LW's founding (which you have recently been reading very attentively), and (3) an active cause of ongoing mistakenness.

Based on this, you decided to avoid the continued influence of this hypothetical pernicious cultural transmission and therefore you're going to start avoiding LW and stop reading the founding texts.

Also, if the causal model here is accurate... you presumably consider it a public service to point out what is going on and help others avoid... (read more)

This is certainly not correct. It was more like "providing evidence-based justifications for their beliefs, or hand-waving arguments." I'm not commenting on the truth of their claims, just the reported evidence supporting them. I don't think association with the LW/MIRI cultural subspace is a bad thing. I'm a cryonicist, and that is definitely also very much in line with LW/MIRI cultural norms. There are just particular positions espoused by MIRI and commonly held in this community which I believe to be both incorrect and harmful. Other than the above, yes, I believe you have summarized correctly. It isn't LW-specific. The problem lies with non-analytic, or at the very least casual, philosophy, which unfortunately has become the norm for debate here, though it is not at all confined to LW. Reasoning by means of loose analogies and thought experiments is dangerous, in no small part because it is designed to trigger heuristics of proof-by-comparison, which is in fact an insufficient condition for changing one's mind. Finding a thought experiment or analogous situation can give credibility to a theory. In the best case it takes a theory from possible to plausible. However, there is a gulf from plausible to probable and/or correct which we must be careful not to cross without proper evidence. The solution to this, in the sciences, is rigorous peer review. I am honestly not sure how that transfers to the forum / communal blog format.

This argument is, however, nonsense. The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour

While it's true that humans are Turing complete, computability is not the only barrier to understanding.
Brains are, compared to some computers, quite slow and imperfect at storage. Suppose that understanding the output of a super-intelligence would require, in human terms, the effort of a thousand-years-long computation carried out with the aid of a billion sheets of paper. While it... (read more)
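The gap between "computable in principle" and "feasible for a human in practice" can be made concrete with a toy calculation. The sketch below is purely illustrative and not drawn from the comment itself; the function name and numbers are assumptions chosen for the example:

```python
# Toy illustration: being computable in principle does not make a
# result humanly tractable. Exhaustive search over n-bit strings is
# trivially computable, yet the step count grows exponentially.

def brute_force_steps(n_bits: int) -> int:
    """Number of candidates an exhaustive search must consider."""
    return 2 ** n_bits

SECONDS_PER_YEAR = 60 * 60 * 24 * 365

# A (tireless) human checking one candidate per second, forever:
years = brute_force_steps(64) / SECONDS_PER_YEAR
print(f"{years:.2e} years")  # on the order of 5.8e+11 years
```

Nothing here is uncomputable; the barrier is time and working memory, which is the distinction the comment is pointing at.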


We get instead definitive conclusions drawn from thought experiments only.

As a relatively new user here at LessWrong (and new to rationality) it is also curious to me that many here point me to articles written by Eliezer Yudkowsky to support their arguments. I have the feeling there is a general admiration for him and that some could be biased by that rather than approaching the different topics objectively.

Also, when I read the article about dissolving problems and how algorithms feel I didn't find any evidence that it is known exactly how neuron netw... (read more)

it is also curious to me that many here point me to articles written by Eliezer Yudkowsky to support their arguments

It's been my experience that this is usually done to point to a longer and better-argued version of what the person wants to say rather than to say "here is proof of what I want to say".

I mean, if I agree with the argument made by EY about some subject, and EY has done a lot of work in making the argument, then I'm not going to just reword the argument, I'm just going to post a link.

The appropriate response is to engage the argument made in the EY argument as if it is the argument the person is making themselves.

Kind of. But it's still done when there are unanswered criticisms in the comments section.

I'd like to offer some responses.

[one argument of people who think that the superintelligence alignment problem is incredibly important] is that just as a chimpanzee would be unable to predict what a human intelligence would do or how we would make decisions (aside: how would we know? Were any chimps consulted?), we would be equally inept in the face of a super-intelligence... The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect

... (read more)
I am familiar with MIRI's technical agenda and I stand by my words. The work MIRI is choosing for itself is self-isolating and not relevant to the problems at hand in practical AGI work.
AFAIK, part of why the technical agenda contains the questions it does is that they're problems that are of interest to mathematicians and logicians even if those people aren't interested in AI risk. (Though of course, that doesn't mean that AI researchers would be interested in that work, but it's at least still more connecting with the academic community than "self-isolating" would imply.)
This is concerning if true - the goal of the technical agenda should be to solve AI risk, not appeal to mathematicians and logicians (by, say, making them feel important).
That sounds like an odd position to me. IMO, getting as many academics from other fields as possible working on the problems is essential if one wants to make maximal progress on them.
The academic field which is most conspicuously missing is artificial intelligence. I agree with Jacob that it is and should be concerning that the machine intelligence research institute has adopted a technical agenda which is non-inclusive of machine intelligence researchers.
That depends on whether you believe that machine intelligence researchers are the people who are currently the most likely to produce valuable progress on the relevant research questions. One can reasonably disagree on MIRI's current choices about their research program, but I certainly don't think that their choices are concerning in the sense of suggesting irrationality on their part. (Rather the choices only suggest differing empirical beliefs which are arguable, but still well within the range of non-insane beliefs.)
On the contrary, my core thesis is that AI risk advocates are being irrational. It's implied in the title of the post ;) Specifically I think they are arriving at their beliefs via philosophical arguments about the nature of intelligence which are severely lacking in empirical data, and then further shooting themselves in the foot by rationalizing reasons to not pursue empirical tests. Taking a belief without evidence, and then refusing to test that belief empirically--I'm willing to call a spade a spade: that is most certainly irrational.
That's a good summary of your post. I largely agree, but to be fair we should consider that MIRI started working on AI safety theory long before the technology required for practical experimentation with human-level AGI existed - to do that you need to be close to AGI in the first place. Now that we are getting closer, the argument for prioritizing experiments over theory becomes stronger.
There are many types of academics - does your argument extend to french literature experts? Clearly, if there is a goal behind the technical agenda, changing the technical agenda to appeal to certain groups detracts from that goal. You could argue that enlisting the help of mathematicians and logicians is so important it justifies changing the agenda ... but I doubt there is much historical support for such a strategy. I suspect part of the problem is that the types of researchers/academics which could most help (machine learning, statistics, comp sci types) are far too valuable to industry and thus are too expensive for non-profits such as MIRI.
Well, if MIRI happened to know of technical problems they thought were relevant for AI safety and which they thought French literature experts could usefully contribute to, sure. I'm not suggesting that they would have taken otherwise uninteresting problems and written those up simply because they might be of interest to mathematicians. Rather my understanding is that they had a set of problems that seemed about equally important, and then from that set, used "which ones could we best recruit outsiders to help with" as an additional criteria. (Though I wasn't there, so anything I say about this is at best a combination of hearsay and informed speculation.)

Thank you for the shout out to SENS. This is where I send my charitable donations as well, and it's good to see someone else who thinks they're as effective as I do.


:( Bye, thanks for your reading group. I really appreciated it and your summaries.

What were some of the errors you found in the sequences? Was it mostly AI stuff?

Even though I'm two-thirds through the sequences and have been around the site for two months now, I still don't really understand AI/haven't been convinced to donate to something other than givewell's charities. I feel like when I finally understand it, I probably will switch over to existential risk reduction charities though, so thanks for your thoughts there. I might have figured it was safe to assume MIRI was the best without really thinking for myself, just because I generally feel like the people here are smart and put a lot of thought into things.

I'm relatively new here, so I have trouble seeing the same kinds of problems you do.

However, I can say that LessWrong does help me remember to apply the principles of rationality I've been trying to learn.

I'd also like to add that - much like writing a novel - the first draft rarely addresses all of the possible faults. LessWrong is one of (if not the first) community blogs devoted to "refining the art of human rationality." Of course we're going to get some things wrong.

What I really admire about this site, though, is that contrarian viewpoint... (read more)

There's a need for a rationality community. It just seems to me that, with regard to certain issues, this one is a stick in the mud :\ Maybe a wiki would be a better structure than a communal blog.

Am I safe if I just maintain a level of skepticism in the presence of thought-experimental "evidence" for a conclusion?

You've mentioned the role of the Sequences in reference to teaching specific conclusions about things like AI, altruism, cryonics, I presume; that's a minority of the whole (in my reading, so far). Would you dispute them for their use in identifying reasoning flaws?

EDIT: I do appreciate your going further in your criticism of the LW mainstream than most (though I've appreciated that of others also). I take it as an invitation to greater care and skepticism.

Of course, you're familiar with "Evaporative Cooling of Group Beliefs," right?

Evaporative cooling is a concern, but it seems to me that it's a more directly relevant concern for moderators or other types of organizers than for individuals. For it to be driving a decision to leave or not to leave, you effectively have to value the group's well-being over your own, which is unlikely for a group you're considering leaving in the first place.

So if I understand correctly, you're leaving LW because you think LW is too hostile to AGI research and nanotechnology? I don't mind your decision to leave, but I'm not sure why you think this. My own impression is that a lot of us either feel unqualified to have an opinion or don't think AGI is likely any time soon.

I think you're way off if you believe that MIRI or LW are slowing down AI progress. I don't think MIRI/LW have that much reach, and AGI is likely decades away in any case. In fact, I don't even know of any work that MIRI has even argued should be stopped, let alone work they successfully stopped.

I recommend this for Main.

mistakes that are too often made by those with a philosophical background rather than the empirical sciences: the reasoning by analogy instead of the building and analyzing of predictive models

While there are quite a few exceptions, most actual philosophy is not done through metaphors and analogies. Some people may attempt to explain philosophy that way, while others with a casual interest in philosophy might not know the difference, but few actual philosophers I've met are silly enough not to know an analogy is an analogy. Philosophy and empirical sci... (read more)

Hi Mark,

Thanks for your well-considered post. Your departure will be a loss for the community, and I'm sorry to see you go.

I also feel that some of the criticism you're posting here might be due to a misunderstanding, mainly regarding the validity of thought experiments, and of reasoning by analogy. I think both of these have a valid place in rational thought, and have generally been used appropriately in the material you're referring to. I'll make an attempt below to elaborate.

Reasoning by analogy, or, the outside view

What you call "reasoning by analogy"... (read more)


I think it is possible to use LW for generating testable hypotheses, though sadly testing would require lots of resources - but then, it usually does anyway. For example, I tried to see how LWers would estimate probabilities of statements for botanical questions, and there was even one volunteer. Well. Perhaps it would be more in-group to ask for probabilities for technical stuff - not AI or math, rather something broadly engineering that would still allow people to generate more than one alternative - and watch how they connect the dots and make assumption... (read more)

I wish I could recommend a skepticism, empiricism, and rationality promoting institute. Unfortunately I am not aware of an organization which does not suffer from the flaws I identified above.

It seems to me that CFAR engages in empiricism. They are trying to teach various different ways to make people more rational, and they are willing to listen to the results and change teaching content and methods.

Is your main objection against them that till now they haven't published any papers?

How do they measure whether they are actually making people more rational? There are hundreds, maybe thousands self-help/personal development groups in the world. From the secular ones (e.g. Landmark, which is in some ways the spiritual ancestor of CFAR), to traditional religions, to mumbo jumbo new age stuff. From the "outside view", how can I distinguish CFAR from these ones?
I think they did a bit of polling, but I'm not a good person to speak about the details. I haven't attended one of their courses yet. A core difference is that CFAR intends to publish papers in the future that show the effectiveness of techniques. The others that you listed don't. I also understand that doing the work to get techniques into a form where you can test them well in a study takes time. In a lot of New Age frameworks there is the belief that everything happens as it's supposed to. If a person gets ill the day after a workshop, it's because they are processing negative emotions or karma. You can't do any science when you assume that any possible outcome of an experiment is by definition a good outcome and the challenge is about trusting that it's a good outcome. The importance of trusting the process is also a core feature of traditional religion. If your prayer doesn't seem to be working, it's just because you don't understand how God moves in mysterious ways. Trust isn't inherently bad, but it prevents scientific learning. I don't know Landmark's position on trust and skepticism. Landmark does practices like creating an expectation that participants invite guests to the Evening Session, which I'm uncomfortable with. They might be effective recruiting tools, but they feel culty.
I think it is. CFAR could be just a more sophisticated type of mumbo jumbo tailored to appeal to materialists. Just because they are not talking about gods or universal quantum consciousness doesn't mean that their approach is any more grounded in evidence. Maybe it is, but I would like to see some replicable study about it. I'm not going to give them a free pass because they display the correct tribal insignia.
This comment confuses the heck out of me. Of course that's not why their approach is any more grounded in evidence; the fact that the particular brand of self-help they're peddling is specifically about rationality, which is itself to a large degree about what is grounded in evidence, is why the methods they're promoting are grounded in evidence. What are you looking for, exactly? The cognitive biases that CFAR teaches about and tries to help mitigate have been well known in social psychology and cognitive science for decades. If you're worried that trying to tackle these kinds of biases could actually make things worse, then yes, we know that knowing about biases can hurt people, and obviously that's something CFAR tries to avoid. If you're worried that trying to improve rationality isn't actually a particularly effective form of self-help, then yes, we know that extreme rationality isn't that great, and trying to find improvements that result in real practical benefits is part of what CFAR tries to do. But the fact that they're specifically approaching areas of self-improvement grounded in well-documented, genuinely real-world phenomena like cognitive biases makes them clearly, significantly different from those that are centred around less sound ideas, for example, gods or universal quantum consciousness. Or at least, I would have thought so anyway.
I think what V_V is saying is: show me the evidence. Show me the evidence that CFAR workshop participants make better decisions after the workshop than they do before. Show me the evidence that the impact of CFAR instruction has higher expected humanitarian benefit dollar-for-dollar than an equivalent donation to SENS, or pick-your-favorite-charity.

Show me the evidence that the impact of CFAR instruction has higher expected humanitarian benefit dollar-for-dollar than an equivalent donation to SENS, or pick-your-favorite-charity.

I don't think they do, but I don't think we were comparing CFAR to SENS or other effective altruist endorsed charities, I was contesting the claim that CFAR was comparable to religions and mumbo jumbo:

Is it correct to compare CFAR with religions and mumbo jumbo?

I think it is.

I mean, they're literally basing their curricula on cognitive science. If you look at their FAQ, they give examples of the kinds of scientifically grounded, evidence based methods they use for improving rationality:

While research on cognitive biases has been booming for decades, we’ve spent more time identifying biases than coming up with ways to evade them.

There are a handful of simple techniques that have been repeatedly shown to help people make better decisions. “Consider the opposite” is a name for the habit of asking oneself, “Is there any reason why my initial view might be wrong?” That simple, general habit has been shown to be useful in combating a wide variety of biases, including overconfidence, hindsight bia

... (read more)
It's also well known in cognitive science that mitigating biases is hard. Having studies that prove that CFAR interventions work is important for the long term. Keith Stanovich (who's a CFAR advisor) got a million dollars (or $999,376 to be exact) from the John Templeton Foundation to create a rationality quotient test that measures rationality in the same way we have tests for IQ. If CFAR workshops work to increase the rationality of their participants, that score should go up. The fact that you try doesn't mean that you succeed, and various people in the personal development field also try to find improvements that result in real practical benefits. In Willpower, psychology professor Roy Baumeister makes the argument that the God idea is useful for raising willpower. Mormons have been found to look healthier.
That is an extremely misleading sentence. CFAR cannot give Stanovich's test to their students because the test does not yet exist.
It's only misleading if you take it out of context. I do argue in this thread that CFAR is a promising organisation. I didn't say that CFAR is bad because they haven't provided this proof. I wanted to illustrate that meaningful proof of effectiveness is possible and should happen in the next years. The fact that CFAR has been unable to do this till now because of the unavailability of the test doesn't mean that there's proof that CFAR manages to raise rationality. I also don't know the exact relationship between Stanovich and CFAR and to what extent his involvement in CFAR is more than having his name on the CFAR advisor page. Giving CFAR participants a bunch of questions that he considers to be potentially useful for measuring rationality could be part of his effort to develop a rationality test. The test being publicly available isn't a necessary condition for a version of the test being used inside CFAR.
So I know less about CFAR than I do the other two sponsors of the site. There is in my mind some unfortunate guilt by association given the partially shared leadership & advisor structure, but I would not want to unfairly prejudice an organization for that reason alone. However, there are some worrying signs, which is why I feel justified in saying something at least in a comment, with the hope that someone might prove me wrong.

CFAR used donated funds to pay for Yudkowsky's time in writing HPMoR. It is an enjoyable piece of fiction, and I do not object to the reasoning they gave for funding his writing. But it is a piece of fiction whose main illustrative character suffers from exactly the flaw that I talked about above, in spades. It is my understanding also that Yudkowsky is working on a rationality textbook for release by CFAR (not the sequences, which were released by MIRI). I have not seen any draft of this work, but Yudkowsky is currently 0 for 2 on this issue, so I'm not holding my breath. And given that further donations to CFAR are likely to pay for the completion of this work, which has a cringe-inducing Bayesian prior, I would be hesitant to endorse them. That and, as you said, publications have been sparse or non-existent.

But I know very little other than that about CFAR, and I remain open to having my mind changed.

As Chief Financial Officer for CFAR, I can say all the following with some authority:

CFAR used donated funds to pay for Yudkowsky's time in writing HPMoR.

Absolutely false. To my knowledge we have never paid Eliezer anything. Our records indicate that he has never been an employee or contractor for us, and that matches my memory. I don't know for sure how he earned a living while writing HPMOR, but at a guess it was as an employed researcher for MIRI.

It is my understanding also that Yudkowsky is working on a rationality textbook for release by CFAR (not the sequences which was released by MIRI).

I'm not aware of whether Eliezer is writing a rationality textbook. If he is, it's definitely not with any agreement on CFAR's part to release it, and we're definitely not paying him right now whether he's working on a textbook or not.

And given that further donations to CFAR are likely to pay for the completion of this work…

Not a single penny of CFAR donations go into paying Eliezer.

I cannot with authority promise that will never happen. I want to be clear that I'm making no such promise on CFAR's behalf.

But we have no plans to pay him for anything to the best of my knowledge as the person in charge of CFAR's books and financial matters.

Thank you for correcting me on this. So the source of the confusion is the Author's Notes to HPMoR. Eliezer promotes both CFAR and MIRI workshops and donation drives, and is ambiguous about his full employment status--it's clear that he's a researcher at MIRI, but if it was ever explicitly mentioned who was paying for his rationality work, I missed it. Googling "CFAR" does show that on a page I never read he discloses not having a financial relationship with CFAR. But he notes many times elsewhere that "his employer" has been paying for him to write a rationality textbook, and at times has given him paid sabbaticals to finish writing HPMOR, because he was able to convince his employer that it was in their interest to fund his fiction writing. As I said, I can understand the argument that it would be beneficial to an organization like CFAR to have as fun and interesting an introduction to rationality as HPMOR is, ignoring for a moment the flaws in this particular work I pointed out elsewhere. It makes very little sense for MIRI to do so--I would frankly be concerned about them losing their non-profit status as a result, as writing rationality textbooks, let alone Harry Potter fanfics, is so, so far outside of MIRI's mission. But anyway, it appears that I assumed it was CFAR employing him, not MIRI. I wonder if I was alone in this assumption.

EDIT: To be clear, MIRI and CFAR have a shared history--CFAR is an offshoot of MIRI, and both organizations have shared offices and staff in the past. Your staff page lists Eliezer Yudkowsky as a "Curriculum Consultant" and specifically mentions his work on HPMOR. I'll take your word that none of it was done with CFAR funding, but that's not the expectation a reasonable person might have from your very own website. If you want to distance yourself from HPMOR you might want to correct that.
To be clear, I can understand where your impression came from. I don't blame you. I spoke up purely to crush a rumor and clarify the situation. That's a good point. I'll definitely consider it. We're not trying to distance ourselves from HPMOR, by the way. We think it's useful, and it does cause a lot of people to show interest in CFAR. But I agree, as a nonprofit it might be a good idea for us to be clearer about whom we are and are not paying. I'll definitely think about how to approach that.
I was pleasantly surprised by the empiricism in HPMOR. It starts out with Harry's father believing that there's no way magic can exist, his mother believing it does, and then Harry advocating using the empirical method to find out. Harry runs experiments to find out about the inheritance of magic. He runs experiments where he varies various factors to find out when a spell works with Hermione. What's wrong with that kind of empiricism?

You have exhausted all of the examples that I can recall from the entire series. That's what's wrong.

The rest of the time Harry thinks up a clever explanation, and once the explanation is clever enough to solve all the odd constraints placed on it, (1) he stops looking for other explanations, and (2) he doesn't check to see if he is actually right.

Nominally, Harry is supposed to have learned his lesson in his first failed experimentation in magic with Hermione. But in reality, and in relation to the overarching plot, there was very little experimentation and much more "that's so clever it must be true!" type thinking.

"That's so clever it must be true!" basically sums up the sequence's justification for many-worlds, to tie us back to the original complaint in the OP.

The rest of the time Harry thinks up a clever explanation, and once the explanation is clever enough to solve all the odd constraints placed on it, (1) he stops looking for other explanations, and (2) he doesn't check to see if he is actually right.


Comed-tea in ch. 14

Hariezer decides in this chapter that comed-tea MUST work by causing you to drink it right before something spit-take worthy happens. The tea predicts the humor, and then magics you into drinking it. Of course, he does no experiments to test this hypothesis at all (ironic that just a few chapters ago he lectured Hermione about only doing 1 experiment to test her idea).

Wizards losing their power in chap. 22

Here is the thing about science: step 0 needs to be making sure you're trying to explain a real phenomenon. Hariezer knows this, he tells the story of N-rays earlier in the chapter, but completely fails to understand the point.

Hariezer and Draco have decided, based on one anecdote (the founders of Hogwarts were the best wizards ever, supposedly) that wizards are weaker today than in the past. The first thing they should do is find out if wizards are actually getting weaker. After all, the two

... (read more)
How is this 'literally the exact same logic that ID proponents use?' Creationists fallacize away the concept of natural selection, but I don't see how Harry is being unreasonable, given what he knows about the universe.
He's saying "I don't understand how magic could have come into being, it must have been invented by somebody." When in fact there could be dozens of other alternative theories. I'll give you one that took me only three seconds to think up: the method for using magic isn't a delusion of the caster as Harry thought, but a mass delusion of all wizards everywhere. E.g. confounding every wizard in existence, or at least some threshold number of them, into thinking that Fixus Everthingus was a real spell would make it work. Maybe all it would have taken to get his experiments with Hermione to work is to confound himself as well, making it a double-blind experiment as it really should have been. His argument here really is exactly the same as an intelligent designer's: "magic is too complicated and arbitrary to be the result of some physical process."
He actually does kind of address that, by pointing out that there are only two known processes that produce purposeful effects. So, yeah, I disagree strongly that the two arguments are "exactly the same". That's the sort of thing you say more for emphasis than for its being true.
I stand by my claim that they are the same. An intelligent design proponent says "I have exhausted every possible hypothesis; there must be a god creator behind it all," when in fact there was at least one perfectly plausible hypothesis (natural selection) which he failed to thoroughly consider. Harry says essentially "I have exhausted every possible hypothesis--natural selection and intelligent design--and there must be an Atlantean engineer behind it all," when in fact there were other perfectly plausible explanations, such as the coordinated-belief-of-a-quorum-of-wizardkind one I gave.
That doesn't address the question of why magic exists (not to mention it falls afoul of Occam's Razor). You seem to be answering a completely different question.
The question in the story and in this thread was "why purposeful complexity?" not "why magic?"
Your proposal is equally complex, if not more. What's causing the hallucinations?
You may be right, but it is still more parsimonious than your idea (which requires some genuinely bizarre mechanism, far more than it being a self-delusion).
Not really. You've seen the movie Sphere, or read the book? Magic could be similar: the source of magic is a wish-granting device that makes whatever someone with the wizard gene thinks of actually happen. Of course this is incredibly dangerous--all I have to do is shout "don't think of the Apocalypse!" in a room of wizards and watch the world end. So early wizards like Merlin intervened by using their magic to implant false memories into the entire wizarding population to provide a basic set of safety rules -- magic requires wands, enchantments have to be said correctly with the right hand motion, creating new spells requires herculean effort, etc. None of that would be true, but the presence of other wizards in the world believing it were true would be enough to make the wish-granting device enforce the rules anyway.
Rob Bensinger, 9 years ago:
I think you're missing the point of the Many Worlds posts in the Sequences; I'll link to my response here. Regarding HPMoR, Eliezer would agree that Harry's success rate is absurdly unrealistic (even for a story about witchcraft and wizardry). He wrote about this point in the essay "Level 2 Intelligent Characters".

I would agree with you, however, that HPMoR lets Harry intuit the right answer on the first guess too much. I would much prefer that the book prioritize pedagogy over literary directness, and in any case I have a taste for stories that meander and hit a lot of dead ends. (Though I'll grant that this is an idiosyncratic taste on my part.)

As a last resort, I think HPMoR could just have told us, in narration, about a bunch of times Harry failed, before describing in more detail the time he succeeded. A few sentences like this scattered throughout the story could at least weaken the message to system 2 that rationalist plans should consistently succeed, even if the different amounts of vividness mean that it still won't get through to system 1. But this is a band-aid; the deeper solution is to find lots of interesting lessons and new developments you can tell about while Harry fails in various ways, so you aren't just reciting a litany of undifferentiated failures.
There's a difference between succeeding too often and succeeding despite not testing his ideas. The problem isn't having too many failed ideas, the problem is that testing is how one rules out a failed idea, so he seems unreasonably lucky in the sense that his refusal to test has unreasonably few consequences.
Without rereading, I can recall the experiment with Time-Turners where Harry finds out "don't mess with time". But what might be missing is a detailed exploration of "learning from mistakes". Harry gets things right through being smart.

That's an anti-example. He had a theory for how Time-Turners could be used in a clever way to perform computation. His first experiment actually confirmed the consistent-timeline theory of Time-Turners, but revealed the problem domain to be much larger than he had considered. Rather than construct a more rigorous and tightly controlled experiment to get at the underlying nature of timeline selection, he got spooked and walked away. It became a lesson in anti-empiricism: some things you just don't investigate.

> Harry gets things right through being smart.

That's exactly the problem. Rationalists get things right by relying on reality being consistent, not any particular smartness. You could be a total numbnut but still be good at checking other people's theories against reality and do better than the smartest guy in the world who thinks his ideas are too clever to be wrong.

So Harry got things right by relying on reality being consistent, until the very end, when reality turned out to be even more consistent than he could have thought. I think it is the most valuable lesson from HPMoR.

Except that it is a piece of fiction: Harry got things right because the author wrote it that way. In reality, a Harry acting the way Harry did would more likely have settled on a clever-sounding theory which he never tested until it was too late, and which turned out to be hopelessly wrong and got him killed. But that's not how Yudkowsky chose to write the story.

I agree. Still, even if Harry died, my point would still stand.

100% of my charitable donations are going to SENS. Why they do not get more play in the effective altruism community is beyond me.

Probably because they're unlikely to lead to anything special over and above general biology research.

Funding for SENS might fund research that could be considered too speculative for more conventional bio funders, though.
That's very cynical. What makes you say that?

Perhaps I'm wrong, but reading your comments, it seems like you mostly disagree about AI. You think that we should focus on developing AI as fast as possible, then box it and experiment with it. Maybe you even go further and think all the work that MIRI is doing is a waste and irrelevant to AI.

If so I totally agree with you. You're not alone. I don't think you should leave the site over it though.

There's lip service paid to empiricism throughout, but in all the "applied" sequences relating to quantum physics and artificial intelligence it appears to be forgotten.

It's kind of amazing that an organisation dedicated to empiricism didn't execute an initial phase of research to find out what AI researchers are actually doing. Why, it's almost as if it's intended to be a hobby horse.

 I am no longer">
> I am no longer in good conscious

S/b conscience.

Future of Life Institute is the only existential-risk AI organization which is actually doing meaningful evidence-based research into artificial intelligence.

Has FLI done any research to date? My impression is that they're just disbursing Musk's grant (which for the record I think is fantastic).

It sounds like they're trying to disburse the grant broadly and to encourage a variety of different types of research including looking into more short term AI safety problems. This approach seems like it has the potential to engage more of the existing computer science and AI community, and to connect concerns about AI risk to current practice.

Is that what you like about it?

Yes. I'm inferring a bit about what they are willing to fund due to the request for proposals that they have put out, and statements that have been made by Musk and others. Hopefully there won't be any surprises when the selected grants are announced.
Would you be surprised if they funded MIRI?
Depends on what you mean. Kaj Sotala, a research associate at MIRI, has a proposal he submitted to FLI that I think really deserves to be funded (it's the context for the modeling-concept-formation posts he has done recently). I think it has a good chance, and I would be very disappointed if it wasn't funded. I'm not sure if you would count that as MIRI getting funded or not, since the organization is technically not on the proposal, I think.

If you mean MIRI getting funded to do the sorts of things MIRI has been prominently pushing for in its workshops lately -- basic mathematical research on uncomputable models of intelligence, Löbian obstacles, decision theory, etc. -- then I would be very, very disappointed, and if it was a significant chunk of the FLI budget I would have to retract my endorsement. I would be very surprised, but I don't consider it an impossibility. I actually think it quite possible that MIRI could get selected for a research grant on something related to, but slightly different from, what they would have been working on anyway (I have no idea if they have submitted any proposals). I do think it unlikely and surprising if FLI funds were simply directed into the MIRI technical research agenda.

To clarify, my understanding is that FLI was founded largely in response to Nick Bostrom's Superintelligence drumming up concern over AI risk, so there is a shared philosophical underpinning between MIRI and FLI--based on the same arguments I object to in the OP! But if Musk et al. believed the MIRI technical agenda was the correct approach to deal with the issue, and MIRI itself was capable of handling the research, then they would have simply given their funds to MIRI and not created their own organization. There is a real difference between how these two organizations are approaching the issue, and I expect to see that reflected in the grant selections. FLI is doing the right thing but for the wrong reasons. What you do matters more than why you did it, s

> I do think it unlikely and surprising if FLI funds were simply directed into the MIRI technical research agenda.

FLI's request for proposals links to their research priorities document, which cites 4 MIRI papers and states, "Research in this area [...] could extend or critique existing approaches begun by groups such as the Machine Intelligence Research Institute [76]." (Note that [76] is MIRI's technical agenda document.)

So obviously they don't think MIRI's current research is a waste of resources or should be deprioritized. If they end up not funding any MIRI projects, I would think it probably has more to do with MIRI having enough funding from other sources than any disagreements between FLI and MIRI over research directions.

The problem with your point regarding chimpanzees is that it is true only if the chimpanzee is unable to construct a provably friendly human. This is true in the case of chimpanzees because they are unable to construct humans period, friendly or unfriendly, but I don't think it has been established that present day humans are unable to construct a provably friendly superintelligence.

That's wholly irrelevant. The important question is this: which can be constructed faster, a provably-safe-by-design friendly AGI, or a fail-safe not-proven-friendly tool AI? Lives hang in the balance: roughly 100,000 a day. (There's an aside about whether an all-powerful "friendly" AI outcome is even desirable--I don't think it is. But that's a separate issue.)
Mark, I get that it's terrible that people are dying. As pointed out in another thread, I support SENS. But there's a disaster-response maxim, "Don't just do something, stand there!", which argues that taking the time to make sure you do the right thing is worth it, especially in emergencies, when there is pressure to act too soon. Mistakes made because you were in a hurry aren't any less damaging because you were in a hurry for a good reason.

I don't think anyone expects that it's possible or desirable to slow down general tech development, and most tool AI is just software development. If I write software that helps engineers run their tools more effectively, or a colleague writes software that helps doctors target radiation at tumors more effectively, or another colleague writes software that helps planners decide which reservoirs to drain for electrical power, that doesn't make a huge change to the trajectory of the future; each is just a small improvement towards more embedded intelligence and richer, longer lives.
So you don't think the invention of AI is inevitable? If it is, shouldn't we pool our resources to find a formula for friendliness before that happens? How could you possibly prevent it from occurring? If you stop official AI research, that will just mean gangsters will find it first. I mean, computing hardware isn't that expensive, and we're just talking about stumbling across patterns in logic here. (If an AI cannot be created that way, then we are safe regardless.) If you prevent AI research, maybe the formula won't be discovered in 50 years, but are you okay with an unfriendly AI within 500?

(In any case, I think your definition of friendliness is too narrow. For example, you may disagree with EY's definition of friendliness, but you have your own: preventing people from creating an unfriendly AI will take nothing less than intervention from a friendly AI.)

(I should mention that, unlike many LWers, I don't want you to feel at all pressured to help EY build his friendly AI. My disagreement with you is purely intellectual. For instance, if your friendliness values differ from his, shouldn't you set up your own rival organization in opposition to MIRI? Just saying.)

" To be an effective rationalist, it is often not important to answer “what is the calculated probability of that outcome?” The better first question is “what is the uncertainty in my calculated probability of that outcome?” "

I couldn't agree more!
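The quoted point can be made concrete with a toy sketch (my own illustration, not from the thread, with hypothetical observation counts): two beliefs can share the same point probability while differing enormously in how well-pinned-down that probability is. Modeling each belief as a Beta posterior makes the difference visible.

```python
def beta_posterior_stats(successes, failures):
    """Mean and variance of a Beta(successes+1, failures+1) posterior
    (a uniform prior updated on the observed data)."""
    a, b = successes + 1, failures + 1
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, variance

# Two estimates of the same probability: one backed by 2 observations,
# one by 1000.
weak_mean, weak_var = beta_posterior_stats(1, 1)          # scant evidence
strong_mean, strong_var = beta_posterior_stats(500, 500)  # lots of evidence

print(weak_mean, strong_mean)  # both point estimates are 0.5
print(weak_var / strong_var)   # ~200x: the weak estimate is far more uncertain
```

Both estimates would answer "what is the probability?" identically, but they license very different degrees of confidence in acting on that answer.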

Philosophy is getting too much flak, I think. It didn't take me a lot of effort to realize that any correct belief we have puts us on some equal footing with the AI.

You're anthropomorphizing far too much. It's possible for things which are easy for us to think to be very difficult for an AGI, if it is constructed in a different way.
Um... I don't follow this at all. Where do I anthropomorphize? I think that the concept of 'knowledge' would be pretty universal.
Hrm. Maybe I'm reading you wrong? I thought you were making a commonly made argument that any belief we arrive at, the artificial intelligence would as well. And my response was that because the AI is running different mental machinery, it is entirely possible that there are beliefs we arrive at which the AI just doesn't consider and vice versa. Were you saying something different?
You, quite rightly, criticized the notion that an AI would have just as much of an epistemic advantage over humans as we have over a cow. Any correct notions we have are just that: correct. It's not like there's more truth to them that a godlike intelligence could know.

SENS is fundamentally in competition with dozens or hundreds of profit seeking organizations in the world. Donations to SENS are like donations to a charity researching better plastic surgery techniques. They will get invented no matter what, and the amount of money you can throw at it is trivial compared to the potential customer base of said techniques. If you cure aging, billionaires everywhere will fall over themselves to give you money.

The things that SENS is working on right now are not ready for investment. I'm going to call you out on this one: please name something SENS is researching or has researched which is or was the subject of private-industry or taxpayer-funded research at the time SENS was working on it. I think you'll find that such examples, if they exist at all, are isolated. It is nevertheless the goal of SENS to create a vibrant rejuvenation industry, with the private sector eventually taking the reins. But until then, there is a real need for a non-profit to fund research that is too speculative and/or too far from clinical trials to achieve return on investment on a typical funding horizon.
I generally quite agree with you here. I really enormously appreciate the effort SENS is putting into addressing this horror, and there does seem to be a hyperbolic-discounting-style problem with most of the serious anti-aging tech that SENS is trying to address. But I think you might be stating your case too strongly.

If I recall correctly, one of Aubrey's Seven Deadly Things is cancer, and correspondingly one of the seven main branches of SENS is an effort to eliminate cancer via an idea of Aubrey's own devising. (I honestly don't remember the strategy anymore; it has been about six years since I've read Ending Aging.) If you want to claim that no one else was working on Aubrey's approach to ending all cancers, or that anyone else doing so was isolated, I think that's fair, but kind of silly. And obviously there's a ton of money going into cancer research in general, albeit I wouldn't be surprised if most of it was dedicated to solving specific cancers rather than all cancer at once.

But I want to emphasize that this is more of a nitpick on the strength of your claim. I agree with the spirit of it.
What I'm saying is that the actual research projects being funded by SENS are those which are not being adequately funded elsewhere. For example, stem cell therapy is one of the seven pillars of the SENS research agenda, but SENS does almost no work on it whatsoever because it is being adequately funded elsewhere. Likewise, cancer forms another pillar of SENS research, but to my knowledge SENS has only worked on avenues of early-stage research that are not being pursued elsewhere, like the case you mentioned. I interpreted drethelin's comment as saying that donating to SENS was a waste of money since it's a drop in the bucket compared to for-profit and government research programs. My counter-point is that for-profit and public programs are not pursuing the same research SENS is doing.
I think the consensus in the field is at the moment that cancer isn't a single thing. Therefore "solve all cancer at once" unfortunately doesn't make a good goal.
That's my vague impression too. But if I remember correctly, the original idea of OncoSENS (the part of SENS addressing cancer) was something that in theory would address all cancer regardless of type. I also seem to recall that most experimental biologists thought that many of Aubrey's ideas about SENS, including OncoSENS, were impractical and that they betrayed a lack of familiarity with working in a lab. (Although I should note, I don't really know what they're talking about. I, too, lack familiarity with working in a lab!)
I just reread the OncoSENS page. The idea was to nuke the gene for telomerase from every cell in the body, and also nuke a gene for alternative lengthening of telomeres (ALT). Nuking out telomerase in every cell doesn't need further cancer research but gene therapy research, albeit most gene therapy is about adding genes instead of deleting them. As far as research funding goes, SENS seems to be currently funding research into ALT. I think it's plausible that ALT research is otherwise underfunded, but I don't know the details.
Sort of. It might be nice if, say, a cure for diabetes were owned by a non-profit that released it into the public domain rather than by a corporation that would charge for it. (Obviously forcing corporations to be non-profits has predictably terrible side effects, and so the only proper way to do this is by funding participants in the race yourself.) From the perspective of the for-profits, a single non-profit competitor only slightly adjusts their environment, and so may not significantly adjust their incentive structure. It would seem that the sooner a treatment is developed or a cure found, the fewer people will suffer or die from the disease. Moving that date sooner seems like a valuable activity.

I now regard the sequences as a memetic hazard, one which may at the end of the day be doing more harm than good.

To your own cognition, or just to that of others?

I just got here. I have no experience with the issues you cite, but it strikes me that disengagement does not, in general, change society. If you think ideas, as presented, are wrong - show the evidence, debate, fight the good fight. This is probably one of the few places it might actually be acceptable - you can't lurk on religious boards and try to convince them of things, they mostly canno...

Sorry to see you going, your contrarian voice will be missed.

Wanted to mention that Intentional Insights is a nonprofit specifically promoting rationality for the masses, including encouraging empirical and evidence-based approaches (I'm the President). So consider recommending us in the future, and get in touch with me at if you want to discuss this.

Of course you did.
What did you do from which you learned empirically? Which assumptions that you had when you started, have you managed to falsify?
Well, one thing we learned empirically was that the most responsive audience for our content was the secular community. We assumed that people into health and wellness would be more responsive, but our content is not sufficiently "woo" for most. So we decided to focus our early efforts on spreading rationality within the secular community, and then branch out later after we develop solid ties there.