Thanks for taking the time to explain your reasoning, Mark. I'm sorry to hear you won't be continuing the discussion group! Is anyone else here interested in leading that project, out of curiosity? I was getting a lot out of seeing people's reactions.
I think John Maxwell's response to your core argument is a good one. Since we're talking about the Sequences, I'll note that this dilemma is the topic of the Science and Rationality sequence:
In any case, right now you've got people dismissing cryonics out of hand as "not scientific", like it was some kind of pharmaceutical you could easily administer to 1000 patients and see what happened. "Call me when cryonicists actually revive someone," they say; which, as Mike Li observes, is like saying "I refuse to get into this ambulance; call me when it's actually at the hospital". Maybe Martin Gardner warned them against believing in strange things without experimental evidence. So they wait for the definite unmistakable verdict of Science, while their family and friends and 150,000 people per day are dying right now, and might or might not be savable—
—a calculated bet you could only make rationally [i.e., using your own inference skills, without just echoing data from an experimental study, and without just echoing established, expert-verified scientific conclusions].
The drive of Science is to obtain a mountain of evidence so huge that not even fallible human scientists can misread it. But even that sometimes goes wrong, when people become confused about which theory predicts what, or bake extremely-hard-to-test components into an early version of their theory. And sometimes you just can't get clear experimental evidence at all.
Either way, you have to try to do the thing that Science doesn't trust anyone to do—think rationally, and figure out the answer before you get clubbed over the head with it.
(Oh, and sometimes a disconfirming experimental result looks like: "Your entire species has just been wiped out! You are now scientifically required to relinquish your theory. If you publicly recant, good for you! Remember, it takes a strong mind to give up strongly held beliefs. Feel free to try another hypothesis next time!")
This is why there's a lot of emphasis on hard-to-test ("philosophical") questions in the Sequences, even though people are notorious for getting those wrong more often than scientific questions -- because sometimes (e.g., in the case of cryonics and existential risk) the answer matters a lot for our decision-making, long before we have a definitive scientific answer. That doesn't mean we should despair of empirically investigating these questions, but it does mean that our decision-making needs to be high-quality even during periods where we're still in a state of high uncertainty.
The Sequences talk about the Many Worlds Interpretation precisely because it's an unusually-difficult-to-test topic. The idea isn't that this is a completely typical example, or that it's a good idea to disregard evidence when it is available; the idea, rather, is that we sometimes do need to predicate our decisions on our best guess in the absence of perfect tests.
Its placement in Rationality: From AI to Zombies immediately after the 'zombies' sequence (which, incidentally, is an example of how and why we should reject philosophical thought experiments, no matter how intuitively compelling they are, when they don't accord with established scientific theories and data) is deliberate. Rather than reading either sequence as an attempt to defend a specific fleshed-out theory of consciousness or of physical law, they should primarily be read as attempts to show that extreme uncertainty about a domain doesn't always bleed over into 'we don't know anything about this topic' or 'we can't rule out any of the candidate solutions'.
We can effectively rule out epiphenomenalism as a candidate solution to the hard problem of consciousness even if we don't know the answer to the hard problem (which we don't), and we can effectively rule out 'consciousness causes collapse' and 'there is no objective reality' as candidate solutions to the measurement problem in QM even if we don't know the answer to the measurement problem (which, again, we don't). Just advocating 'physicalism' or 'many worlds' is a promissory note, not a solution.
In discussions of EA and x-risk, we likewise need to be able to prioritize more promising hypotheses over less promising ones long before we've answered all the questions we'd like answered. Even deciding what studies to fund presupposes that we've 'philosophized', in the sense of mentally aggregating, heuristically analyzing, and drawing tentative conclusions from giant complicated accumulated-over-a-lifetime data sets.
You wrote:
The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect of the natural world which has remained beyond the reach of human understanding, once a sufficient amount of evidence is available.
That's true, and it's one of the basic assumptions behind MIRI's research: understanding agents smarter than us isn't obviously hopeless, because our human capacity for abstract reasoning makes it possible to model systems even when they're extremely complex and dynamic. MIRI's research is intended to make that kind of modeling more likely to succeed.
It's not the default that we're always able to predict what our inventions will do before we run them to see what happens; and there are some basic limits on our ability to do so when the system we're predicting is smarter than the predictor. But with enough intellectual progress we may become able to model abstract safety-relevant features of AGI behavior, even though we can't predict in detail the exact decisions the AGI will make. (If we could predict the exact decisions of the AGI, we'd have to be at least as smart as the AGI.)
If it isn't possible to learn a variety of generalizations about smarter autonomous systems, then, interestingly, that also undermines the case for an intelligence explosion. Both 'humans trying to make superintelligent AI safe' and 'an AI undergoing a series of recursive self-improvements' are cases where less intelligent agents are trying to reliably generate agents that meet various abstract criteria (including superior intelligence). The orthogonality thesis, likewise, simultaneously supports the claims 'many possible AI systems won't have humane goals' and 'it is possible for an AI system to have humane goals'. This is why Bostrom/Yudkowsky-type arguments don't uniformly inspire pessimism.
Are you familiar with MIRI's technical agenda? You may also want to check out the AI Impacts project, if you think we should be prioritizing forecasting work at this point rather than object-level mathematical research.
This is why there's a lot of emphasis on hard-to-test ("philosophical") questions in the Sequences, even though people are notorious for getting those wrong more often than scientific questions -- because sometimes [..] the answer matters a lot for our decision-making,
Which is one of the ways in which beliefs that don't pay rent do pay rent.
You are unlikely to see me posting here again, after today. There is a saying here that politics is the mind-killer. My heretical realization lately is that philosophy, as generally practiced, can also be mind-killing.
As many of you know, I am, or was, running a twice-monthly Rationality: From AI to Zombies reading group. One of the things I wanted to include in each reading group post was a collection of contrasting views. To research such views I found myself listening during my commute to talks given by other thinkers in the field, e.g. Nick Bostrom, Anders Sandberg, and Ray Kurzweil, and by people I feel are doing “ideologically aligned” work, like Aubrey de Grey, Christine Peterson, and Robert Freitas. Some of these were talks I had seen before, or views I had generally been exposed to in the past. But looking through the lens of learning and applying rationality, I came to a surprising (to me) conclusion: it was the philosophical thinkers who made the largest and most costly mistakes. De Grey and the others who are primarily working on the scientific and/or engineering challenges of singularity and transhumanist technologies were far less likely to make epistemic mistakes of significant consequence.
Philosophy as the anti-science...
What sort of mistakes? Most often, reasoning by analogy. To cite a specific example: one of the core underlying assumptions of the singularity interpretation of super-intelligence is that, just as a chimpanzee would be unable to predict what a human intelligence would do or how we would make decisions (aside: how would we know? Were any chimps consulted?), we would be equally inept in the face of a super-intelligence. This argument is, however, nonsense. The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect of the natural world which has remained beyond the reach of human understanding, once a sufficient amount of evidence is available. The wave-particle duality of quantum physics or the 11-dimensional space of string theory may defy human intuition, i.e. our built-in intelligence. But we have proven ourselves perfectly capable of understanding the logical implications of models which employ them. We may not be able to build intuition for how a super-intelligence thinks. Maybe—that's not proven either. But even if that is so, we will be able to reason about its intelligent behaviour in advance, just as string theorists are able to reason about 11-dimensional space-time without using their evolutionarily derived intuitions at all.
This post is not about the nature of super-intelligence in a singularity scenario—that was merely my choice of an illustrative example of a category of mistake too often made by those with a philosophical background rather than one in the empirical sciences: reasoning by analogy instead of building and analyzing predictive models. The fundamental problem is that an analogy is not in itself a sufficient explanation of a natural phenomenon, because it says nothing about the context sensitivity or insensitivity of the original example, or about the conditions under which it may or may not hold true in a different situation.
A successful physicist or biologist or computer engineer would have approached the problem differently. A core part of being successful in these fields is knowing when you have insufficient information to draw conclusions. If you don't know what you don't know, then you can't know when you might be wrong. To be an effective rationalist, it is often not important to answer “what is the calculated probability of that outcome?” The better first question is “what is the uncertainty in my calculated probability of that outcome?” If the uncertainty is too high, then the data supports no conclusions. And the way you reduce uncertainty is by building models of the domain in question and empirically testing them.
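To make that distinction concrete, here is a minimal sketch (my own toy illustration, not anything from the original post; the Beta parameters are arbitrary stand-ins for "how much evidence backs this estimate") of how two estimates can share the same point probability while carrying very different uncertainty:

```python
# Toy illustration (assumed parameters, not data from the post): two "beliefs about
# a probability" with the same point estimate but very different uncertainty.
from scipy.stats import beta

weak_evidence = beta(1, 9)        # roughly "almost no observations"
strong_evidence = beta(100, 900)  # roughly "about a thousand observations"

for name, dist in [("weak", weak_evidence), ("strong", strong_evidence)]:
    lo, hi = dist.interval(0.95)  # central 95% credible interval
    print(f"{name:>6}: point estimate = {dist.mean():.2f}, "
          f"95% interval = ({lo:.3f}, {hi:.3f})")
```

Both print a point estimate of 0.10, but the intervals differ enormously; only the second estimate could support confident conclusions.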
The lens that sees its own flaws...
Coming back to LessWrong and the sequences. In the preface to Rationality, Eliezer Yudkowsky says his biggest regret is that he did not make the material in the sequences more practical. The problem is in fact deeper than that. The art of rationality is the art of truth-seeking, and empiricism is part and parcel of truth-seeking. Lip service is paid to empiricism throughout, but in all the “applied” sequences relating to quantum physics and artificial intelligence it appears to be forgotten. Instead we get definitive conclusions drawn from thought experiments alone. It is perhaps not surprising that these sequences seem the most controversial.
I have long been concerned that those sequences in particular promote some ungrounded conclusions. I had thought that, while annoying, this was perhaps a one-off mistake that was fixable. Recently I have realized that the underlying cause runs much deeper: what the sequences teach is a flawed form of truth-seeking (thought experiments favored over real-world experiments), which inevitably results in errors, and the errors I take issue with in the sequences are merely examples of this phenomenon.
And these errors have consequences. Every single day, 100,000 people die of preventable causes, and every day we continue to risk extinction of the human race at unacceptably high odds. There is work that could be done now to alleviate both of these issues. But within the LessWrong community there is actually outright hostility to work that has a reasonable chance of alleviating suffering (e.g. artificial general intelligence applied to molecular manufacturing and life-science research) due to concerns arrived at by flawed reasoning.
I now regard the sequences as a memetic hazard, one which may at the end of the day be doing more harm than good. One should work to develop one's own rationality, but I now fear that the approach the LessWrong community has taken as a continuation of the sequences may result in more harm than good. The anti-humanitarian behaviors I observe in this community are not the result of its initial conditions but of the process itself.
What next?
How do we fix this? I don't know. On a personal level, I am no longer sure engagement with such a community is a net benefit. I expect this to be my last post to LessWrong. It may happen that I check back in from time to time, but for the most part I intend to try not to. I wish you all the best.
A note about effective altruism…
One shining light of goodness in this community is the focus on effective altruism—doing the most good for the most people, as measured by some objective means. This is a noble goal, and the correct goal for a rationalist who wants to contribute to charity. Unfortunately it too has been poisoned by incorrect modes of thought.
Existential risk reduction, the argument goes, trumps all other forms of charitable work because reducing the chance of extinction by even a small amount has far more expected utility than accomplishing all other charitable works combined. The problem lies in estimating the likelihood of extinction, and in selecting actions that actually reduce existential risk. There is so much uncertainty regarding what we know, and so much uncertainty regarding what we don't know, that it is impossible to determine with any accuracy the expected risk of, say, unfriendly artificial intelligence creating perpetual suboptimal outcomes, or what effect charitable work in the area (e.g. MIRI) has in reducing that risk, if any.
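As a toy illustration of how sensitive that expected-value argument is to inputs we can barely estimate, here is a minimal sketch with made-up placeholder numbers (the population figure, the baseline-charity figure, and the candidate probability reductions are all my own assumptions, not claims from the post):

```python
# Hedged sketch with placeholder numbers: the conclusion flips depending entirely on
# the assumed reduction in extinction probability, which is the quantity we know least.
lives_at_stake = 8e9           # rough current world population (assumption)
baseline_charity_lives = 1e6   # lives a conventional charity program might save (assumption)

for delta in (1e-2, 1e-6, 1e-10):  # assumed reductions in extinction probability
    expected_lives = delta * lives_at_stake
    beats_baseline = expected_lives > baseline_charity_lives
    print(f"delta = {delta:.0e}: expected lives saved = {expected_lives:,.0f} "
          f"(beats baseline: {beats_baseline})")
```

The arithmetic is trivial; the difficulty is that nothing in it tells us which value of delta, if any, a given donation actually buys.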
This is best explored by an example of existential risk done right. Asteroid and cometary impacts are perhaps the category of external (non-human-caused) existential risk that we know the most about, and have done the most to mitigate. When it was recognized that impactors were a risk to be taken seriously, we recognized what we did not know about the phenomenon: What were the orbits and masses of Earth-crossing asteroids? We built telescopes to find out. What is the material composition of these objects? We built space probes and collected meteorite samples to find out. How damaging would an impact be for various material properties, speeds, and incidence angles? We built high-speed projectile test ranges to find out. What could be done to change the course of an asteroid found to be on a collision course? We have executed at least one impact probe and will monitor the effect it had on the comet's orbit, and we have probes on the drawing board that will use gravitational mechanisms to move their target. In short, we identified what it is that we don't know and sought to resolve those uncertainties.
How then might one approach an existential risk like unfriendly artificial intelligence? By identifying what it is we don't know about the phenomenon, and seeking to experimentally resolve that uncertainty. What relevant facts do we not know about (unfriendly) artificial intelligence? Well, much of our uncertainty about the actions of an unfriendly AI could be resolved if we knew more about how such agents construct their thought models, and, relatedly, what languages are used to construct their goal systems. We could also stand to benefit from more practical information (experimental data) about the ways in which AI boxing works and the ways in which it does not, and how much that depends on the structure of the AI itself. Thankfully there is an institution doing that kind of work: the Future of Life Institute (not MIRI).
Where should I send my charitable donations?
Aubrey de Grey's SENS Research Foundation.
100% of my charitable donations are going to SENS. Why they do not get more play in the effective altruism community is beyond me.
If you feel you want to spread your money around, here are some non-profits which I have vetted for doing reliable, evidence-based work on singularity technologies and existential risk:
I wish I could recommend a skepticism-, empiricism-, and rationality-promoting institute. Unfortunately I am not aware of an organization which does not suffer from the flaws I identified above.
Addendum regarding unfinished business
I will no longer be running the Rationality: From AI to Zombies reading group, as I am no longer able or willing, in good conscience, to host it or to participate in this site, even from my typically contrarian point of view. Nevertheless, I am enough of a libertarian that I feel it is not my role to put up roadblocks for others who wish to delve into the material as it is presented. So if someone wants to take over the role of organizing these reading groups, I would be happy to hand over the reins to that person. If you think that person should be you, please leave a reply in another thread, not here.
EDIT: Obviously I'll stick around long enough to answer questions below :)