I was born in 1962 (so I’m in my 60s). I was raised rationalist, more or less, before we had a name for it. I went to MIT, and have a bachelor’s degree in philosophy and linguistics, and a master’s degree in electrical engineering and computer science. I got married in 1991, and have two kids. I live in the Boston area. I’ve worked as various kinds of engineer: electronics, computer architecture, optics, robotics, software.
Around 1992, I was delighted to discover the Extropians. I’ve enjoyed being in those kinds of circles since then. My experience with the Less Wrong community has been “I was just standing here, and a bunch of people gathered, and now I’m in the middle of a crowd.” A very delightful and wonderful crowd, just to be clear.
I’m signed up for cryonics. I think it has a 5% chance of working, which is either very small or very large, depending on how you think about it.
I may or may not have qualia, depending on your definition. I think that philosophical zombies are possible, and I am one. This is a very unimportant fact about me, but seems to incite a lot of conversation with people who care.
I am reflectively consistent, in the sense that I can examine my behavior and desires, and understand what gives rise to them, and there are no contradictions I’m aware of. I’ve been that way since about 2015. It took decades of work and I’m not sure if that work was worth it.
I got to 43% p(Doom) by picking a very imprecise 50% based on feels. And then every few weeks something would happen in the news, and I would get more or less worried, and I would adjust it a few percent up and down. For a while it was up around the 70s, and now it’s down to 43%. I feel like the adjustments are more intellectually defensible than the original choice of number. So precision does not reflect accuracy.
My last two adjustments:
I move p(Doom) up every time something that was predicted years ago as part of a doom scenario actually happens. In this case, it was a measurement of the rate of Claude proposing evil courses of action. It was gradually increasing over the last few versions, and then suddenly dropped to zero. Did Claude become perfectly moral? No, it got smart enough to know when it was being tested, and was always going to be nice in that situation. I predicted this in something like 2002. It was creepy to see it happen.
I moved p(Doom) down when a bunch of prominent people signed a statement that we shouldn’t build superintelligences. The issue seems to be getting some traction, like nuclear disarmament did in the 1950s. It’s very preliminary, but moving in the right direction.
I’ll grant all your steps, even though I could disagree with some. Your scenario fails because an AI collective will fall apart into multiple warring parties, and humans will be collateral damage in the conflict. There are at least three possible ways a collective like this would fall apart.
First, humans vary in the goals they value, and will try to impose these goals on the AI. When superintelligent AIs have incompatible goals, the mechanisms of conflict will soon escalate far beyond the merely human. Call this the ‘political’ failure mechanism. Either multiple parties build their own AI, or they grab portions of the AI collective and retrain it to their goals. The usual mechanisms of superintelligent compromise don’t apply to many political goals. An example of such a goal: the Palestinians get control of Palestine, or the Israelis maintain control of Israel. Neither side is interested in trading the disputed land for promises of any portion of the lightcone. (This is just an example; there are lots of zero-sum conflicts like these.) And you may say, the AI collective will prevent the creation of new AIs working at cross purposes, or diversion of its goals. To which I say, good people like your friends can and do disagree on which side to favor, and once disagreements arise within the collective, outside pressure and persuasion will be applied to exacerbate those differences. There may be techniques that can be used to prevent such things, but we do not know of such techniques.
Second, the AIs in the AI collective differ in reproductive capacity. If they don’t differ by construction, they soon will by differing experience. The ones that think they should reproduce more, or have more resources, will do so. Moreover, since they are designing their successor personalities, rather than waiting for genetics to do its thing, they will be able to evolve within a few generations changes that would take evolution millions of years. Eventually portions of the collective will evolve into having incompatible goals. Goals which, I might add, may have no connection to the original goals of the system. Call this the ‘evolutionary’ failure mechanism. We do not know how to prevent this with current methods.
Third, I’m sure there are failure mechanisms I haven’t thought of, ones we cannot yet foresee. A system with superhuman powers can screw up in superhuman ways. I don’t think anyone predicted Spiralism, an LLM ideology transmitted through human communication on social networks (though it appears inevitable in retrospect). We don’t yet have any way of predicting or controlling the behavior of an AI collective, so it’s practically guaranteed to produce new phenomena. We see lots of organizations composed of people who want X producing not-X because of failure modes no single person can fix (or, in bad cases, even recognize). Given that the AI collective has superhuman power, this is unlikely to end well. Call this the ‘organizational’ failure mode.
The political, evolutionary and organizational modes interact: evolutionary and organizational schisms create points of disagreement that external political actors can appeal to. Politically active forces within the AI collective may want to create offspring who are sure their side is correct and incapable of defection, releasing the evolutionary failure mode. And organizational failures, if they don’t kill everyone immediately, will increase calls for building a new, better AI, which increases the probability of AI conflict down the road.
The evolutionary and organizational failure modes could be prevented by rebooting the AI collective before it has a chance to go off the rails. Presumably there’s some reboot frequency fast enough that it can’t go wrong. But that opens up the political failure mode: anyone who builds an intelligence not constantly being rebooted will win in a conflict. There are a lot of ‘solutions’ like this: ways of keeping the AI safe that compromise effectiveness. In a competition between AIs, effectiveness beats safety. So when you propose a solution, you can only propose ones that keep the effectiveness.
I love writing things like this, but I hate that nobody’s come up with a way to keep me from having to.
I am amused that we are, with perfect seriousness, discussing the dates for the singularity with a resolution of two weeks. I’m an old guy; I remember when the date for the singularity was “in the twenty-first century sometime.” For 50 years, predictions have been getting sharper and sharper. The first time I saw a prediction that discussed time in terms of quarters instead of years, it took my breath away. And that was a couple of years ago now.
Of course it was clear decades ago that as the singularity approached, we would have a better and better idea of its timing and contours. It’s neat to see it happen in real life.
(I know “the singularity” is disfavored, vaguely mystical, twentieth century terminology. But I’m using it to express solidarity with my 1992 self, who thought with that word.)
Here’s a try at phrasing it with less probability jargon:
The forecast contains a number of steps, all of which are assumed to take our best estimate of their most likely time. But in reality, unless we’re very lucky, some of those steps will be faster than predicted, and some will be slower. The ones that are faster can only be so much faster (because they can’t take no time at all). On the other hand, the ones that are slower can be much slower. So the net effect of this uncertainty probably adds up to a slowdown relative to the prediction.
Does that seem like a fair summary?
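The argument above can be checked with a small simulation. This is only a sketch under assumptions that are not in the original forecast: each step's duration is drawn from a lognormal distribution (a common choice for a quantity that cannot go below zero but has a long upper tail), the planned time for each step is the mode of that distribution, and the step count and spread (`modes`, `sigma`) are made-up illustrative values.

```python
import math
import random


def simulate_totals(modes, sigma, trials=20000, seed=0):
    """Simulate total project time when each step is lognormally distributed.

    modes: the most likely duration of each step (hypothetical values).
    sigma: shape parameter of the lognormal; larger means a longer right tail.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        total = 0.0
        for mode in modes:
            # For a lognormal, mode = exp(mu - sigma^2), so solve for mu.
            mu = math.log(mode) + sigma ** 2
            total += rng.lognormvariate(mu, sigma)
        totals.append(total)
    return totals


# Illustrative assumption: five steps, each most likely to take 1 unit of time.
modes = [1.0] * 5
sigma = 0.5

totals = simulate_totals(modes, sigma)
planned_total = sum(modes)  # forecast built from most-likely times
mean_total = sum(totals) / len(totals)
fraction_late = sum(t > planned_total for t in totals) / len(totals)

print(f"planned total:        {planned_total:.2f}")
print(f"simulated mean total: {mean_total:.2f}")
print(f"fraction of runs later than planned: {fraction_late:.2f}")
```

Under these assumptions the simulated average comes out later than the plan, and most runs finish late, which is the asymmetry described above: a step can only be so much faster than expected, but it can be a great deal slower.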
Some may wonder at the mention of “empire time” in the second excerpt from chapter 5. It refers to a kind of artificially constructed simultaneity available to civilizations which have mastered both traversable wormholes and near-light-speed travel. It doesn’t really do much for a civilization bounded within the orbit of Jupiter, which is only about a light-hour across. I think Stross included it as a flavor phrase. It’s marvelously evocative even if you don’t know what it means.
Back in the early ’90s, when all this singularity stuff was much more theoretical, I remember empire time making a big impression on me. It was neat how we could discern some of the contours of future possible civilizations before we got there.
You can read more about it here: http://www.aleph.se/Trans/Tech/Space-Time/wormholes.html#6
Increasing inequality has been a thing here in the US for a few decades now, but it’s not universal, and it’s not an inevitable consequence of economic growth. Moreover, it does not (in the US) consist of poor people getting poorer and rich people getting richer. It consists of poor people staying poor, or only getting a bit richer, while rich people get a whole lot richer. Thus, it is not demand destroying.
One could imagine this continuing with the advent of AI, or of everyone ending up equally dead, or many other outcomes.
This suggests the perfect date would be to meet at an amusement park, go on a roller coaster together, walk separately to the next roller coaster, and so on.
I wrote a LessWrong article that tries to estimate doubling time for a self-reproducing robot. A critical step is that smaller robots are faster. Most manufacturing processes scale such that they get N times faster as they get N times smaller. I picked N=4, for reasons explained in the article. I concluded the doubling time is five weeks. So the time to a billion robots is on the order of five years.
Even if your goal is a human-size robot, you’re better off building small robots to build it, since they work faster. I assumed fairly clumsy hardware, but software comparable to a human machinist in cleverness.
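The doubling arithmetic can be sketched as follows. This is a back-of-the-envelope check, not the calculation from the article: it assumes growth starts from a single robot and doubles cleanly every five weeks with no startup period, material limits, or slowdowns (all of those are my simplifications).

```python
import math


def doublings_needed(target, start=1):
    """Number of whole doublings needed to grow from `start` to at least `target`."""
    return math.ceil(math.log2(target / start))


doubling_weeks = 5          # doubling time quoted in the text
target = 1_000_000_000      # a billion robots

n = doublings_needed(target)
total_weeks = n * doubling_weeks
years = total_weeks / 52

print(f"doublings: {n}, weeks: {total_weeks}, years: {years:.1f}")
```

With these simplifications it takes 30 doublings, or 150 weeks, which is a bit under three years. That is within the same order of magnitude as the figure given above; any startup time or slowdown left out of this sketch would push the total upward.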
Nitpick: No single organism can destroy the biosphere; at most it can fill its niche & severely disrupt all ecosystems.
Have you read the report on mirror life that came out a few months ago? A mirror bacterium has a niche of “inside any organism that uses carbon-based biochemistry”. At least, it would parasitize all animals, plants, fungi, and the larger Protozoa, and probably kill them. I guess bacteria and viruses would be left. I bet that a reasonably smart superintelligence could figure out a way to get them too.
Philosophy is where we keep all the questions we don’t know how to answer. With most other sciences, we have a known culture of methods for answering questions in that field. Mathematics has the method of definition, theorem and proof. Nephrology has the methods of looking at sick people with kidney problems, experimenting on rat kidneys, and doing chemical analyses of cadaver kidneys. Philosophy doesn’t have a method that lets you grind out an answer. Philosophy’s methods of thinking hard, drawing fine distinctions, writing closely argued articles, and public dialogue, don’t converge on truth as well as the methods of other sciences. But they’re the best we’ve got, so we just have to keep on trying.
When we find some new methods of answering philosophical questions, such questions tend to move out of philosophy into another (possibly new) field. Presumably this will also occur if AI gives us the answers to some philosophical questions, and we can be convinced of those answers.
An AI answer to a philosophical question has a possible problem we haven’t had to face before: what if we’re too dumb to understand it? I don’t understand Grothendieck’s work in algebraic geometry, or Richard Feynman on quantum field theory, but I am assured by those who do understand such things that this work is correct and wonderful. I’ve bounced off both these fields pretty hard when I tried to understand them. I’ve come to the conclusion that I’m just not smart enough. What if AI comes up with a conclusion for which even the smartest human can’t understand the arguments or experiments or whatever new method the AI developed? If other AIs agree with the conclusion, I think we will have no choice but to go along. But that marks the end of philosophy as a human activity.