Suppose we could look into the future of our Everett branch and pick out those sub-branches in which humanity and/or human/moral values have survived past the Singularity in some form. What would we see if we then went backwards in time and looked at how that happened? Here's an attempt to answer that question, or in other words to enumerate the not completely disastrous Singularity scenarios that seem to have non-negligible probability. Note that the question I'm asking here is distinct from "In what direction should we try to nudge the future?" (which I think logically ought to come second).
- Uploading first
  - Become superintelligent (self-modify or build FAI), then take over the world
  - Take over the world as a superorganism
    - self-modify or build FAI at leisure
    - (Added) stasis
  - Competitive upload scenario
    - (Added) subsequent singleton formation
    - (Added) subsequent AGI intelligence explosion
    - no singleton
- IA (intelligence amplification) first
  - Clone a million von Neumanns (probably government project)
  - Gradual genetic enhancement of offspring (probably market-based)
  - Direct brain/computer interface
  - What happens next? Upload or code?
- Code (de novo AI) first
  - Scale of project
    - Large Corporation
    - Small Organization
  - Secrecy - spectrum between
    - totally open
    - totally secret
  - Planned Friendliness vs "emergent" non-catastrophe
    - If planned, what approach?
      - "Normative" - define decision process and utility function manually
      - "Meta-ethical" - e.g., CEV
      - "Meta-philosophical" - program the AI to do philosophy
    - If emergent, why?
      - Objective morality
      - Convergent evolution of values
      - Acausal game theory
      - Standard game theory (e.g., Robin's idea that AIs in a competitive scenario will respect human property rights due to standard game-theoretic considerations)
  - Competitive vs. local FOOM
- (Added) Simultaneous/complementary development of IA and AI
Sorry if this is too cryptic or compressed. I'm writing this mostly for my own future reference, but perhaps it could be expanded more if there is interest. And of course I'd welcome any scenarios that may be missing from this list.
Another scenario: (5) near-lightspeed dispersal. Pre-Singularity humanity colonizes the universe (or just flies away in every direction) at near lightspeed, so that whatever the outcome of the Singularity on Earth, or of secondary singularities on some of the ships, it never wipes out the whole diaspora.
The competitive upload scenario allows for subsequent singleton formation, as well as subsequent AGI intelligence explosion.
Thanks, I've added those sub-scenarios to the list.
There's a subscenario of c.ii that I think is worth considering: there turns out to be some good theoretical reason why even an AGI with access to and full-stack understanding of its own source code cannot FOOM - a limit of some sort on the rate of self-improvement. (Or is this already covered by D?)
In this subscenario, does the AGI eventually become superintelligent? If so, don't we still need a reason why it doesn't disassemble humans at that point, which might be A, B, C or D?
XiXiDu seemed to place importance on the possibility of "expert systems" that don't count as AGI beating the general intelligence in some area. Since we were discussing risk to humanity, I take this to include the unstated premise that defense could somehow become about as easy as offense if not easier. (Tell us if that seems wrong, Xi.)
I guess that is D I'm thinking of.
We figure out what sort of stuff brains are made of and can make more of it in machines, but cannot scan brains well. We make machines of this stuff which are non-goal-seeking but are hooked into a human's dopamine system in some way. Slowly we create higher- and higher-bandwidth connections between the two. The machine part is treated as a semi-black box for understanding things, much like the motor cortex is treated as a black box for moving our bodies. This can eventually lead to uploading.
Also related: it may be theoretically better to have many agents with many goals, to avoid getting stuck in local optima or with crazy ideas. See the greater-than-linear speedup of parallel GAs.
Edit: To expand my second point: I don't think that groups of smaller intelligences will always beat larger ones, simply that I would expect a mixed set of uploads to do better than a monoculture of clones. The mixed set will consider different things important and explore different parts of concept space, and some will get lucky with fundamental breakthroughs. Clones won't do well if, for example, they all get obsessed with P vs. NP and it turns out to be undecidable.
I can conceive of AI designs that manage to keep diversity of concepts explored somehow, but I don't think it will come naturally from uploading.
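The point about diversity escaping local optima can be illustrated with a toy search experiment (a sketch of my own, not taken from the parallel-GA literature): identical greedy searchers all converge to the same local peak, while a diverse set of starting points reaches the global one.

```python
# Toy illustration (my construction): a "monoculture" vs. diverse searchers
# on a multimodal fitness landscape with a local peak at x=20 (height 10)
# and a global peak at x=80 (height 15).

def fitness(x):
    return max(10 - abs(x - 20) * 0.5, 15 - abs(x - 80) * 0.5, 0)

def hill_climb(start, steps=200):
    """Greedy local search: move to a better neighbour while one exists."""
    x = start
    for _ in range(steps):
        for nbr in (x - 1, x + 1):
            if 0 <= nbr <= 99 and fitness(nbr) > fitness(x):
                x = nbr
    return x

# Ten "clones" that all start from the same point near the local peak...
clones = [hill_climb(25) for _ in range(10)]
# ...versus ten searchers spread across the whole space.
mixed = [hill_climb(s) for s in range(0, 100, 10)]

print(max(fitness(x) for x in clones))  # 10.0 -- everyone stuck on the local peak
print(max(fitness(x) for x in mixed))   # 15.0 -- someone found the global peak
```

This is only a caricature of the parallel-GA result, but it shows the mechanism: diverse starting points (or diverse "interests") buy coverage of the search space that identical greedy searchers lack.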
Two years later: do you really think objective morality has non-negligible probability? Or convergent evolution of values (without actual evolution), for that matter? (I'm having trouble imagining a reproduction-with-modification AI scenario, but maybe there is one.) Personally, I would be beyond shocked if either of those turned out to be the case.
On 2e: my naive intuition is that for it to count as a "true" IA singularity and not just one of the other kinds of singularity helped along by IA, it'd have to be of the form "Brain-computer interfaces means the minds have both a wetware and a software part, the software part grows much more quickly, and after a while you can just remove the wetware without much of value being lost." or something like that.
A fascinating thought. Do you assign more than negligible probability to this being true? Do you plan to elaborate on this point and others? I assume that it would be stupid to hope for some sort of "emergent friendliness", but it is nonetheless a very interesting speculation.
I think he was referring to something like this amazing story by Yvain. We have no idea if that's how negotiations between rational agents should work, but it's a possibility.
Upvote for story link :)
I haven't seen that story before but it is excellent and intriguing. Has there been any prior discussion of it you could link to?
I got it either here or here, but neither has a discussion. The links in Wei Dai's reply cover the same subject matter, but do not make direct reference to the story.
As I see nowhere else particularly to put it, here's a thought I had about the agent in the story, and specifically whether the proposed system works if not all other entities subscribe to it.
There is a non-zero probability that there exists, or could exist, an AI that does not subscribe to the outlined system of respecting other AIs' values. It is equally probable that this AI was created before me or after me. Given this, if it already exists I can have no defence against it. If it does not yet exist I am safe from it, but must act as much as possible to prevent it being created, as it would prevent my values being established. Therefore I should eliminate all other potential sources of AI.
[I may retract this after reading up on some of the acausal game theory stuff if I haven't understood it correctly. So apologies if I have missed something obvious]
I think you might be right; it is very unlikely that all civilizations get AI right enough for all the AIs to understand acausal considerations. I don't know why you were downvoted.
Does the fact of our present existence tell us anything about the likelihood for a human-superior intelligence to remain ignorant of acausal game theory?
Anthropically, UDT suggests that a variant of SIA should be used [EDIT - depending on your ethics]. I'm not sure what exactly that implies in this scenario. It is very likely that humans could program a superintelligence that is incapable of understanding acausal causation. I trust that far more than I trust any anthropic argument with this many variables. The only reasonably likely loophole here is if anthropics could point to humanity being different than most species so that no other species in the area would be as likely to create a bad AI as we are. I cannot think of any such argument, so it remains unlikely that all superhuman AIs would understand acausal game theory.
Depending on your preferences about population ethics, and the version of the same issues applying to copies. E.g. if you are going to split into many copies, do you care about maximizing their total or their average welfare? The first choice will result in SIA-like decision making, while the latter will result in SSA-like decision making.
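A toy calculation (my own illustration, not from the thread) shows how the total/average choice yields SIA-like or SSA-like betting: suppose a fair coin creates two copies of you on heads and leaves one copy on tails, and every copy is then offered the same bet, paying +1 utilon on heads and -1.5 on tails.

```python
# Hypothetical setup: heads -> you are split into 2 copies, tails -> 1 copy.
# Every copy is offered a bet: +1.0 utilon if heads, -1.5 if tails.
p = {"heads": 0.5, "tails": 0.5}
copies = {"heads": 2, "tails": 1}
payoff = {"heads": 1.0, "tails": -1.5}

# Total welfare: each branch is weighted by how many copies receive the
# payoff, which reproduces SIA-like ("thirder") betting.
ev_total = sum(p[w] * copies[w] * payoff[w] for w in p)

# Average welfare: branches are weighted by branch probability alone,
# which reproduces SSA-like ("halfer") betting.
ev_average = sum(p[w] * payoff[w] for w in p)

print(ev_total)    # 0.25  -> the total-welfare maximizer accepts the bet
print(ev_average)  # -0.25 -> the average-welfare maximizer declines it
```

The same bet is worth taking under one population ethic and not the other, which is the SIA/SSA divergence in miniature.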
Somewhat OT: the last part seems wrong at first glance. I hope that feeling only reflects my biases, because the argument could explain why an entity capable of affecting distant Everett branches -- say, in order to prevent the torture of sufficiently similar entities -- would have left no visible trace on our own timeline so far.
(In the cheerful interpretation, which assumes something like Mangled-Worlds, said entity has enough fine control to not only change branches that would otherwise lead to no entity-level intelligence, but also to undetectably backup memories from every branch that would discard unique memories.)
It seems on second look as if the argument might work -- at least if the entity doesn't refuse to update on the existence of someone who can form a belief based on the evidence, but only ignores some other facts about that entity. Then the answer it arrives at seems to have a clearer meaning.
I'm not sure what you mean by most of this. What is the "last part" and why does it seem wrong? Why do you hope that the idea presented in this story is correct? There seem to be too many factors to determine whether it would be better than some unknown alternative. What does this have to do with many worlds/mangled worlds? The story would still work in a classical universe.
The story makes two claims about decision theory. One of them explains the ending, hence "the last part". This claim leads to odd questions which cast doubt on it. (Note that the linked post itself links to a better trap in the comments.)
If the argument does work, it would apply to hypothetical entities with certain (highly desirable) powers in some forms of Many-Worlds. By "something like Mangled-Worlds" I meant a theory that restricts the set of Everett branches containing intelligent observers. Such a theory might assign P=0 to a particular branch producing any entity of the relevant power level. This could make the story's argument relevant to our hypothetical branch-editor.
Can you spell out the two claims?
The first objection in the post holds; my decisions are not acausally connected to those of my ancestors in a way that would provide a valid reason to act differently. How I respond to that LW post is a question that never came up in the ancestral environment; only decisions not caused by thinking about decision theory can control whether I exist.
In this specification of transparent Newcomb, one-boxing is correct.
So you're saying that it could also explain the equivalent Fermi-like paradox that asks why beings with Everett-branch jumping powers haven't interfered with us in any way? I agree that, if it explains the Fermi paradox it applies to this scenario too, but I think it is much more likely that Everett-branch jumping is just impossible, as it is according to our current understanding of QM.
Yes, the argument would only remove a reason for seeing this as a strict logical impossibility (for us).
1: Sufficiently smart AGI precommits to cooperate with every other super-intelligence it meets that has made a similar precommitment. This acausally ensures that a big set of super-minds will cooperate with the AGI if they meet it, thereby producing huge tracts of expected value.
2: The AGI also precommits to cooperate with some super-minds that don't exist yet, by leaving their potential creators alone -- it won't interfere in the slightest with any species or star system that might produce a super-mind. This protects the AGI from counterfactual interference that would have prevented its existence, and more importantly, protects it from retaliation by hypothetical super-minds that care about protection from counterfactuals.

2.1: It does not precommit to leaving its own creators alone so that they have a chance to create paperclip-maximizers in all shapes and sizes. The AGI's simulation of a stronger mind arose before any other super-mind, and knows this holds true for its own planet -- so the sim does not care about the fate of counterfactual future rivals from said planet. Nor does the AGI itself perceive a high expected value in negotiating with people it decided to kill before it could start modelling them.
As for the problem with #2, while I agree that the trap in the linked OP fails, the one in the linked comment seems valid. You still have to bite the bullet and accommodate the whims of parents with unrealistically good predictive abilities, in this hypothetical. (I guess they taught you about TDT for this purpose.) Or let's say that branch-jumping works but the most cheerful interpretation of it does not -- let's say you have to negotiate acausally with a misery-maximizer and a separate joy-minimizer to ensure your existence. I don't know exactly how that bullet would taste, but I don't like the looks of it.
It could make this precommitment before learning that it was the oldest on its planet. Even if it did not actually make this precommitment, a well-programmed AI should abide by any precommitments it would have made if it had thought of them; otherwise it could lose expected utilons when it faces a problem that it could have made a precommitment about, but did not think to do so.
That scenario is equivalent to counterfactual mugging, as is made clearer by the framework of UDT, so this bullet must simply be bitten.
What implications do you draw from this? I can see how it might have a practical meaning if the AI considers a restricted set of minds that might have existed. But if it involves a promise to preserve every mind that could exist if the AI does nothing, I don't see how the algorithm can get a positive expected value for any action at all. Seems like any action would reduce the chance of some mind existing.
(I assume here that some kinds of paperclip-maximizers could have important differences based on who made them and when. Oh, and of course I'm having the AI look at probabilities for a single timeline or ignore MWI entirely. I don't know how else to do it without knowing what sort of timelines can really exist.)
Some minds are more likely to exist and/or have easier-to-satisfy goals than others. The AI would choose to benefit its own values and those of the more useful acausal trading partners at the expense of the values of the less useful acausal trading partners.
Also the idea of a positive expected value is meaningless; only differences between utilities count. Adding 100 to the internal representation of every utility would result in the same decisions.
It's not my original idea. See comments by Carl Shulman and Vladimir Nesov. Gary Drescher also mentioned in conversation a different way in which acausal considerations might lead superintelligent AIs to treat us ethically. I'm not sure if he has written about it anywhere. (ETA: See page 287 of his book.)
From my point of view, any singularity scenario would be disastrous.
Could you explain? Or, if you already have, link to an article or comment where you did so?
Intelligence amplification and artificial intelligence are listed as separate scenarios.
Today, we have both machine-amplified human intelligence and machine intelligence - and that situation is likely to persist until we have intelligent machines that are roughly as smart as humans. Intelligence augmentation and machine intelligence are complementary - and are not in competition - as I explain here. Both processes are significant, it seems. Machines are built to compensate for our weaknesses. Such scenarios are the ones we are going to get - and they do not seem to fit into the categorisation scheme very well.
I think the natural way to classify that is to look at when the pure machine intelligences exceed the augmented humans in aggregate intelligence/power/wealth. If it happens at significantly higher than baseline human level intelligence, then I'd classify that as IA first, otherwise I'd classify it as upload or code first depending on the nature of the machine intelligences. (And of course there will always be "too close to call" cases.)
So: by far the most important human augmentation in the future is going to involve preprocessing sensory inputs using machines, post-processing motor outputs by machines, and doing processing that bypasses the human brain entirely. Not drugs, or education, or anything else.
In such scenarios, the machines won't ever really "overtake" the augmented humans, they will just catch up with them. So, for instance, a human with a robot army is not functionally very much different from a robot army. Eventually the human becomes unnecessary and becomes a small burden, but that hardly seems very significant. So: the point that you are talking about seems to be far future, difficult to measure, and seems to me to be inappropriate as the basis of a classification scheme.
I realized that comparing machines with augmented humans on an individual level doesn't make much sense, and edited in "in aggregate intelligence/power/wealth", but apparently after you already started your reply. Does the new version seem more reasonable?
As I see it, your proposed classification scheme perpetuates the notion that intelligence augmentation and machine intelligence are alternatives to each other. If you see them as complementary, using the distinction between them as the basis of a classification scheme makes little sense.
Yes, it would be fun if there was some kind of viable intelligence augmentation-only way forwards - but that idea just seems delusional to me. There's no such path. Convergence means nanotech and robotics converge. It also means that intelligence augmentation and machine intelligence converge.
The fact that they are complementary doesn't exclude the possibility that one could occur earlier than the other. For example do you think there is negligible chance that genetic engineering or pharmaceuticals could significantly improve human intelligence before machine intelligence gets very far off the ground? Or that roughly baseline humans could create (or stumble onto) a recursively-improving AI?
On the other hand, the "too close to call" case does seem to deserve its own category, so I've added one. Thanks!
Embryo selection technology already exists, all that's needed is knowledge of the relevant alleles to select for, which should be forthcoming shortly given falling sequencing costs. Within 30 years we should see the first grown-up offspring of such selection. The effect will be greatly amplified if stem cell technology makes it possible to produce viable gametes from stem cells, which the Hinxton report estimates to be within a decade too.
Germ-line genetic engineering is almost totally impotent - since it is too slow. Gene therapy is potentially faster - but faces considerably more technical challenges. It is also irrelevant, I figure.
Pharmaceuticals might increase alertness or stamina, but their effects on productivity seem likely to be relatively minor. People have been waiting for a pharmaceutical revolution since the 1960s. We already have many of the most important drugs, it is mostly a case of figuring out how best to intelligently deploy them.
The main player on the intelligence augmentation front that doesn't involve machines very much is education - where there is lots of potential. Again, this is not really competition for machine intelligence. We have education now. It would make little sense to ask if it will be "first".