Related Posts: A cynical explanation for why rationalists worry about FAI, A belief propagation graph

Lately I've been pondering the fact that while there are many critics of SIAI and its plan to form a team to build FAI, few of us seem to agree on what SIAI or we should do instead. Here are some of the alternative suggestions offered so far:

  • work on computer security
  • work to improve laws and institutions
  • work on mind uploading
  • work on intelligence amplification
  • work on non-autonomous AI (e.g., Oracle AI, "Tool AI", automated formal reasoning systems, etc.)
  • work on academically "mainstream" AGI approaches or trust that those researchers know what they are doing
  • stop worrying about the Singularity and work on more mundane goals

Given that ideal reasoners are not supposed to disagree, it seems likely that most if not all of these alternative suggestions can also be explained by their proponents being less than rational. Looking at myself and my suggestion to work on IA or uploading, I've noticed that I have a tendency to be initially over-optimistic about some technology and then become gradually more pessimistic as I learn more details about it, so that I end up being more optimistic about technologies that I'm less familiar with than the ones that I've studied in detail. (Another example of this is me being initially enamoured with Cypherpunk ideas and then giving up on them after inventing some key pieces of the necessary technology and seeing in more detail how it would actually have to work.)

I'll skip giving explanations for other critics to avoid offending them, but it shouldn't be too hard for the reader to come up with their own explanations. It seems that I can't trust any of the FAI critics, including myself, nor do I think Eliezer and company are much better at reasoning or intuiting their way to a correct conclusion about how we should face the apparent threat and opportunity that is the Singularity. What useful implications can I draw from this? I don't know, but it seems like it can't hurt to pose the question to LessWrong.




If there are thousands of possible avenues of research, and critics have a noisy lock on truth in the sense of picking a few hundred avenues they like best, then we could easily wind up with all the critics agreeing that strategy X is just a bad idea, but also disagreeing on whether strategy A is better than strategies B or C. So I don't see disagreement among critics as proving much at all other than the critics are not perfect, which they surely would agree with; it doesn't vindicate X. There are so many more wrong research avenues than right ones.

(Imagine we have 10,000 possible research topics, 3 critics who have identified their top 10 strategies, and the critics are guaranteed to identify 'the right' strategy in those 10 but beyond that pick randomly the top 1. If someone picks research topic X which is genuinely wrong, then the critics will almost certainly all agree that that topic is indeed the wrong topic: the number of strategies endorsed by any of them is at most 28 out of the 10,000, and 28 / 10,000 is a pretty small chance for X to get lucky and be one of them. But at the same time, will the 3 critics all rank the same strategy as the top strategy? 1/10 × 1/10 × 1/10 is not great odds either! So even though the critics have an amazing truth-finding ability in being able to shrink 10,000 all the way down to 10, they still may not agree because of their remaining noise.)
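For anyone who wants to check the arithmetic in that thought experiment, here is a minimal Python sketch. The variable names are made up for illustration, and two assumptions are added that the comment does not spell out: each critic's 9 filler picks are uniform over the 9,999 wrong topics and independent of the other critics, and each critic's top pick is uniform over their own shortlist of 10.

```python
from fractions import Fraction

# Numbers taken from the thought experiment; the variable names are illustrative.
N_TOPICS = 10_000   # possible research topics
N_CRITICS = 3       # critics
SHORTLIST = 10      # strategies on each critic's shortlist

# Every shortlist contains the one right strategy; the other 9 slots are
# random picks, so at most 1 + 3 * 9 distinct strategies are endorsed.
max_endorsed = 1 + N_CRITICS * (SHORTLIST - 1)

# Upper bound used in the comment: chance that a wrong topic X is endorsed by anyone.
p_x_endorsed_bound = Fraction(max_endorsed, N_TOPICS)

# Slightly more careful version (assumption: each critic's 9 filler picks are
# uniform over the 9,999 wrong topics and independent across critics).
p_x_on_one_list = Fraction(SHORTLIST - 1, N_TOPICS - 1)
p_x_endorsed = 1 - (1 - p_x_on_one_list) ** N_CRITICS

# Chance that all three critics rank the right strategy first
# (assumption: each critic's top pick is uniform over their own 10).
p_all_pick_right = Fraction(1, SHORTLIST) ** N_CRITICS

print(f"distinct strategies endorsed: at most {max_endorsed}")
print(f"P(X endorsed), upper bound:   {float(p_x_endorsed_bound):.4f}")
print(f"P(X endorsed), independent:   {float(p_x_endorsed):.4f}")
print(f"P(all rank right one first):  {float(p_all_pick_right):.4f}")
```

Under these assumptions both probabilities come out small: a wrong X is very unlikely to be endorsed by anyone, and the critics are also unlikely to agree on a single top strategy, which is the point of the example.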

it doesn't vindicate X

To be clear, I'm not suggesting that the fact that critics of FAI disagree vindicates FAI.

What I am saying is more like this:

It looks like humans trying to answer "How should we face the Singularity?" are so noisy as to be virtually useless. Crap, now what? It's a long shot, but maybe LW has some ideas about how to extract something useful from all the noise?

(Note that we have disagreement not just about what is the best avenue of research, but also about whether any given approach has positive, negative, or negligible expected utility (compared to doing nothing) so we can't even safely say "let's just collectively make enough money to fund the top N approaches" and expect to be doing some good. ETA: Nor can we take the average of people's answers and use that since much of the noise is probably driven by systemic biases which are not likely to cancel out nicely. Nor is it clear how to subtract out the biases since whoever is trying to do that would most likely be heavily biased themselves relative to the strength of the signal they're trying to extract.)

I criticize FAI because I don't think it will work. But I am not at all unhappy that someone is working on it, because I could be wrong or their work could contribute to something else that does work even if FAI doesn't (serendipity is the inverse of Murphy's law). Nor do I think they should spread their resources excessively by trying to work on too many different ideas. I just think LessWrong should act more as a clearinghouse for other, parallel ideas, such as intelligence amplification, that may prevent a bad Singularity in the absence of FAI.

CEV and understanding recursive self-modification. Everything boils down to those two linked disciplines. The CEV is for understanding what we want, and the recursive self-modification is so that whatever is FOOMing doesn't lose sight of CEV while it changes itself. I simply do not trust that anything will FOOM first and then come up with perfect CEV afterwards. By that time it will already be too far removed from humanity. This was, I think, a topic in Eliezer's metaethics sequence. It's the still unanswered question about what to do when you actually have unlimited power, including the power to change yourself.

Every option eventually boils down to a FOOM. This is why CEV and recursive self-modification must be finished before any scenario completes. AI is an artificial life-form FOOM, and may be friendly or unfriendly; uploading is human FOOM, and we already know that they're unfriendly with sufficient power; intelligence amplification is a slower biological FOOM; the first question asked of Oracle/Tool AI will be how to FOOM; and mainstream AGI is trying to build an AI to FOOM, except slower. The only non-FOOM related options are improving laws and institutions, which is already an ethical question, and computer security (and I'm not sure how that one relates to SIAI's mission).

The issue is that both of these are really hard. Arguably every philosopher since ever has been trying to do CEV. Recursive self-modification is hard as well, since humans can barely self-modify our ethical systems as it is. Though, as I understand, CFAR is now working on finding out what it takes to actually change people's minds/habits/ethics/actions.

Edit: But at the end of the day, it doesn't help one bit if SIAI comes up with CEV while the Pentagon or China comes up with uFAI. So starting work on AI is probably a good idea.

uploading is human FOOM, and we already know that they're unfriendly with sufficient power

We do? Consider the Amish, a highly recognizable out-group with very backward technology. Other groups could easily wipe them out and take their stuff, if they so chose. But they seem to be in no particular danger. Now, one can easily come up with explanations that might not apply to uploads: the non-Amish are too diverse to coordinate and press their advantage; their culture overlaps too much with the Amish to make genocide palatable; yada yada. But, why wouldn't those factors also apply to uploads? Couldn't uploads be diverse? Share a lot of culture with bio humans? Etc.

The Amish are surprisingly wealthy, are likely a profit center for their neighbors & the government because they refuse to use government services but still pay taxes, and are not (yet) a disturbingly large proportion of the population.

They are also currently a case of selection bias: there are many countries where recognizable out-groups most certainly have fallen prey to unFriendly humans. (How many Jews are there now in Iran, or Syria, or Iraq? How are the Christians doing in those countries? Do the Copts in Egypt feel very optimistic about their future? Just to name some very recent examples...)

For that matter, when you think of the Amish and the other "Swiss Brethren" religions, why do you think "Pennsylvania" rather than "Switzerland and neighboring countries"? A sect that had to cross oceans to find a state promising religious freedom is our best example of humans' high tolerance for diversity?

Yes, that's a good point, although now that I think about it I don't actually know what happened to the 'original' Amish. The Wikipedia article on the Swiss Brethren mentions a lot of persecution, but it also says they sort of survive as the Swiss Mennonite Conference; regardless, they clearly don't number in the hundreds of thousands or millions like they do in America.

It's not just CEV and recursive self-modification, either. CEV only works on individuals and (many) individuals will FOOM once they acquire FAI. If individuals don't FOOM into FAIs (and I see no reason that they would choose to do so) we need a fully general moral/political theory that individuals can use to cooperate in a post-singularity world. How does ownership work when individuals suddenly have the ability to manipulate matter and energy on much greater scales than even governments can today? Can individuals clone themselves in unlimited numbers? Who actually owns solar and interstellar resources? I may trust a FAI to implement CEV for an individual but I don't necessarily trust it to implement a fair universal political system; that's asking me to put too much faith in a process that won't have sufficient input until it's too late. If every individual FOOMed at the same rate perhaps CEV could be used over all of humanity to derive a fair political system, but that situation seems highly unlikely. The most advanced individuals will want and need the most obscure and strange sounding things and current humans will simply be unable to fully evaluate their requests and the implications of agreeing to them.

I think we have the same sentiment, though we may be using different terminology. To paraphrase Eliezer on CEV, "Not to trust the self of this passing moment, but to try to extrapolate a me who knew more, thought faster, and were more the person I wished I were. Such a person might be able to avoid the fundamental errors. And still fearful that I bore the stamp of my mistakes, I should include all of the world in my extrapolation." Basically, I believe there is no CEV except the entire whole of human morality. Though I do admit that CEV has a hard problem in the case of mutually conflicting desires.

If you hold CEV to be personal rather than universal, then I agree that the SIAI should work on that 'universal CEV', whatever it be named.

I just re-read EY's CEV paper and noticed that I had forgotten quite a bit since the last time I read it. He goes over most of the things I whined about. My lingering complaint/worry is that human desires won't converge, but so long as CEV just says "fail" in that case instead of "become X maximizers" we can potentially start over with individual or smaller-group CEV. A thought experiment I have in mind is what would happen if more than one group of humans independently invented FAI at the same time. Would the FAIs merge, cooperate, or fight?

I guess I am also not quite sure how FAI will actively prevent other AI projects or whole brain simulations or other FOOMable things, or if that's even the point. I guess it may be up to humans to ask the FAI how to prevent existential risks and then implement the solutions themselves.

To add to your list of various alternatives... My personal skepticism re SI is the apparent lack of any kind of "Friendly AI roadmap", or at least nothing that I could easily find on the SI site or here. (Could be my sub-par search skills, of course.)

I hear Eliezer is planning to start writing a sequence on open problems in Friendly AI soon.

That's a different task... I'd expect to see something like "in phase one we plan to do this during this timeframe, next, depending on the outcome of phase one, we plan to proceed along the following lines, which we expect will take from m to n years...", rather than a comprehensive list of all open problems. The latter is hard and time consuming, the former is something that should not take longer than a page written down, at least as a first draft.

I don't think it would be reasonable to develop such a roadmap at this point, given that it would require having a relatively high certainty in a specific plan. But given that it's not yet clear whether the best idea is to proceed on being the first one to develop FAI, or to pursue one of the proposals listed in the OP, or to do something else entirely, and furthermore it's not even clear how long it will take to figure that out, such specific roadmaps seem impossible.

And this vagueness pattern-matched perfectly to various failed undertakings, hence my skepticism.

In my model, it also pattern-matches with "Fundamental research that eventually gave us Motion, Thermodynamics, Relativity, Transistors, etc."

This looks somewhat like what you're asking for, although it does leave a bit to be desired.

No, it does not at all look like a roadmap. This is a roadmap. Concrete measurable goals. The strategic plan has no milestones and no timelines.

Timelines don't work very well on such slippery topics. You work at it until you're done. Milestones are no less necessary, for sure.

Apparently they are still figuring out that roadmap, which is a problem with any research, even the non-academic kind. I assume the sub-topics discussed so far are the central points in question, but if SI keeps some insights secret, well, that is either for the sake of security or a lack of transparency. In some cases, people bother too much about the institution.

Kill all those birds with one stone: Work on understanding and preventing the risks. Here's why that stone kills each bird:

Re: Work on some aspect of building an AI

Even the sequences are against this: "Do not propose solutions until the problem has been discussed as thoroughly as possible." Working on risk should go first. Also:

  • Somebody needs to judge the safety of AIs.

    Including any that you guys would make. If not SIAI, who will do this at all?

  • SIAI can't do both safety and production; it's perverse.

    The people who MAKE the AI should NOT be the same people who JUDGE the AI for the same reason that I would not purchase medical treatments from a doctor who claimed to test them himself but did not do a scientific study. You cannot peer review, independently test or check and balance your own project. Those that want to make an AI should plan to be part of a different organization, either now or in the future.

  • You'll get speed and quality advantages this way.

    If you build the AI first, you will certainly consider safety problems that hadn't been obvious before because you'll be in the thick of all those details and they'll give you new ideas. But those can always be added to the list of safety guidelines at that time. There is no reason to do that part first. If you make safety guidelines first, you can build the AI with safety in mind from the ground up. As you know, reprogramming something that has a flaw in a critical spot can be very, very time-consuming. By focusing on safety first, you will have a speed advantage while coding as well as a quality advantage. Others will make dangerous AIs and be forced to recall them and start over. So, this is a likely advantage.

Re: "improve laws and institutions"

You need to understand the risks thoroughly before you will be able to recommend good laws, and before people will listen to you and push for any legislation. After you understand the risks, you'd need to work on improving laws so that when the interested people go to build their AI there's a legal framework there.

Re: "computer security"

This should be included under "risk research and prevention" because it's definitely a risk. There are likely to be interactions between security and other risks that you'd want to know about while working on the security. It's all connected, and you may not discover these interactions if you don't think about them at the same time.

Re: "stop worrying about the Singularity and work on more mundane goals"

Considering the personalities, capabilities and prior investments of those involved, this simply isn't likely to happen. They need to be ambitious. Ambitious people need the assistance of others who are more specialized in mundane tasks and would be happy to help them so that they can focus on ambitions - we all specialize.

Focusing on risk research and prevention is also the first step to everything else:

  • How will you get funding for something people see as risky?

  • How will you develop AI in a world destroyed by others' AI projects that SIAI didn't take the time to stop?

  • How will SIAI develop credibility and trust if it doesn't prove it's capable of intellectual rigor by doing a thorough job of risk prevention? This entire industry has no trust. Even as an AI project, you'll have no trust for that reason.

  • How will SIAI prove it is effective in the world if it doesn't do something before making an AI such as change some laws, and do risk prevention?

  • Who is going to be there to independently test your AI project if you choose to do that instead?

I don't think the solution is "Do some, not others." I think it is "Do them in the right order." As for which type of AI project to choose, wouldn't it be safer to decide AFTER you research risks as thoroughly as possible?

Additionally, if SIAI chooses to dedicate itself to risk research and prevention and agrees that the AI building activities should be split off into a different group, I'd be interested in doing some volunteer work for the risk research and prevention group, especially regarding preventing an AGI arms race. I think the ideas I explain there or similar ones would be a really good way for you to prove that SIAI is capable of actually doing something, which addresses a common objection to funding SIAI.

See any way to break the above line of reasoning and argue for a different route? If so, I will attempt to resolve those conflicts also.

ideal reasoners are not supposed to disagree

My ideal thinkers do disagree, even with themselves. Especially about areas as radically uncertain as this.

Given that ideal reasoners are not supposed to disagree, it seems likely that most if not all of these alternative suggestions can also be explained by their proponents being less than rational.

Please support this statement with any kind of evidence. From where I sit it looks to be simply an error.

As far as I know values do not come from reason, they are a "given" from the point of view of reason. So if I value paper clips and you value thumbtacks, we can be as rational as all get out and still disagree on what we should do.

Further, I think even on matters of "fact," it is not "ideal reasoners" who do not disagree; rather, it is reasoners who have agreed on one of a few possible methodologies of reasoning. So I think I have seen the statement, something like "rational Bayesians cannot agree to disagree on probability estimates."

Given that ideal reasoners are not supposed to disagree,

Under some non-trivial assumptions.

it seems likely that most if not all of these alternative suggestions can also be explained by their proponents being less than rational.

That sounds pretty much condescending. These suggestions are not all mutually exclusive, and proponents with different values might have different preferences without being "less than rational".

About agreement: for the agreement we need all our evidence to be shareable, and our priors to be close enough. The actual evidence (or hard-to-notice inferences) cited in the Sequences about the possibility of significantly super-human AGI on reasonable hardware is quite limited, and not enough to overcome the difference in priors.

I do think humanity will build slightly super-human AGI, but as usual with computers it will mimic our then-current idea of how the human brain actually works and then be improved as the design allows. In that direction, HTM (as done by Jeff Hawkins via his current Numenta startup) may end up polished into a next big thing in machine learning or a near-flop with few uses.

Also, it is not clear that people will ever get around to building general function-optimizing AI. Maybe executing behaviours will end up being the way to safeguard AI from wild decisions.
