(Continuing the posting of select posts from Slate Star Codex for comment here, for the reasons discussed in this thread, and as Scott Alexander gave me - and anyone else - permission to do with some exceptions.)

Scott recently wrote a post called No Time Like The Present For AI Safety Work. It makes the argument for the importance of organisations like MIRI as follows, then explores the last two premises:

1. If humanity doesn’t blow itself up, eventually we will create human-level AI.

2. If humanity creates human-level AI, technological progress will continue and eventually reach far-above-human-level AI.

3. If far-above-human-level AI comes into existence, eventually it will so overpower humanity that our existence will depend on its goals being aligned with ours.

4. It is possible to do useful research now which will improve our chances of getting the AI goal alignment problem right.

5. Given that we can start research now, we probably should, since leaving it until there is a clear and present need for it is unwise.

I placed very high confidence (>95%) on each of the first three statements – they're just saying that if trends keep moving in a certain direction without stopping, eventually they'll get there. I had lower confidence (around 50%) on the last two statements.

Commenters tended to agree with this assessment; nobody wanted to seriously challenge any of 1-3, but a lot of people said they just didn’t think there was any point in worrying about AI now. We ended up in an extended analogy about illegal computer hacking. It’s a big problem that we’ve never been able to fully address – but if Alan Turing had gotten it into his head to try to solve it in 1945, his ideas might have been along the lines of “Place your punch cards in a locked box where German spies can’t read them.” Wouldn’t trying to solve AI risk in 2015 end in something equally cringeworthy?

As always, it's worth reading the whole thing, but I'd be interested in the thoughts of the LessWrong community specifically.


13 comments

I think Scott's argument is totally reasonable, well-stated and I agree with his conclusion. So it was pretty dismaying to see how many of his commenters are dismissing the argument completely, making arguments which were demolished in Eliezer's OB sequences.

Some familiar arguments I saw in the comments:

  1. Intelligence, like, isn't even real, man.
  2. If a machine is smarter than humans, it has every right to destroy us.
  3. This is weird, obviously you are in a cult.
  4. Machines can't be sentient, therefore AI is impossible for some reason.
  5. AIs can't possibly get out of the box, we would just pull the plug.
  6. Who are we to impose our values on an AI? That's like something a mean dad would do.

There are also better arguments, like:

"We wouldn't build a god AI and put it in charge of the world"

"We would make some sort of attempt at installing safety overrides"

"Tool AI is safer and easier, and easier to make safe, and wouldn't need goals to be aligned with ours"

"We'll be making ourselves smarter in parallel"

I think premise 1 is very misleading, because while most people agree with it, hypothetically a person might assign a 99% chance of humanity blowing itself up before strong AI, and a <1% chance of strong AI before the year 3000. Surely even Scott Alexander will agree that this person may not want to worry about AI right now (unless we get into Pascal's mugging arguments).

I think most of the strong AI debate comes from people believing in different timelines for it. People who think strong AI is not a problem think we are very far from it (at least conceptually, but probably also in terms of time). People who worry about AI are usually pretty confident that strong AI will happen this century.

In my experience the timeline is not usually the source of disagreement. Skeptics usually don't believe that an AI would want to hurt humans, and don't think the paperclip maximizer scenario is likely or even possible. E.g. this popular reddit thread from yesterday.

I guess that would be premise number 3 or 4, that goal alignment is a problem that needs to be solved.

Yeah, you're probably right. I was probably just biased because the timeline is my main source of disagreement with AI danger folks.

My reading of that article is:

"I am stumping for my friends."

So are you claiming he doesn't really believe his argument?

I am saying he wrote that article because his friends asked him to. You are asking the wrong person about Scott's beliefs.

I wasn't asking you about his beliefs; I was asking what implication you were making. We already know what Scott says he believes; unless you doubt he is being honest, there is no reason to assume he is stumping for his friends rather than advocating his own beliefs.

I am not sure what you are asking. I don't think Scott is an evil mutant, he wouldn't just cynically lie, I don't think. AI risk is not one of his usual blog topics, however.

I think you are underestimating the degree to which personal truth is socially constructed, and in particular influenced by friends.

He doesn't talk about AI as often as, say, psychiatry, but he talks about it with some frequency.


In particular, Meditations on Moloch makes it pretty clear that he takes AI seriously.

I don't think such a high estimate for the first statement is reasonable.

Also, the link now leads to a bicameral reasoning article.