To get provable friendly AI you'll need provable friendly humans.

Not actually true.

Not actually true.

Technically it isn't of course. But I don't expect unfriendly humans not to show me friendly AI but actually implement something else. What I meant is that you'll need friendly humans to not end up with some trickster who takes your money and in 30 years you notice that all he has done is to code some chat bot. There are a lot of reasons that the trustworthiness of the humans involved is important. Of course, provable friendly AI is provable friendly no matter who coded it.

A Brief Overview of Machine Ethics

by lukeprog 1 min read5th Mar 201191 comments


Earlier, I lamented that even though Eliezer named scholarship as one of the Twelve Virtues of Rationality, there is surprisingly little interest in (or citing of) the academic literature on some of Less Wrong's central discussion topics.

Previously, I provided an overview of formal epistemology, that field of philosophy that deals with (1) mathematically formalizing concepts related to induction, belief, choice, and action, and (2) arguing about the foundations of probability, statistics, game theory, decision theory, and algorithmic learning theory.

Now, I've written Machine Ethics is the Future, an introduction to machine ethics, the academic field that studies the problem of how to design artificial moral agents that act ethically (along with a few related problems). There, you will find PDFs of a dozen papers on the subject.