Recommended Reading for Friendly AI Research



This post enumerates texts that I consider (potentially) useful training for making progress on Friendly AI/decision theory/metaethics.

Rationality and Friendly AI

Eliezer Yudkowsky's sequences and this blog can provide a solid introduction to the problem statement of Friendly AI, giving concepts useful for understanding the motivation for the problem and disarming the endless failure modes that people often fall into when trying to consider it.

For a shorter introduction, see

Decision theory

The following book introduces an approach to decision theory that seems to be closer to what's needed for FAI than the traditional treatments in philosophy or game theory:

  • G. L. Drescher (2006). Good and Real: Demystifying Paradoxes from Physics to Ethics (Bradford Books). The MIT Press, first edn.

Another (more technical) treatment of decision theory from the same cluster of ideas:

The following posts on Less Wrong present ideas relevant to this development of decision theory:


The most relevant tool for thinking about FAI seems to be mathematics, since it teaches one to work with precise ideas (mathematical logic in particular). For someone starting from a rusty technical background, the following reading list is one way to begin:

[Edit Nov 2011: I no longer endorse the scope/emphasis of this list, the gaps between entries, or some of its specific entries.]

  • F. W. Lawvere & S. H. Schanuel (1991). Conceptual mathematics: a first introduction to categories. Buffalo Workshop Press, Buffalo, NY, USA.
  • B. Mendelson (1962). Introduction to Topology. College Mathematics. Allyn & Bacon Inc., Boston.
  • P. R. Halmos (1960). Naive Set Theory. Springer, first edn.
  • H. B. Enderton (2001). A Mathematical Introduction to Logic. Academic Press, second edn.
  • S. Mac Lane & G. Birkhoff (1999). Algebra. American Mathematical Society, third edn.
  • F. W. Lawvere & R. Rosebrugh (2003). Sets for Mathematics. Cambridge University Press.
  • J. R. Munkres (2000). Topology. Prentice Hall, second edn.
  • S. Awodey (2006). Category Theory. Oxford Logic Guides. Oxford University Press, USA.
  • K. Kunen (1999). Set Theory: An Introduction To Independence Proofs, vol. 102 of Studies in Logic and the Foundations of Mathematics. Elsevier Science, Amsterdam.
  • P. G. Hinman (2005). Fundamentals of Mathematical Logic. A K Peters Ltd.