Eliezer Yudkowsky is a research fellow of the Singularity Institute for Artificial Intelligence, which he co-founded in 2001. He is mainly concerned with the importance of, and the obstacles to, developing a Friendly AI. One such obstacle is the need for a reflective decision theory that would lay a foundation for describing fully recursive self-modifying agents that retain stable preferences while rewriting their own source code. He also co-founded Less Wrong and wrote most of The Sequences, long series of posts dealing with epistemology, AGI, metaethics, rationality and related topics.
- "Cognitive Biases Potentially Affecting Judgment of Global Risks" (2008): A pioneering compilation of cognitive biases (systematic deviations from rationality) that influence our judgment of global catastrophic risks. These are defined as risks with the potential to inflict serious damage to human well-being on a global scale, threatening millions of human lives or more (e.g., volcanic eruptions, pandemics, nuclear accidents, worldwide tyrannies, out-of-control scientific experiments, climate change, cosmic hazards and economic collapse). It is a chapter from a larger volume analyzing those risks.
- "AI as a Positive and Negative Factor in Global Risk" (2008): A chapter from the same book as the previous paper. It analyses possible philosophical and technical failures in the construction of a Friendly AI, which could lead to an Unfriendly AI posing an enormous global risk. It also discusses how a Friendly AI could help reduce some of the other global risks covered in the book. Finally, because a powerful AI could either be a global risk itself or help reduce other risks, it argues that research on the topic is extremely important.
- "Creating Friendly AI" (2001): One of the first articles to address the challenges in designing the features and cognitive architecture required to produce a benevolent ("Friendly") Artificial Intelligence. It also gives one of the first precise definitions of terms such as Friendly AI and Seed AI.
- "Levels of Organization in General Intelligence" (2002): Analyses AGI by decomposing it into five successive levels of functional organization: code, sensory modalities, concepts, thoughts, and deliberation. It also discusses some advantages artificial minds would have, such as the possibility of recursive self-improvement.
- "Coherent Extrapolated Volition" (2004): Presents the difficulties of, and possible solutions for, incorporating friendliness into an AGI. It proposes that making an AGI do what we tell it to could be dangerous, since we don't know what we want. Instead, we should program the AGI to do what we want by predicting what the vector sum of idealized versions of ourselves would want, "if we knew more, thought faster, were more the people we wished we were, had grown up farther together". He calls this the coherent extrapolated volition of humankind, or CEV.
- "Timeless Decision Theory" (2010): Describes timeless decision theory, "an extension of causal decision networks that compactly represents uncertainty about correlated computational processes and represents the decision maker as such a process". It addresses several problems for which Causal Decision Theory or Evidential Decision Theory lacks a plausible solution: Newcomb's problem, Solomon's problem and the Prisoner's Dilemma.
- "Complex Value Systems are Required to Realize Valuable Futures" (2011): Discusses the complexity of value: no simple rule or description sums up all human values. It analyses how this problem makes it difficult to build a valuable future.