Hi, I've been asked to recommend a couple of short introductions/overviews about the key issues in AI safety and AI alignment. This will be for the 'Philosophy, Politics, & Economics' (PPE) major at Oxford University - which trains some of the brightest undergrads in Britain, many of whom go...
Lately I’ve been trying to raise awareness of AI risks among American conservatives. Stopping the reckless development of advanced AI agents (including Artificial General Intelligence (AGI) and Artificial Superintelligence (ASI)) should be a human issue, not a partisan issue. Yet most people working in AI safety advocacy lean to the...
(Note: this was also published on EA Forum, May 26, 2023, here; this is a somewhat rough-and-ready essay, which I'll expand and deepen later; I welcome any comments, reactions, and constructive criticism.) Overview: To clarify how we might align AI systems with humans, it might be helpful to consider how...
Note 1: This was posted to EA Forum on May 31 here, as a submission for the 2023 Open Philanthropy AI Worldviews contest, due May 31, 2023. It addresses Question 1: “What is the probability that AGI is developed by January 1, 2043?” Note 2: I'm developing a series...
Overview (TL;DR): Shard Theory is a new approach to understanding the formation of human values, which aims to help solve the problem of how to align advanced AI systems with human values (the ‘AI alignment problem’). Shard Theory has provoked a lot of interest and discussion on LessWrong, AI Alignment...
Note: This essay was published here on EA Forum on Sept 21, 2022. The description of the brain-over-body biases in the EA subculture may or may not apply to the Rationalist subculture in LessWrong. This essay builds upon this essay on the heterogeneity of human value types. Overview Most AI...
Note: This essay was originally posted on EA Forum here on Sept 16. I’d welcome comments from LessWrong readers and AI Alignment Forum experts. I have posted some related essays on EA Forum about the importance for AI alignment of considering corporal/body values, religious values, and the diversity of values...