The linked document provides my summaries for most core readings and many further readings of the alignment fundamentals curriculum composed by Richard Ngo, as accessed from July to early September 2022. Additionally, it often contains my preliminary opinions on the texts. Note that I’m not an expert on the topic.
I read all the texts while working full-time on something unrelated to AI alignment. Due to these time constraints, many summaries probably contain mistakes, and my opinions would likely change upon further reflection.
Nevertheless, I was told that these summaries are useful, and therefore I’m sharing them with the wider community of people interested in alignment.
If anyone wants to contribute their own summary, please add it as a suggestion in the Google Doc, and I will accept it with an attribution to the (optionally anonymous) author.
Acknowledgments: I want to thank Albert Garde, Benjamin Kolb, Fritz Dorn, Jens Brandt, and Tom Lieberum for discussions on the curriculum.
If you do AI policy, this is a great way to quickly skill up at explaining alignment, and also to quickly skill up on AI itself.
So far, the summaries have only been "tested" by people who have worked through the whole curriculum themselves. They used the summaries to check their understanding of the articles and to contrast their views with mine.
So I'm not yet confident that someone could just read my summaries without at the same time going through the full articles, but it seems worth a try.
Thank you for writing this. I plan on working through it over the next couple weeks to fill in gaps in my previous alignment-related knowledge.