Thanks to the Open Source Mechanistic Interpretability Slack for their feedback.

When going through Neel's excellent 200 Concrete Problems In Interpretability sequence, I found myself thinking "It sure would be great to have a single document I could go through to see all the problems, and ideally sort them by difficulty as well."

So I made a spreadsheet! Problems are numbered by category, a feature which should be added to the main sequence fairly soon as well.

The primary purpose of the spreadsheet is to let people quickly scan for problems that interest them before diving in more deeply. The secondary purpose is to help people avoid duplicating work and find collaborators. If you know of work that's already been done on some of these problems, or you're actively working on one and want help, please leave a comment on the sheet so I can add it in!
