Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small — LessWrong