Labelling, Variables, and In-Context Learning in Llama2
Hi LessWrong! This is my first post here, sharing my first piece of mechanistic interpretability work: a study of in-context learning in Llama2. The idea was to look at what happens when we associate two concepts in the LLM's context, an object (e.g. "red square") and a label (e.g. "Bob"): how is...
Aug 3, 2024