I used this repo to partially replicate Anthropic's Emotion Concepts paper in a day
tl;dr: This post introduces traitinterp, the repo I used to partially replicate Anthropic's Emotion Concepts paper on Llama 3.3 70B Instruct. github.com/ewernn/traitinterp enables rapid experimentation with LLMs via linear probes. The Emotion Concepts replication write-up is available here, and the replication guide is here.
Figure 0: Screenshot from replication write-up (btw, the...
Apr 21