LESSWRONG
Mechanistic Interpretability Puzzles
65 | Mech Interp Puzzle 1: Suspiciously Similar Embeddings in GPT-Neo | Neel Nanda | 9mo | 15
40 | Mech Interp Puzzle 2: Word2Vec Style Embeddings | Neel Nanda | 9mo | 4