Emergent introspection does not replicate on Llama-3.1-405B
TL;DR: I couldn't replicate Lindsey (2026) (LW post) on Llama-3.1-405B-Instruct: 0/400 trials and 0/80 false positives in the canonical sweep, with the null robust to layer ∈ {70, 84, 90, 100} and to a dense magnitude grid at the suspected crossover. So: how does introspection emerge in models? The optimistic reading of Lindsey's...
May 110