Fact Finding: Do Early Layers Specialise in Local Processing? (Post 5) — LessWrong