Lessons from Studying Two-Hop Latent Reasoning — LessWrong