Do sparse autoencoders find "true features"? — LessWrong