Intricacies of Feature Geometry in Large Language Models — LessWrong