Finding "misaligned persona" features in open-weight models — LessWrong