How well do truth probes generalise? — LessWrong