Try training token-level probes — LessWrong