x
Trained steering vectors may work as activation oracles — LessWrong