x
Programming Refusal with Conditional Activation Steering — LessWrong