Experiments on Refusal Shape in LLMs — LessWrong