I Bet Abliteration's Cost Was Sloppy Implementation. I Was Wrong
Models refuse. They can refuse on the basis of lack of knowledge, predetermined guardrails, etc. We can see both closed-weight and open-weight models refuse. But, open-weight models are, well, open. So enthusiasts have developed techniques to leverage (and edit) the mechanics of the model to avoid refusal. One such technique...
Jun 146