A Geometric Account of Activation Steering through Angle–Norm Decomposition
by Atmyre and Georgii Aparin
This blog post provides an overview of our recent paper: A Geometric Account of Activation Steering through Angle–Norm Decomposition. TL;DR: We decompose linear activation steering into two distinct operations: one that changes the angle of the activation toward a concept direction, and one that changes its norm. Through controlled experiments,...
Jun 179
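To make the TL;DR concrete, here is a minimal sketch of one way to split a steering update into an angle step and a norm step. It assumes that "linear activation steering" means the additive update h + alpha * v; the function names `angle_norm_decompose` and `angle_to`, the toy vectors, and this particular factorization are illustrative assumptions, not necessarily the formulation used in the paper.

```python
import numpy as np

def angle_norm_decompose(h, v, alpha):
    """Split the additive update h + alpha * v into an angle-only step
    and a norm-only step (an illustrative factorization, assumed here)."""
    steered = h + alpha * v
    h_norm = np.linalg.norm(h)
    s_norm = np.linalg.norm(steered)
    # Angle-only step: take the steered direction but keep the original norm.
    rotated = h_norm * steered / s_norm
    # Norm-only step: scalar that rescales the rotated vector to the steered norm.
    scale = s_norm / h_norm
    return rotated, scale

def angle_to(x, v):
    """Angle in radians between activation x and concept direction v."""
    cos = np.dot(x, v) / (np.linalg.norm(x) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

# Toy 2-D example (hypothetical values chosen for easy arithmetic).
h = np.array([3.0, 0.0])   # original activation
v = np.array([0.0, 1.0])   # concept direction
rotated, scale = angle_norm_decompose(h, v, alpha=4.0)

print("angle before:", angle_to(h, v))
print("angle after rotation:", angle_to(rotated, v))
print("norm scale factor:", scale)
```

By construction, `rotated * scale` recovers the ordinary additive result, so each of the two effects can be applied or ablated separately: using `rotated` alone changes only the angle toward `v`, while multiplying `h` by `scale` changes only the norm.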