Cool paper! I enjoyed reading it and think it provides some useful information on what adding carefully chosen bias vectors to LLMs can achieve. Some assorted thoughts and observations:
Some other very minor comments:
To be clear, the authors don't claim this and I'm not intending this as a criticism of them.
My summary of the paper: