Metin Hasan
Unsupervised Activation Steering: Find a steering vector that best represents any set of text data
Metin Hasan · 4mo

A very interesting idea. But how would you then construct steering vectors for, let's say, politeness, refusal, or particular biases?
