LESSWRONG

lisathiergart
563100

Posts

Sorted by New
ActAdd: Steering Language Models without Optimization (95 karma, Ω, 19d, 3 comments)
Open problems in activation engineering (39 karma, Ω, 2mo, 2 comments)
Distillation of Neurotech and Alignment Workshop January 2023 (47 karma, 4mo, 7 comments)
Steering GPT-2-XL by adding an activation vector (392 karma, Ω, 4mo, 94 comments)
Maze-solving agents: Add a top-right vector, make the agent go to the top-right (101 karma, Ω, 6mo, 17 comments)
