LESSWRONG

Tommy Xie — LessWrong

Tommy Xie

Run-time Steering Can Surpass Post-Training: Reasoning Task Performance

This project is the outcome of the in-person week at Finnish Alignment Engineering Bootcamp 2025. TL;DR: Reasoning can be represented as a linear direction in language model activations if it is framed correctly, for example by placing it in the memorisation-reasoning duality (Hong et al., 2025). This post presents initial results of steering language models at...

Aug 10, 2025