edoinni — LessWrong

7mo

Analysing CoT alignment in thinking LLMs with low-dimensional steering

In recent years, LLM reasoning models have become popular because they can solve tasks beyond the reach of standard models: they extend their prediction window to much longer lengths, which lets them perform more involved computation. As a secondary use of the model’s...