Can activation verbalizers surface an internal chain of thought?
We introduce an evaluation for activation verbalizers: can they surface a target model's reasoning as it solves a math problem in a single forward pass? For open-weight NLAs, the answer seems to be: "possibly, but definitely not reliably". Lots of important capabilities currently require AI models to reason "out loud"...
Jun 7