Sycophancy Under Pressure: A Small, Controlled Probing Study on Open Models
A bit of personal context: I am doing this on a Windows 10 machine with 8 GB of RAM, working mostly in PowerShell, with the goal of building real, demonstrable interpretability work without a formal ML background. If anything below looks beginner-level, it probably is. TL;DR: I trained linear probes on the residual stream...
Jun 211