Eval-unawareness ≠ Eval-invariance
New frontier models have become eval-aware, which puts the utility of evals at risk. But what do people really mean when they say “a model is eval-aware”? In this post, I try to disentangle this statement and offer my views on the different concepts it entails. When people...
Dec 5, 2025