Quick & minor heads up: the second hyperlink in footnote 1 seems to be broken, and I think it's meant to point to this page instead (though the broken one did lead me to discover Anthropic's cute Error 404 page, so no complaints).
Also, later in the Limitations & Confoundings section, a paragraph ends with an unformatted [^1]. (Specifically, "Recent Anthropic research demonstrates that models can express aligned preferences while reasoning misalignedly internally, a practical reminder that self-report evals have fundamental limits.[^1]") Is that meant to point to the same citation as footnote 1? I'd be really curious to know what you're citing there.