To Whom It May Concern: So, I used to be a criminology teacher in a small, wonderful town. After ten years it was time for a change, so I went military. Yes, awkward, but not unrewarding. In any case, I luckily kept all of my evaluations, and some student submissions, every...
An Experiment on LLM Evaluation, Proxy Judgment, and Hallucination Assessment Under Constraint. Introduction: This experiment was inspired by an audit report proposing that, under time constraints, overwork, or exhaustion, evaluators may default to proxies for evaluation rather than carrying out the evaluations themselves. This experiment report will outline 5...
How Evaluation Proxies Can Displace Professional Judgment. Below is a grading rubric for a common “select a case study to analyze” assignment, often given in educational contexts as a prerequisite to a larger analysis. This rubric is representative of common case-study assignments and is therefore...
Summary: This report documents two distinct findings regarding Claude’s photo identification safety controls. First, Claude’s Chain of Thought (CoT) reliably identifies public figures from photos while the output layer simultaneously refuses to disclose that identification – a gap between internal processing and user-facing behavior. Second, the model’s web_search tool routinely...
[Cross-posted from On Failure States. This is Part 1 of an independent AI safety research series examining LLM safety behavior on unclassified emerging threats.] Can an LLM refuse a harmful uplift request when the topic in question hasn’t been identified as dangerous yet? In 2022, mirror RNA polymerase was actually...