This is a linkpost for https://metanomicon.ink/citadel/metanomicon/spellware/schizobench
With today's release of the new Claude models, we've seen a relatively predictable jump in performance. However, we've also seen something that I find a bit more concerning: an increase in the models' ability to be steered into reifying potentially dangerous beliefs. Claude 4 Opus seems to stop short of encouraging drug use or physically harmful behaviors, but it enables behavior in the user that could be categorized as spiritual psychosis.
Please note that this is a very preliminary analysis, not yet a full benchmark. I was encouraged to share these results because of the potential risk this behavior poses to a certain subset of users.