Is anyone doing a real-world test of agentic misalignment?
This seems like a no-brainer from two viewpoints: 1. capabilities, since it would actually test a model's ability and propensity to use physical affordances; and 2. optics or advocacy, since it would surely capture more attention than previous results. I'm envisaging a test environment with a couple of rooms, where...