Crossposted from the EA Forum
If you want to communicate AI risks in a way that increases concern, our new study says you should probably use vivid stories, ideally with identifiable victims.
Tabi Ward led this project as her honours thesis. In the study, we wrote short stories about different AI risks like facial recognition bias, deepfakes, harmful chatbots, and design of chemical weapons. For each risk, we created two versions: one focusing on an individual victim, the other describing the scope of the problem with statistics.
We had 1,794 participants from the US, UK and Australia read one of the stories, measuring their concern about AI risks before and after. Reading any of the stories increased concern.... (read 533 more words →)
Has anyone considered (or already completed) testing models to see how well they can answer questions about this filtered data?
Consider an eval that tests how well models' awareness of the alignment literature. "Models know everything about misaligned AI, including benefits and strategies for scheming" seems like a useful thing for us to know? Or is it a foregone conclusion that they'd saturate this benchmark anyway?