LESSWRONG

jacoba — LessWrong

61

11mo

White Box Control at UK AISI - Update on Sandbagging Investigations

by Joseph Bloom, Jordan Taylor, Connor Kissane, Sid Black, merizian, alexdzm, jacoba, Ben Millwood, and Alan Cooney

Introduction (Joseph Bloom, Alan Cooney): This is a research update from the White Box Control team at UK AISI. In this update, we share preliminary results on the topic of sandbagging that may be of interest to researchers working in the field. The format of this post was inspired by...

Jul 10, 2025 • 80