AI oversight and its limitations

By AI oversight, we mean things like evaluation, monitoring, staged deployment, and interpretability. This sequence is a loose collection of posts aimed at understanding how well AI oversight works when applied to AI that reasons strategically, and at mapping out where oversight fails, so that we know where we need to use other tools instead.

If This Were a Test, How Much Would It Cost?

VojtaKovarik, Tomáš Gavenčiak
