theashwinner — LessWrong

Basic Legibility Protocols Improve Trusted Monitoring

by SebastianP and theashwinner

This blog post accompanies our paper, Basic Legibility Protocols Improve Trusted Monitoring. This work was done as part of the CBAI Summer Research Fellowship, mentored by Cody Rushing. In this post, we’ll go over the main results and focus more on the engineering and methodological takeaways than...