Factored Cognition Strengthens Monitoring and Thwarts Attacks
This blog post has been superseded by our paper Factor(T,U): Factored Cognition Strengthens Monitoring of Untrusted AI. We leave all content here unchanged for reference only. Note: This post is essentially a ~90% draft of our upcoming research paper. We are sharing it now to get feedback on our empirical...
Jun 18, 2025