This summer, UChicago XLab is running two programs:
1. Summer Research Fellowship: Fellows pursue novel research directions in AI safety and nuclear security.
2. Second Look Fellowship: Fellows complete replications of load-bearing AI safety papers.
The deadline for the Second Look Fellowship has passed, but we will continue to review...
If we get AI safety research wrong, we may not get a second chance. But despite the stakes being so high, there has been no effort to systematically review and verify empirical AI safety papers. I would like to change that. Today I sent in funding applications to found a...
I spent a few hundred dollars on Anthropic API credits and let Claude individually research every current US congressperson's position on AI. This is a summary of my findings. Disclaimer: Summarizing people's beliefs is hard and inherently subjective and noisy. Likewise, US politicians change their opinions on things constantly so...
This work was supported by UChicago XLab. Today, we are announcing our first major release of the XLab AI Security Guide: a set of online resources and coding exercises covering canonical papers on jailbreaks, fine-tuning attacks, and proposed methods to defend AI systems from misuse. Each page on the course...
The barriers between us and what we want are often entirely imagined. It is true: you can learn how to paint, change careers, write a paper, or run a marathon. These things are hard, but we shouldn’t pretend that they are impossible. You can just do them. But this mindset...
From the UChicago XLab AI Security Team: Zephaniah Roe, Jack Sanderson, Julian Huang, Piyush Garodia
Correspondence to team@xlabaisecurity.com. For the best reading experience, we recommend viewing on our website.
Methodology note: This red-teaming sprint may include small metric inaccuracies (see Appendix).
Summary
We red-teamed gpt-oss-20b and found impressive robustness but...