This work was supported by UChicago XLab.
Today, we are announcing the first major release of the XLab AI Security Guide: a set of online resources and coding exercises covering canonical papers on jailbreaks, fine-tuning attacks, and proposed defenses against the misuse of AI systems.
Each page of the course contains a readable, blog-style overview of a paper, often paired with a notebook that guides users through a small replication of the paper's core insight. Researchers and students can use the guide as a structured course to learn AI security step by step, or as a reference, focusing on the sections relevant to their research. When completed in order, the sections build on one another and grow progressively more advanced.