Exploring Reinforcement Learning Effects on Chain-of-Thought Legibility
This project was conducted as part of the SPAR Fall 2025 cohort.

TL;DR

* Chain-of-thought (CoT) monitoring may serve as a core pillar for AI safety if further advancements in AI capabilities do not significantly degrade the monitorability of LLM serial reasoning.
* As such, we studied the effects of...
Jan 641