NYU Debate Training Update: Methods, Baselines, Preliminary Results
[This writeup reflects work done jointly with David Rein and Julian Michael at NYU's Alignment Research Group]
Introduction
In the past year, there have been a number of projects aimed at validating the basic premises behind debate as a mechanism for scalable oversight (see here, here, and here). One important...
Jul 6, 2024