samarnesen — LessWrong


NYU Debate Training Update: Methods, Baselines, Preliminary Results

[This writeup reflects work done jointly with David Rein and Julian Michael at NYU's Alignment Research Group]

Introduction

In the past year, there have been a number of projects aimed at validating the basic premises behind debate as a mechanism for scalable oversight (see here, here, and here). One important...

Jul 6, 2024•9