LESSWRONG

samarnesen
12110

Posts

Comments

GPT-3.5 judges can supervise GPT-4o debaters in capability asymmetric debates
samarnesen 11mo 50

This seems like really interesting work! Would you be able to share some example transcripts from these debates? Since RLHF'ed models often shy away from combativeness, I'm curious what form GPT-4's rebuttals take (especially on questions where the judge answers correctly after reading the debate but incorrectly otherwise).

9 NYU Debate Training Update: Methods, Baselines, Preliminary Results
1y
0