denisemester — LessWrong

Arguing for the Truth? An Inference-Only Study into AI Debate

💡 TL;DR: Can AI debate be a reliable tool for truth-seeking? In this inference-only experiment (no fine-tuning), I tested whether Claude 3.5 Sonnet and Gemini 1.5 Pro could engage in structured debates over factual questions from the BoolQ and MMLU datasets, with GPT-3.5 Turbo acting as an impartial judge. The findings...

Feb 11, 2025