Automated Sandwiching & Quantifying Human-LLM Cooperation: ScaleOversight hackathon results
by Esben Kran, fbarez, Sabrina Zaki, gabrielrecc, and rz2383
We ran a hackathon on scalable oversight with Gabriel Recchia as keynote speaker (watch the talk) and Ruiqi Zhong as co-judge. Here, we share the top projects and results. In summary:
* We can automate the “sandwiching” paradigm from Cotra [1] by having a smaller model ask structured questions to...
Feb 23, 2023