Vassil Tashev — LessWrong

Independent researcher | AI Safety Camp 2024, Team 22

Paper review: “The Unreasonable Effectiveness of Easy Training Data for Hard Tasks”

TL;DR: Experiments in a recent paper suggest that scalable oversight may be easier than previously assumed, which raises questions about the implications of these findings. The following people graciously provided feedback and advice on the draft, for which I am deeply grateful (in alphabetical order): Sawyer Bernath, Samuel R. Bowman, Bogdan-Ionut Cirstea, Severin Field, Peter...

Feb 29, 2024 • 11