Scalable Oversight
Posts tagged Scalable Oversight (Most Relevant)
2
145
Gradient Routing: Masking Gradients to Localize Computation in Neural Networks
Ω
cloud, Jacob G-W, Evzen, Joseph Miller, TurnTrout
4d
Ω
6
2
107
Discriminating Behaviorally Identical Classifiers: a model problem for applying interpretability to scalable oversight
Ω
Sam Marks
8mo
Ω
10
2
85
Scalable oversight as a quantitative rather than qualitative problem
Ω
Buck
5mo
Ω
11
2
31
Inference-Only Debate Experiments Using Math Problems
Ω
Arjun Panickssery, Abhimanyu Pallavi Sudhir, JacksonKaunismaa
4mo
Ω
0
2
21
AXRP Episode 35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization
Ω
DanielFilan
4mo
Ω
0
1
49
On scalable oversight with weak LLMs judging strong LLMs
Ω
zac_kenton, Noah Siegel, janos, Jonah Brown-Cohen, Samuel Albanie, David Lindner, Rohin Shah
5mo
Ω
18
1
27
NYU Code Debates Update/Postmortem
David Rein
7mo
4
1
5
Reinforcement Learning from Information Bazaar Feedback, and other uses of information markets
Abhimanyu Pallavi Sudhir
3mo
1
1
1
Automated monitoring systems
hiki_t
12d
0