ML Alignment Theory Scholars Program Winter 2021

Dec 08, 2021 by evhub

In the past six weeks, the Stanford Existential Risks Initiative (SERI) has been running a work trial for the “ML Alignment Theory Scholars” (MATS) program. Our goal is to increase the number of people working on alignment theory, and to that end, we’re running a scholars program that provides mentorship, funding, and community to promising new alignment theorists. The program is run in partnership with Evan Hubinger, who has been mentoring each of the scholars throughout their work trial.

As the final phase of the work trial, each participant has taken a previous research artifact (usually an Alignment Forum post) and written a distillation and expansion of it. The posts were picked by Evan, and each participant signed up for one they were interested in. Over the next two weeks (12/7 - 12/17), we’ll be posting all of these posts to LessWrong and the Alignment Forum as part of a sequence, with a couple of posts going up each day. (There will be around 10-15 posts total.)

Posts in the sequence:

- ML Alignment Theory Program under Evan Hubinger (ozhang, evhub, Victor W)
- Theoretical Neuroscience For Alignment Theory (Cameron Berg)
- Introduction to inaccessible information (Ryan Kidd)
- Understanding Gradient Hacking (peterbarnett)
- Understanding and controlling auto-induced distributional shift (L Rudolf L)
- The Natural Abstraction Hypothesis: Implications and Evidence (CallumMcDougall)
- Should we rely on the speed prior for safety? (Marc Carauleanu)
- Motivations, Natural Selection, and Curriculum Engineering (Oliver Sourbut)
- Universality and the “Filter” (maggiehayes)
- Evidence Sets: Towards Inductive-Biases based Analysis of Prosaic AGI (bayesian_kitten)
- Disentangling Perspectives On Strategy-Stealing in AI Safety (shawnghu)
- Don't Influence the Influencers! (lhc)