LESSWRONG
Redwood Research
Posts tagged Redwood Research (sorted by Most Relevant)

Relevance 7 | Karma 189 | Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] (Ω) | LawrenceC, Adrià Garriga-alonso, Nicholas Goldowsky-Dill, ryan_greenblatt, jenny, Ansh Radhakrishnan, Buck, Nate Thomas | 6mo | 27 comments
Relevance 5 | Karma 134 | Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley (Ω) | maxnadeau, Xander Davies, Buck, Nate Thomas | 8mo | 14 comments
Relevance 3 | Karma 145 | Redwood Research’s current project (Ω) | Buck | 2y | 29 comments
Relevance 3 | Karma 137 | Takeaways from our robust injury classifier project [Redwood Research] (Ω) | dmz | 9mo | 10 comments
Relevance 3 | Karma 48 | Redwood's Technique-Focused Epistemic Strategy (Ω) | adamShimi | 2y | 1 comment
Relevance 3 | Karma 16 | AXRP Episode 17 - Training for Very High Reliability with Daniel Ziegler (Ω) | DanielFilan | 10mo | 0 comments
Relevance 2 | Karma 142 | High-stakes alignment via adversarial training [Redwood Research report] (Ω) | dmz, LawrenceC, Nate Thomas | 1y | 29 comments
Relevance 2 | Karma 113 | Why I'm excited about Redwood Research's current project (Ω) | paulfchristiano | 2y | 6 comments
Relevance 2 | Karma 35 | Some common confusion about induction heads | Alexandre Variengien | 3mo | 4 comments
Relevance 2 | Karma 13 | [Linkpost] Critiques of Redwood Research | Akash | 3mo | 2 comments
Relevance 1 | Karma 96 | Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small (Ω) | KevinRoWang, Alexandre Variengien, Arthur Conmy, Buck, jsteinhardt | 8mo | 7 comments
Relevance 1 | Karma 82 | Practical Pitfalls of Causal Scrubbing (Ω) | Jérémy Scheurer, Phil3, tony, jacquesthibs, David Lindner | 3mo | 17 comments
Relevance 1 | Karma 56 | We're Redwood Research, we do applied alignment research, AMA (Ω) | Nate Thomas | 2y | 2 comments
Relevance 1 | Karma 50 | Help out Redwood Research’s interpretability team by finding heuristics implemented by GPT-2 small | Haoxing Du, Buck | 8mo | 11 comments
Relevance 1 | Karma 44 | Redwood Research is hiring for several roles | Jack R, billzito | 2y | 0 comments