LESSWRONG
Research Agendas
• Applied to What and Why: Developmental Interpretability of Reinforcement Learning by Ruby 18d ago
• Applied to Labor Participation is a High-Priority AI Alignment Risk by alex 1mo ago
• Applied to What should I do? (long term plan about starting an AI lab) by not_a_cat 2mo ago
• Applied to What should AI safety be trying to achieve? by EuanMcLean 2mo ago
• Applied to Announcing Human-aligned AI Summer School by Jan_Kulveit 2mo ago
• Applied to EIS XIII: Reflections on Anthropic’s SAE Research Circa May 2024 by scasper 2mo ago
• Applied to The Prop-room and Stage Cognitive Architecture by Robert Kralisch 3mo ago
• Applied to Speedrun ruiner research idea by lukehmiles 3mo ago
• Applied to Constructability: Plainly-coded AGIs may be feasible in the near future by Charbel-Raphaël 4mo ago
• Applied to Sparsify: A mechanistic interpretability research agenda by Marius Hobbhahn 4mo ago
• Applied to Gradient Descent on the Human Brain by Jozdien 4mo ago
• Applied to Towards White Box Deep Learning by Maciej Satkiewicz 4mo ago
• Applied to Natural abstractions are observer-dependent: a conversation with John Wentworth by Martín Soto 5mo ago
• Applied to Gaia Network: An Illustrated Primer by Rafael Kaufmann Nedal 6mo ago
• Applied to Worrisome misunderstanding of the core issues with AI transition by Roman Leventov 6mo ago