Research Agendas
• Applied to Research agenda: Formalizing abstractions of computations by Erik Jenner at 5d
• Applied to Interpretability’s Alignment-Solving Potential: Analysis of 7 Scenarios by Evan R. Murphy at 5d
• Applied to Selection Theorems: A Program For Understanding Agents by DragonGod at 8d
• Applied to The AI Control Problem in a wider intellectual context by philosophybear at 25d
• Applied to World-Model Interpretability Is All We Need by Thane Ruthenis at 1mo
• Applied to Trying to isolate objectives: approaches toward high-level interpretability by Jozdien at 1mo
• Applied to Research ideas (AI Interpretability & Neurosciences) for a 2-months project by flux at 1mo
• Applied to Announcing: The Independent AI Safety Registry by Shoshannah Tekofsky at 1mo
• Applied to An overview of some promising work by junior alignment researchers by Akash at 1mo
• Applied to Towards Hodge-podge Alignment by Cleo Nardo at 2mo
• Applied to My AGI safety research—2022 review, ’23 plans by Steven Byrnes at 2mo
• Applied to Theories of impact for Science of Deep Learning by Marius Hobbhahn at 2mo
• Applied to My summary of “Pragmatic AI Safety” by Eleni Angelou at 3mo
• Applied to All life's helpers' beliefs by Tehdastehdas at 3mo
• Applied to Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley by Rudi C at 3mo
• Applied to AI researchers announce NeuroAI agenda by Cameron Berg at 3mo
• Applied to Distilled Representations Research Agenda by Ruby at 4mo