Research Agendas
• Applied to "How not to write the Cookbook of Doom?" by brunoparga 5h ago
• Applied to "Does anyone's full-time job include reading and understanding all the most-promising formal AI alignment work?" by NicholasKross 16h ago
• Applied to "Abstraction is Bigger than Natural Abstraction" by NicholasKross 17d ago
• Applied to "My current research questions for «membranes/boundaries»" by Chipmonk 19d ago
• Applied to "My AI Alignment Research Agenda and Threat Model, right now (May 2023)" by Ruby 20d ago
• Applied to "Inference from a Mathematical Description of an Existing Alignment Research: a proposal for an outer alignment research program" by Christopher King 21d ago
• Applied to "[Linkpost] Interpretability Dreams" by DanielFilan 23d ago
• Applied to "(Slightly) Scalable RLHF Alternatives: A Productive Path for Slow Takeoff Worlds?" by marc/er 1mo ago
• Applied to "Notes on the importance and implementation of safety-first cognitive architectures for AI" by Brendon_Wong 1mo ago
• Applied to "Roadmap for a collaborative prototype of an Open Agency Architecture" by Deger Turan 1mo ago
• Applied to "H-JEPA might be technically alignable in a modified form" by Roman Leventov 1mo ago
• Applied to "Annotated reply to Bengio's 'AI Scientists: Safe and Useful AI?'" by Roman Leventov 1mo ago
• Applied to "Orthogonal's Formal-Goal Alignment theory of change" by Tamsin Leake 1mo ago
• Applied to "Archetypal Transfer Learning and a Corrigibility-Friendly Optimization Technique" by marc/er 1mo ago
• Applied to "Research agenda: Supervising AIs improving AIs" by Quintin Pope 2mo ago
• Applied to "For alignment, we should simultaneously use multiple theories of cognition and value" by Roman Leventov 2mo ago
• Applied to "Davidad's Bold Plan for Alignment: An In-Depth Explanation" by Charbel-Raphaël 2mo ago