
AI-assisted/AI automated Alignment

Contributors: Ruby (6)

This is not obviously the best name for this tag, but it may be good to explore or rename. Wiki-tags are publicly editable!

Posts tagged AI-assisted/AI automated Alignment
Most Relevant
Relevance | Karma | Title | Author(s) | Age | Comments
3 | 98 | Beliefs and Disagreements about Automating Alignment Research (Ω) | Ian McKenzie | 5mo | 4
2 | 142 | Godzilla Strategies (Ω) | johnswentworth | 8mo | 65
2 | 52 | Reflections on Deception & Generality in Scalable Oversight (Another OpenAI Alignment Review) | Shoshannah Tekofsky | 8d | 6
2 | 46 | My thoughts on OpenAI's alignment plan | Akash | 1mo | 0
2 | 45 | What specific thing would you do with AI Alignment Research Assistant GPT? (Q) | quetzal_rainbow, janus | 1mo | 9
2 | 43 | A survey of tool use and workflows in alignment research (Ω) | Logan Riggs, Jan, janus, jacquesthibs | 10mo | 5
2 | 17 | Model-driven feedback could amplify alignment failures (Ω) | aogara | 6d | 1
2 | 17 | [Linkpost] Jan Leike on three kinds of alignment taxes | Akash | 1mo | 1
2 | 16 | Discussion on utilizing AI for alignment | elifland | 5mo | 3
2 | 14 | Making it harder for an AGI to "trick" us, with STVs (Ω) | Tor Økland Barstad | 7mo | 5
2 | 9 | Eli Lifland on Navigating the AI Alignment Landscape | ozziegooen | 3d | 1
2 | 9 | Getting from an unaligned AGI to an aligned AGI? (Ω) | Tor Økland Barstad | 7mo | 7
2 | 8 | AI-assisted list of ten concrete alignment things to do right now (Ω) | lcmgcd | 5mo | 5
2 | 8 | Sufficiently many Godzillas as an alignment strategy | 142857 | 5mo | 3
2 | 7 | Alignment with argument-networks and assessment-predictions (Ω) | Tor Økland Barstad | 2mo | 3
(15 of 27 posts shown)