
2022 MIRI Alignment Discussion

Jun 15, 2022 by Rob Bensinger

A collection of MIRI write-ups and conversations about alignment released in 2022, following the Late 2021 MIRI Conversations.

Six Dimensions of Operational Adequacy in AGI Projects, by Eliezer Yudkowsky
AGI Ruin: A List of Lethalities, by Eliezer Yudkowsky
A central AI alignment problem: capabilities generalization, and the sharp left turn, by So8res
On how various plans miss the hard bits of the alignment challenge, by So8res
The inordinately slow spread of good AGI conversations in ML, by Rob Bensinger
A note about differential technological development, by So8res
Brainstorm of things that could force an AI team to burn their lead, by So8res
AGI ruin scenarios are likely (and disjunctive), by So8res
Where I currently disagree with Ryan Greenblatt’s version of the ELK approach, by So8res
Why all the fuss about recursive self-improvement?, by So8res
Humans aren't fitness maximizers, by So8res
Warning Shots Probably Wouldn't Change The Picture Much, by So8res
What does it mean for an AGI to be 'safe'?, by So8res
Don't leave your fingerprints on the future, by So8res
Niceness is unnatural, by So8res
Contra shard theory, in the context of the diamond maximizer problem, by So8res
Notes on "Can you control the past", by So8res
Decision theory does not imply that we get to have nice things, by So8res
Superintelligent AI is necessary for an amazing future, but far from sufficient, by So8res
How could we know that an AGI system will have good consequences?, by So8res
Distinguishing test from training, by So8res
A challenge for AGI organizations, and a challenge for readers, by Rob Bensinger and Eliezer Yudkowsky
Thoughts on AGI organizations and capabilities work, by Rob Bensinger and So8res