Plan 'Straya: A Comprehensive Alignment Strategy Version 0.3 — DRAFT — Not For Distribution Outside The Pub Epistemic status: High confidence, low evidence. Consistent with community norms. Executive Summary Existing alignment proposals suffer from a shared flaw: they assume you can solve the control problem before the catastrophe. Plan 'Straya...
Cross-posted from the EA Forum. Link: "The theoretical computational limit of the Solar System is 1.47x10^49 bits per second" (EA Forum, effectivealtruism.org). Part 1. The limit is based on a computer operating at the Landauer Limit, at the temperature of the cosmic microwave background, powered by a Dyson sphere...
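The headline figure can be roughly reproduced from the ingredients the excerpt names. This is a minimal sketch, not the post's own calculation: it assumes the Dyson sphere captures the Sun's entire output (taken here as the nominal solar luminosity, 3.828e26 W), that the cosmic microwave background temperature is 2.725 K, and that "bits per second" means bit erasures per second at the Landauer cost of kT ln 2 per bit. The function names are illustrative.

```python
import math

BOLTZMANN_K = 1.380649e-23    # J/K, exact value in the SI
T_CMB = 2.725                 # K, assumed CMB temperature
SOLAR_LUMINOSITY = 3.828e26   # W, assumed: nominal solar luminosity


def landauer_energy_per_bit(temperature_k: float) -> float:
    """Minimum energy in joules to erase one bit at the given temperature."""
    return BOLTZMANN_K * temperature_k * math.log(2)


def max_bit_rate(power_w: float, temperature_k: float) -> float:
    """Upper bound on bit erasures per second for a given power budget."""
    return power_w / landauer_energy_per_bit(temperature_k)


# Assumption: all of the Sun's output is spent on computation at T_CMB.
rate = max_bit_rate(SOLAR_LUMINOSITY, T_CMB)
print(f"{rate:.2e} bits per second")
```

Under these assumptions the result comes out at about 1.47x10^49 bits per second, matching the figure in the post title; different choices for luminosity or temperature would shift it slightly.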
Aligned AGI is a large-scale engineering task. Humans have never completed a large-scale engineering task without at least one mistake. An AGI that has at least one mistake in its alignment model will be unaligned. Given enough time, an unaligned AGI will perform an action that will negatively...
Technical alignment is hard. Technical alignment will take 5+ years. AI capabilities are currently subhuman in some areas (driving cars), about human-level in some areas (the bar exam), and superhuman in some areas (playing chess). Capabilities scale with compute. The doubling time for AI compute is ~6 months. In 5 years...
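The compute premise implies a concrete growth factor over the stated horizon. The sketch below works through the arithmetic, assuming the ~6 month doubling time stays constant for the full 5 years; that constancy is an assumption for illustration, and the function name is hypothetical.

```python
def compute_growth_factor(years: float, doubling_time_months: float = 6.0) -> float:
    """Multiplicative growth in compute after `years`, given a fixed doubling time.

    Assumes the doubling time is constant over the whole period.
    """
    doublings = years * 12 / doubling_time_months
    return 2 ** doublings


# 5 years at one doubling every 6 months is 10 doublings.
growth = compute_growth_factor(5)
print(f"{growth:.0f}x")
```

Ten doublings give a factor of 2^10 = 1024, so under this assumption compute grows roughly a thousandfold over 5 years.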