
danieldewey

Comments

[Link] Why I’m optimistic about OpenAI’s alignment approach
danieldewey · 3y

This caused me to find your Substack! Sorry I missed it earlier; looking forward to catching up.

Finding gliders in the game of life
danieldewey · 3y

FWIW, I found the Strawberry Appendix especially helpful for understanding how this approach to ELK could solve (some form of) outer alignment.

Other readers, consider looking at the appendix even if you don't feel like you fully understand the main body of the post!

What does it take to defend the world against out-of-control AGIs?
danieldewey · 3y

Nice post! I see where you're coming from here.

(ETA: I think what I'm saying here is basically "3.5.3 and 3.5.4 seem to me like they deserve more consideration, at least as backup plans -- I think they're less crazy than you make them sound." So I don't think you missed these strategies, just that maybe we disagree about how crazy they look.)

I haven't thought this through all the way yet, and don't necessarily endorse these strategies without more thought, but: 

It seems like there could be a category of strategies for players with "good" AGIs to prepare to salvage some long-term value when/if a war with "bad" AGIs does actually break out, since the Overton window will stop being relevant at that point. This prep might be doable without breaking what we normally think of as Overton windows*, and could salvage some fraction of the future light-cone, but it would come at the cost of not preventing a huge war/catastrophe, and could still lose a big fraction of the light-cone (depending on how "winnable" the war is from a given starting point).

For example, a team could create a bunker that is well-positioned to be defended; or get as much control of civilization's resources as Overton allows and prepare plans to mobilize and expand into a war footing if "bad" AGI emerges; or prepare to launch von Neumann probes. Within the bubble of resources the "good" AGI controls legitimately before the war starts, the AGI might be able to build up a proprietary or stealthy technological lead over the rest of the world, effectively stockpiling its own supply of energy to make up for the fact that it's not consuming the free energy that it doesn't legitimately own.

Mnemonically, this strategy is something like "In case of emergency, break Overton window" :) I don't think your post really addresses these kinds of strategies, but it's very possible that I missed it (in which case, my apologies).

*(We could argue that there's an Overton window that says "if there's a global catastrophe coming, it's unthinkable to just prepare to salvage some value, you must act to stop it!", which is why "prepare a bunker" is seen as nasty and antisocial. But that seems to be getting close to a situation where multiple Overton maxims conflict and no norm-following behavior is possible :) )

Takeaways from our robust injury classifier project [Redwood Research]
danieldewey · 3y

Thanks for the post -- I found it helpful! The "competent catastrophes" direction sounds particularly interesting.

Opinions on Interpretable Machine Learning and 70 Summaries of Recent Papers
danieldewey · 4y

This is extremely cool -- thank you, Peter and Owen! I haven't read most of it yet, let alone the papers, but I have high hopes that this will be a useful resource for me.

Against GDP as a metric for timelines and takeoff speeds
danieldewey · 5y

It didn't bug me ¯\_(ツ)_/¯

Against GDP as a metric for timelines and takeoff speeds
danieldewey · 5y

Thanks for the post! FWIW, I found this quote particularly useful:

"Well, on my reading of history, that means that all sorts of crazy things will be happening, analogous to the colonialist conquests and their accompanying reshaping of the world economy, before GWP growth noticeably accelerates!"

The fact that it showed up right before an eye-catching image probably helped :)

Debate update: Obfuscated arguments problem
danieldewey · 5y

This may be out-of-scope for the writeup, but I would love to get more detail on how this might be an important problem for IDA.

Debate update: Obfuscated arguments problem
danieldewey · 5y

Thanks for the writeup! This Google Doc (linked near "raised this general problem" above) appears to be private: https://docs.google.com/document/u/1/d/1vJhrol4t4OwDLK8R8jLjZb8pbUg85ELWlgjBqcoS6gs/edit

Verification and Transparency
danieldewey · 6y

This seems like a useful lens -- thanks for taking the time to post it!

Posts

My understanding of the alignment problem (Ω) · 43 karma · 4y · 3 comments
World-models containing self-models (Ω) · 0 karma · 9y · 6 comments
Request for comments: introductory research guide (Ω) · 3 karma · 10y · 0 comments
Request for proposals for Musk/FLI grants · 35 karma · 10y · 11 comments
The Future of Humanity Institute could make use of your money · 78 karma · 11y · 25 comments
Polymath-style attack on the Parliamentary Model for moral uncertainty · 36 karma · 11y · 74 comments