I think this is a great goal, and I’m looking forward to what you put together!
This may be a bit different than the sort of thing you’re asking about, but I’d love to see more development/thought around topics related to https://www.lesswrong.com/posts/XqmjdBKa4ZaXJtNmf/raising-the-sanity-waterline .
Rationality is certainly a skill, and something that better / more concise exposition on rationality itself can help people develop. But once you learn to think right, what are some of the most salient object-level ideas that come next? How do we better realize values in the real world, and make use of / propagate these better ways of thinking? Why is this so hard, and what are strategies to make it easier?
SSC/ACX is a great example of better exploring object-level ideas, and I’d love to see more of that type of work pulled back into the community.
What could a million perfectly-coordinated, tireless copies of a pretty smart, broadly skilled person running at 100x speed do in a couple years?
I think this feels like the right analogy to consider.
And in considering this thought experiment, I'm not sure trying to solve alignment is the only/best way to reduce risks. This hypothetical seems open to reducing risk by 1) better understanding how to detect these actors operating at large scale and 2) researching resilient plug-pulling strategies.
Moreover, even if these things don't work that way and we get a slow takeoff, that doesn't necessarily save humanity. It just means that it will take a little longer for AI to be the dominant form of intelligence on the planet. That still sets a deadline to adequately solve alignment.
If a slow takeoff is all that's possible, doesn't that open up other options for saving humanity besides solving alignment?
I imagine far more humans will agree p(doom) is high if they see AI isn't aligned and it's growing to be the dominant form of intelligence that holds power. In a slow takeoff, people should be able to realize this is happening, and effect non-alignment-based solutions (like bombing compute infrastructure).
a superintelligence will be at least several orders of magnitude more persuasive than character.ai or Stuart Armstrong.
Believing this seems central to believing high P(doom).
But I think it's not a coherent enough concept to justify believing it. Yes, some people are far more persuasive than others. But how can you extrapolate that far beyond the distribution we observe in humans? I do think AI will prove to be better than humans at this, and likely much better.
But "much" better isn't the same as "better enough to be effectively treated as magic".
This isn't where the community is supposed to have ended up. If rationality is systematized winning, then the community has failed to be rational.
Great post, and timely, for me personally. I found myself having similar thoughts recently, and this was a large part of why I recently decided to start engaging with the community more (so apologies for coming on strong in my first comment, while likely lacking good norms).
Some questions I'm trying to answer, and this post certainly helps a bit: