An underrated answer is that humans are extremely dependent on other people to survive. We have easily the longest vulnerable childhood of any mammal, and even as adults we are still very bad at surviving on our own compared to other animals. Since we are K-selected, every dead child matters a lot in evolutionary terms, so it's very difficult for sociopathy to be selected for.
(Crossposted from EAF)
Nice write-up on the issue.
One thing I will say is that I'm probably unusually optimistic about power concentration compared to a lot of EAs/LWers. My main divergence is that I basically treat this counter-argument as decisive: it makes me think the power-concentration risk doesn't go through, even in scenarios where humanity is about as careless as possible.
This is due to evidence on human utility functions showing that most people's returns on exclusively personal goods diminish fast enough that, at stellar or galactic scales, altruism matters far more than their selfish desires, combined with my belief that quite a few risks, like suffering risks, are very cheap to solve via moral trade on questions where most humans are apathetic.
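As a toy illustration of how fast "fast enough" has to be (logarithmic utility is my own simplifying assumption, not something the evidence pins down exactly): with $u(c)=\log c$ over exclusively personal consumption,

$$u(10^{11} c_0) - u(c_0) = 11 \ln 10 \approx 25,$$

so a hundred-billion-fold increase in private resources, roughly Earth-scale to galaxy-scale, buys only about 25 extra nats of selfish utility, while the good achievable through other-regarding spending grows roughly linearly with the resources available. On stellar or galactic scales the selfish term is essentially saturated.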
More generally, I've become mostly convinced that a crucial positive consideration for any post-AGI/ASI future is that it's very easy to prevent most of the worst outcomes in those futures under a broad array of values, even if moral objectivism/moral realism is false and there isn't much convergence on values among the broad population.
Edit: I edited in a link.
> Whether to pursue an aggressive (stock-heavy) or conservative (bond-heavy) investment strategy. If there is an AI bubble pop, it will likely bring the entire economy into a recession.
This is my biggest disagreement at the moment. The reason is that, unlike 2008 or 2020, there's no supply squeeze and no financial consequences severe enough that banks start to fail, so I expect an AI bubble to look more like the 2000 dot-com bust than the 2008 or 2020 crises.
That said, AI stocks would fall hard and GPUs would become way, way cheaper.
Thanks, I'll edit the post to note I misinterpreted the paper.
Correct on that.
> But it might nevertheless automate most jobs within a decade or so, and then continue churning along, automating new jobs as they come up.
I think this is less likely than I did a year ago, and a lot of that update is informed by Steve Newman's blog post arguing that a project is not a bundle of tasks.
My median expectation is a 50%-success time horizon of 1-3 months and an 80%-success horizon of about 1 week by 2030. Under this view, that's not enough to automate away managers, and depending on how much benchmarks diverge from reality, it may not even be enough to automate away most regular workers. My biggest probable divergence is that I don't expect super-exponential progress to arrive soon enough to bend these curves upward, because I put much less weight than you do on super-exponential progress (a break from the current trend) within 5 years.
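For what it's worth, here's the back-of-envelope extrapolation behind those numbers. The constants (a ~1-hour 50%-success horizon in early 2025 and a ~7-month doubling time, roughly the METR trend) are my assumptions, so treat this as a sanity check rather than a forecast:

```python
def horizon_hours(years_from_2025: float,
                  start_hours: float = 1.0,      # assumed ~1-hour horizon in early 2025
                  doubling_months: float = 7.0   # assumed ~7-month doubling time
                  ) -> float:
    """Projected 50%-success task horizon, in hours of human-expert work."""
    doublings = (years_from_2025 * 12) / doubling_months
    return start_hours * 2 ** doublings

h_2030 = horizon_hours(5.0)   # early 2025 -> early 2030
work_month = 167              # ~hours in a month of full-time work
print(f"~{h_2030:.0f} hours, i.e. ~{h_2030 / work_month:.1f} work-months")
# -> roughly 380 hours, about 2.3 work-months: inside the 1-3 month range above,
#    assuming the trend stays exponential rather than going super-exponential.
```

Note that the output is quite sensitive to the assumed doubling time, which is one reason to hold the 2030 numbers loosely.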
Here's the link for "a project is not a bundle of tasks."
I have nothing to say on the rest of your comment.
To be completely honest, basically no one should vote for this in the review; it was just a short reaction post that doesn't have enduring value.
I've increasingly come to think that being able to steelman positions, especially positions you don't hold, is an extremely important skill for effective truth-finding, especially in the modern era, and that steelmanning is a normal part of finding the truth effectively rather than an exceptional trait.
Not doing this is a lot of the reason why political discussions tend to end up so badly.
This is why I give this post a +4.
That said, there are 2 important caveats that limit the applicability of this principle.
My guess for why LW has been less focused on core rationality content is, in broad strokes, that AI has grown in importance, and more generally that one of the lessons rationalists have learned is that object-level practice in a skill (usually) has much weaker diminishing returns than meta-level thinking (which is yet another example of continual learning mattering a lot for human success).
I would analogize this to a human with anterograde amnesia, who cannot form new memories, and who is constantly writing notes to keep track of their life. The limitations here are obvious, and these are limitations future Claudes will probably share unless LLM memory/continual learning is solved in a better way.
This is an extremely underrated comparison, TBH. Indeed, I'd argue that frozen weights plus the lack of long-term memory are easily among the biggest reasons why LLMs are much more impressive than they are useful at a lot of tasks (with reliability being another big, independent issue).
It emphasizes 2 things that are true at once: LLMs do in fact reason like humans and can have (poor-quality) world-models, and there's no fundamental chasm between LLM and human capabilities that couldn't be closed with unlimited resources/time; and yet, just as humans with anterograde amnesia are usually much less employable/useful to others than people with long-term memory, current AIs are much, much less employable/useful than AIs from a future paradigm will be.
Links to long comments that I want to pin but are too long to be pinned:
https://www.lesswrong.com/posts/Zzar6BWML555xSt6Z/?commentId=aDuYa3DL48TTLPsdJ
https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/?commentId=Gcigdmuje4EacwirD
https://www.lesswrong.com/posts/DCQ8GfzCqoBzgziew/?commentId=RhTNmgZqjJpzGGAaL