The weaker version of pure reward optimization in humans is basically the obesity issue. The biggest reason humans became more obese at a population level in the 20th and 21st centuries is that we figured out how to Goodhart human reward models in the domain of food: sugary, fatty, high-calorie foods (with the high-calorie part being most important for obesity) are very, very highly rewarding to at least part of our brains.
Essentially everything has gotten more palatable and rewarding to eat than pre-20th century food.
And as you say, part of the issue is that drugs are very crippling to human capabilities, whereas optimized reward functions for AIs will have much less of this problem; if anything, optimizing the reward function likely makes AIs more capable.
An underrated answer is that humans are very, very dependent on other people to survive. We have easily the longest vulnerable childhood of any mammal, and even as adults we are still really bad at surviving on our own compared to other animals. Since we are K-selected, every dead child matters a lot evolutionarily, so it's very difficult for sociopathy to be selected for.
(Crossposted from EAF)
Nice write-up on the issue.
One thing I will say is that I'm maybe unusually optimistic about power concentration compared to a lot of EAs/LWers. The main divergence is that I treat this counter-argument as decisive enough that I don't think the power-concentration risk goes through, even in scenarios where humanity is about as careless as possible.
This is due to evidence on human utility functions showing that most people's returns on exclusive, personally consumed goods diminish fast enough that altruism matters far more than selfish desires at stellar or galactic scales, combined with my belief that quite a few risks, like suffering risks, are very cheap to solve via moral trade in domains where most humans are apathetic.
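As a toy illustration of the diminishing-returns point, here is a minimal sketch; log utility and the galaxy-scale multiplier are just illustrative stand-ins I picked, not a claim about anyone's actual utility function or the real star count.

```python
import math

# Toy illustration (assumed numbers, not from any study): compare the
# personal utility gain from scaling up exclusive consumption under log
# utility, a standard stand-in for "diminishing returns".

baseline = 1.0        # resources of one comfortable planetary-scale life (stand-in unit)
galaxy_scale = 1e11   # rough multiplier for galaxy-scale resources (assumption)

def log_utility(x):
    # Log utility: each doubling of resources adds the same fixed amount of utility.
    return math.log(x + 1)

personal_gain = log_utility(baseline * galaxy_scale) - log_utility(baseline)
print(f"Extra personal utility from a galaxy of exclusive resources: {personal_gain:.1f} nats")
# ~25 nats: an enormous growth in resources buys only a tiny growth in selfish
# utility, so almost all the value at that scale has to come from what you
# altruistically care about rather than personal consumption.
```

The exact functional form doesn't matter much; anything with sufficiently fast diminishing returns gives the same qualitative picture.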
More generally, I've become mostly convinced that a crucial positive consideration for any post-AGI/ASI future is that it's really, really easy to prevent most of the worst outcomes in those futures under a broad array of values, even if moral objectivism/moral realism is false and there isn't much convergence on values among the broad population.
Edit: I edited in a link.
Whether to pursue an aggressive (stock-heavy) or conservative (bond-heavy) investment strategy. If there is an AI bubble pop, it will likely bring the entire economy into a recession.
This is my biggest disagreement at the moment. The reason is that, unlike in 2008 or 2020, there's no supply squeeze and no financial consequences severe enough that banks start to fail, so I expect an AI bubble pop to look more like the 2000 dot-com bust than the 2008 or 2020 crises.
That said, AI stocks would fall hard and GPUs would become way, way cheaper.
Thanks, I'll edit the post to note I misinterpreted the paper.
Correct on that.
But it might nevertheless automate most jobs within a decade or so, and then continue churning along, automating new jobs as they come up.
I now think this is less likely than I did a year ago, and a lot of that update is informed by Steve Newman's blog post on how a project is not a bundle of tasks.
My median expectation is a 50%-reliability task horizon of 1-3 months and an 80%-reliability horizon of about 1 week by 2030. Under this view that is not enough to automate away managers and, depending on how much benchmarks diverge from reality, may not even be enough to automate away most regular workers. My biggest probable divergence is that I don't expect superexponential progress to arrive soon enough to bend these curves upward, because I put much less weight than you do on a trend break toward superexponential progress within 5 years.
Here's the link for "A project is not a bundle of tasks."
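For transparency, here is the rough back-of-the-envelope extrapolation behind those 2030 numbers; the starting horizon and doubling time are illustrative assumptions of mine, not official METR figures or anything from your comment.

```python
# Back-of-the-envelope extrapolation behind the 2030 numbers above.
# Assumed parameters (mine, for illustration only): a ~2-hour 50%-reliability
# task horizon in mid-2025 and a ~7-month doubling time, extrapolated as a
# plain exponential, i.e. with no superexponential speed-up.

WORK_HOURS_PER_MONTH = 167           # roughly one month of full-time human work

def horizon_hours(months_from_now, start_hours=2.0, doubling_months=7.0):
    """50%-reliability task-completion horizon after a given number of months."""
    return start_hours * 2 ** (months_from_now / doubling_months)

months_to_2030 = 54                  # mid-2025 to end of 2030, roughly
h = horizon_hours(months_to_2030)
print(f"Extrapolated 50% horizon in 2030: {h:.0f} work-hours "
      f"(~{h / WORK_HOURS_PER_MONTH:.1f} work-months)")
# ~2.5 work-months, i.e. inside the 1-3 month range; bending this curve upward
# before 2030 is exactly what a superexponential trend break would have to do.
```

Obviously this is only a curve-fit sketch, and the whole point of the "project is not a bundle of tasks" argument is that hitting these horizons on benchmarks may still understate what full job automation requires.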
I have nothing to say on the rest of your comment.
To be completely honest, this should not be voted for by basically anyone in the review; it was just a short reaction post that doesn't have enduring value.
I've increasingly come to think that being able to steelman positions, especially positions you don't hold, is an extremely important skill for effective truth-finding, especially in the modern era, and that steelmanning is a normal part of finding the truth effectively rather than an exceptional trait.
Not doing this is a lot of the reason why political discussions tend to end up so badly.
This is why I give this post a +4.
That said, there are 2 important caveats that limit the applicability of this principle.
My explanation for why LW has become less focused on core rationality content is, in broad strokes, that AI has grown more important, and more generally that one of the lessons rationalists have learned is that object-level practice in a skill usually has much smaller diminishing returns than meta-level thinking (which is yet another example of continual learning mattering a lot for human success).
Links to long comments that I want to pin but are too long to be pinned:
https://www.lesswrong.com/posts/Zzar6BWML555xSt6Z/?commentId=aDuYa3DL48TTLPsdJ
https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/?commentId=Gcigdmuje4EacwirD
https://www.lesswrong.com/posts/DCQ8GfzCqoBzgziew/?commentId=RhTNmgZqjJpzGGAaL