While the singularity doesn't have a reference class, benchmarks do: we have enough of them that we can fit reasonable distributions over when a benchmark will reach 50%, when it will be saturated, and so on, especially if we know the domain. The harder part is measuring superintelligence with benchmarks.
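A minimal sketch of what that fitting could look like, assuming a hypothetical list of months-from-release-to-saturation for past benchmarks in one domain (the numbers below are made up for illustration, not real data):

```python
import numpy as np

# Hypothetical months from release to saturation for past benchmarks in one
# domain; these values are invented for illustration, not taken from data.
months_to_saturation = np.array([7, 11, 14, 18, 24, 30, 41])

# Durations are positive and right-skewed, so a lognormal is a natural fit.
log_m = np.log(months_to_saturation)
mu, sigma = log_m.mean(), log_m.std(ddof=1)

# Predictive distribution for a new benchmark in the same reference class.
rng = np.random.default_rng(0)
samples = rng.lognormal(mu, sigma, size=100_000)
for q in (10, 50, 90):
    print(f"{q}th percentile: {np.percentile(samples, q):.0f} months")
```

The same recipe works for any milestone (50%, saturation) as long as you have enough past benchmarks in the reference class to estimate the two parameters.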
Do games between top engines typically end within 40 moves? It might be that an optimal player's occasional wins against an almost-optimal player come from deliberately extending and complicating the game to create winning chances.
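The game-length question is checkable empirically. A sketch assuming you have a PGN file of top-engine games (the filename is a placeholder) and the python-chess library:

```python
import chess.pgn

# Count full moves per game in a PGN file of engine-vs-engine games.
# "engine_games.pgn" is a placeholder path; any such PGN download works.
lengths = []
with open("engine_games.pgn") as f:
    while (game := chess.pgn.read_game(f)) is not None:
        plies = sum(1 for _ in game.mainline_moves())
        lengths.append(plies // 2)  # two plies per full move

lengths.sort()
n = len(lengths)
print(f"games: {n}, median length: {lengths[n // 2]} moves")
print(f"share ending within 40 moves: {sum(l <= 40 for l in lengths) / n:.1%}")
```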
Does this meaningfully reduce the probability that you jump out of the way of a car or get screened for heart disease? The important thing isn't whether you have an emotional fear response, but how the avoidance behavior generalizes.
Much of my hope is that by the time we reach the level of superintelligence where we need to instill reflectively endorsed values to optimize towards in a very hands-off way, rather than just constitutions, behaviors, or goals, we'll have figured something else out. I'm not claiming the optimizer advantage alone is enough to be decisive in saving the world.
To the point about tighter feedback loops: I see the main benefit as coming in conjunction with adapting to new problems. Suppose we notice AIs taking some bad but non-world-ending action, like murdering people; then we can add a big dataset of situations in which AIs shouldn't murder people to the training data. If we were instead breeding animals, we would have to wait dozens of generations for mutations that reduce the murder rate to appear and reach fixation. Since those mutations affect behavior through brain architecture, they would have a higher chance of deleterious side effects. And if we were also selecting for intelligence, they would be competing against mutations that increase intelligence, producing a higher alignment tax. All this means breeding gives us far fewer chances to detect whether our proxies hold up (capabilities researchers have many of these advantages too, but the AGI would be able to automate capabilities training anyway).
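As a back-of-the-envelope illustration of the timescale gap (every number below is a made-up, order-of-magnitude guess):

```python
# Back-of-the-envelope comparison of feedback-loop length; none of these
# constants come from data, they are order-of-magnitude guesses.
generations_to_fixation = 50   # "dozens of generations" for a new variant
years_per_generation = 5       # e.g. a mid-sized mammal
breeding_years = generations_to_fixation * years_per_generation

weeks_per_training_cycle = 4   # collect targeted data, fine-tune, evaluate
training_years = weeks_per_training_cycle / 52

print(f"breeding: ~{breeding_years} years per behavioral fix")
print(f"training: ~{training_years:.2f} years per behavioral fix")
print(f"roughly {breeding_years / training_years:.0f}x more chances to "
      "check whether the proxies hold up")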
If we expect problems to get worse at some rate until an accumulation of unsolved alignment issues culminates in disempowerment, it seems to me there is a large band of rates at which AI training can stay ahead of the problems but evolution couldn't.
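To make the "band of rates" explicit, here's a toy formalization (my own notation, treating the rates as constant for simplicity):

```latex
% r: rate at which new alignment problems accumulate
% f_evo, f_AI: maximum rates at which evolution-style selection and
%              AI training, respectively, can fix problems
% We stay ahead iff the fix rate meets the accumulation rate, so the band
% where AI training succeeds but evolution-speed selection fails is:
\[
  f_{\mathrm{evo}} < r \le f_{\mathrm{AI}}
\]
```

Since the training fix rate is plausibly orders of magnitude larger than the evolutionary one, that band is wide.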
Noted. I'm somewhat surprised you believe in quantum immortality; is there a particular reason?
EJT's incomplete preferences proposal. But as far as I can make out from the comments, an agent with incomplete preferences needs a decision rule defined in addition to its utility function, and only some decision rules are compatible with shutdownability.
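To illustrate why the decision rule is a separate ingredient, here's a toy sketch (my construction, not EJT's actual proposal): with incomplete preferences, several options can be undominated at once, so the preference relation alone doesn't pin down behavior.

```python
# Toy illustration (mine, not EJT's construction): a strict partial order
# over options, where incomparability leaves the choice underdetermined.
options = {"A", "B", "C"}
prefs = {("A", "B")}  # "A strictly preferred to B"; C is incomparable to both

def undominated(x):
    """True if no option is strictly preferred to x."""
    return not any((y, x) in prefs for y in options)

# One possible decision rule: pick any undominated option.
admissible = {x for x in options if undominated(x)}
print(admissible)  # {'A', 'C'}: the preferences leave the choice open
```

Whether the resulting agent is shutdownable then depends on how that leftover freedom gets resolved, which is exactly the part the decision rule supplies.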
When I read it in school, the story frustrated me because I immediately wanted to create Omelas, seeing as it's a thousand times better than our society; so I didn't really get the point of the intended and/or common interpretations.
Gradient descent (when applied to train AIs) allows much more fine-grained optimization than evolution, for these reasons:
It's unclear to what degree these will solve inner alignment problems or make AI goals more robust than animal goals to distributional shift, but we're in much better shape than evolution was; a toy illustration of the fine-grained point is below.
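A cartoon comparison (my construction, with made-up numbers): gradient descent gets exact per-parameter credit assignment every step, while evolution-style search relies on undirected mutation plus selection.

```python
import numpy as np

rng = np.random.default_rng(0)
target = rng.normal(size=100)

def loss(w):
    return np.sum((w - target) ** 2)

# Gradient descent: exact per-parameter credit assignment each step.
w = np.zeros(100)
for _ in range(100):
    w -= 0.1 * 2 * (w - target)        # analytic gradient of the loss

# Evolution-style hill climbing: random mutation, keep if fitter.
v = np.zeros(100)
for _ in range(100):
    cand = v + rng.normal(scale=0.1, size=100)
    if loss(cand) < loss(v):
        v = cand

print(f"gradient descent loss:  {loss(w):.4f}")  # near zero
print(f"mutate-and-select loss: {loss(v):.4f}")  # far higher, same step count
```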
If you disagree with much of IABIED but are still worried about AI risk, maybe the question to ask is "will the radical flank effect on the mainstream AI safety movement be positive or negative?", which seems more useful than "do I on net agree or disagree?" or "will people taking this book at face value do useful or anti-useful things?" Here's what Wikipedia has to say on the sign of a radical flank effect:
It's difficult to tell without hindsight whether the radical flank of a movement will have positive or negative effects.[2] However, following are some factors that have been proposed as making positive effects more likely:
- Greater differentiation between moderates and radicals in the presence of a weak government.[2][13][14]: 411 As Charles Dobson puts it: "To secure their place, the new moderates have to denounce the actions of their extremist counterparts as irresponsible, immoral, and counterproductive. The most astute will quietly encourage 'responsible extremism' at the same time."[15]
- Existing momentum behind the cause. If change seems likely to happen anyway, then governments are more willing to accept moderate reforms in order to quell radicals.[2]
- Radicalism during the peak of activism, before concessions are won.[16] After the movement begins to decline, radical factions may damage the image of moderate organizations.[16]
- Low polarization. If there's high polarization with a strong opposing side, the opposing side can point to the radicals in order to hurt the moderates.[2]
Of course it's still useful to debate which factual points the book gets right, but judging the book's overall value requires modeling other parts of the world.
Donated the max to both. I can believe there's more marginal impact for Bores, but on an emotional level, Wiener's proximity, his YIMBY work, and his higher probability of winning make me very excited about him.