I'm being dramatic by calling it a dying world, but everyone's worried about the future: climate change, running out of fossil fuels, social problems, the death of everyone currently alive, etc. Likely not an actual extinction of humanity, but hard times for sure. AI going well could be the easy way out of all that, if it's actually as big as we think it might be. I think the accelerationists would not be as keen if the world were otherwise stable.
Another way of saying it is that our current reckless trajectory only makes sense if you view it as a gambit to stave off the rest of the things that are coming for us (which are mostly the fruits of our past and current recklessness). I'm sympathetic to the thought, even if it might also kill us.
Re: doomers and ethicists agreeing: The position that the authors of The AI Con take is that the doomers feed the mythos of AI as godlike by treating it as a doomsday machine. This almost-reverence fuels the ambitions and excitement of the accelerationists, while also reducing enthusiasm for tackling the more mundane challenges.
Yudkowsky still wants resources to go towards solving alignment, and if AI is a dud, that wouldn't be necessary. I view the potential animosity between ethicists and doomers as primarily a fight over attention/funding. Ethicists see themselves as being choked out by people working to solve what they regard as fictional problems, and that creates resentment and dismissal. And doomers do often think focusing on the mundane harms is a waste of time. Ideally the perspectives would be coherent/compatible, and finding that bridge, or at least holding space for both, is the aim of this post.
It’s not about comparing a process to the universal ontology; it’s about comparing it to one’s internal model of the universal ontology, which we then hope is good enough. In the ethics dataset, that could look like a reductio ad absurdum on certain model processes, e.g.: “You have a lot of fancy reasoning here for why you should kill an unspecified man on the street, but it must be wrong because it reaches the wrong conclusion.”
(Ethics is a bit of a weird example because the choices aren’t based around trying to infer missing information, as is paradigmatic of the personal/universal tension, but the dynamic is similar.)
Predicting the future 10,000 years hence has much less potential for this sort of reductio, of course. So I see your point. It seems like in such cases, humans can only provide feedback via comparison to our own learned forecasting strategies. But even this bears a similar structure.
We can view the real environment that we learned our forecasting strategies from as the “toy model” that we hope will generalize well enough to the 10,000-year prediction problem. Then the judgement we provide on the AI’s processes is a stand-in for actually running those processes in the toy model. Instead of seeing how well the AI’s methods do by simulating them in the toy model, we compare its methods to our own, which evolved because they succeeded in that environment.
Seeing things this way allows us to identify two distinct points of failure in the humans-judging-processes setup:
1. The forecasting environment humans learned in may not bear enough similarity to the 10,000-year forecasting problem.
2. Human judgement is just a lossy signal for actual performance in the environment they learned in; AI methods that would perform well in the humans’ environment may still get rated poorly by humans, and vice versa.
So it seems to me that the general model of the post can handle these cases decently well, but the concepts are definitely a bit slippery, and this is the area I feel most uncertain about.
Hey, thanks for the comment. Part of what I like about this framework is that it provides an account of how we do that process of “somehow judging things as true”: namely, we develop personal concepts that correspond with universal concepts via the various forces that change our minds over time.
We can’t access the universal ontology ourselves, but reasoning about it allows us to state things precisely: it provides a theoretical standard for whether a process aimed at determining truth succeeds or not.
Do you have an example of a domain where ground truth is unavailable, but humans can still make judgements about which processes are good to use? I’d claim that most such cases involve a thought experiment, i.e. a model of how the world works that implies a certain truth-finding method will be successful.
You're definitely challenging a key piece of my perspective here, and I've thought a good bit about how to respond. What I've come up with is this: I think all of us are involved. The labs don't exist in a vacuum, and the opinion of the public does have an impact. So I think looking at scopes of agency larger than the individual is a helpful thing to do.
In this piece I'm describing the choice that is being made on behalf of humanity, through the lens of humanity, because it really does affect all of us. But that's also why I take a hands-off kind of approach here: it's not necessarily my role to say or know what I think humanity should be doing. I'm just an ignorant grain of sand.