Definition: Living Homo sapiens will exist on the surface of the planet Earth exactly 100 years from the publication of this post. This is similar to the Caplan-Yudkowsky bet, except that I extend the duration from 13 years to 100 years and remove the reference to the cosmic endowment.
Confidence: >80%. I'd like to put it higher, but there are a lot of unknowns and I'm being conservative.
I'd be happy to bet money at the Caplan-Yudkowsky bet's 2-1 odds on a 13-year horizon. However, I don't think their clever solution actually works. The only way Yudkowsky benefits from his loan is if he spends all of his resources before 2030, but if he does that then Caplan won't get paid back, even if Caplan wins the bet. Moreover, even Yudkowsky doesn't seem to treat this as a serious bet.
While there's no good way to bet that the world will end soon (except indirectly via financial derivatives), there is a way to bet that our basic financial infrastructure will continue to exist for decades to come: I am investing in a Roth IRA.
Reasoning
Biowarfare won't kill everyone. Nuclear war won't kill everyone. Anthropogenic global warming won't kill everyone. At worst, these will destroy civilization, which, counterintuitively, makes Homo sapiens more likely to survive in the short term (a century). The same goes for minor natural disasters like volcanic eruptions.
Natural disasters like giant meteors, or perhaps a gamma-ray burst, are unlikely. The last time something like that happened was 66 million years ago. The odds of something similar happening in the next century are on the order of one in a million. That's small enough to ignore for the purposes of this bet. The only way everyone dies is via AI.
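As a rough back-of-the-envelope sketch of that estimate (assuming such events arrive at a roughly constant rate, so one event per 66 million years is a fair base rate):

$$
P(\text{extinction-level impact in the next century}) \approx \frac{100 \text{ years}}{66{,}000{,}000 \text{ years}} \approx 1.5 \times 10^{-6}
$$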
I see two realistic roads to superintelligent world optimizers.
- A human-simulator mesa-optimizer running on a non-agentic superintelligence.
- An inhuman world-optimizing agent.
Human simulators are unlikely to exterminate humanity by accident because the mesa-optimizer is (more or less) human-aligned and the underlying superintelligence (currently LLMs) is not a world optimizer. The underlying superintelligence won't kill everyone, intentionally or unintentionally, because it's not a world optimizer. The emulated human probably won't kill everyone either, because that would require a massive power differential combined with malevolent intent. The other way humanity gets exterminated in this scenario is as a side effect of a futuretech war, but since neither Yudkowsky nor Caplan considers that possibility likely, I don't feel I need to explain why it's unlikely.
Inhuman world-optimizing agents are unlikely to turn the Universe into paperclips because that's not the most likely failure mode. A world-optimizing agent must align its world model with reality. Poorly-aligned world-optimizing agents instrumentally converge, not on seizing control of reality, but on the much easier task of seizing competing pieces of their own mental infrastructure. A misaligned world optimizer that seeks to minimize conflict between its sensory data and internal world model will just turn off its sensors.
These ideas are not very fleshed out. I'm posting not to explain my logic, but to publicly register a prediction. If you disagree (or agree), then please register a counter-prediction in the comments.
There are several instances in the comments of the following schema: Someone finds lsusr's (self-admittedly not very fleshed out) arguments unsatisfactory and explains why; lsusr responds with just "Would you like to register a counter-prediction?".
I do not think this is a productive mode of discussion, and would like to see less of it. I think the fault is lsusr's.
These responses give me the very strong impression -- and I would bet fairly heavily that I am not alone in this -- that lsusr considers those comments inappropriate, and is trying to convey something along the lines of "put up or shut up; if you disagree with me then you should be able to make a concrete prediction that differs from mine; otherwise, no one should care about your arguments".
I see three possibilities.
First, maybe lsusr considers that (especially given the last paragraph in the OP) any discussion of the arguments at all is inappropriate. In that case, I think either (1) the arguments should simply not be in the OP at all or else (2) there should be an explicit statement along the lines of "I am not willing to discuss my reasoning and will ignore comments attempting to do so".
Second, maybe lsusr is happy to discuss the arguments, but only with people who demonstrate their seriousness by making a concrete prediction that differs from lsusr's. I think this would be a mistake, because it is perfectly possible to find something wrong in an argument without having a definite opinion on the proposition the argument is purporting to support. If someone says "2+2=5, therefore the COVID-19 pandemic was the result of a lab leak", one should not have to have formed a definite opinion on the origin of the pandemic in order to notice that 2+2 isn't 5 or that (aside from ex falso quodlibet) there is no particular connection between basic arithmetic and the origins of the pandemic.
(Also: some of those arguments may be of independent interest. lsusr's claim that poorly-aligned world-optimizing agents converge on "seizing competing pieces of their own mental infrastructure" sounds rather interesting if true -- and, for what it's worth, implausible to me prima facie -- regardless of how long the human race is going to survive.)
Third, maybe the impression I have is wrong, and lsusr doesn't in fact disapprove of, or dislike, comments addressing the arguments without a concrete disagreement with the conclusions. In that case, I think lsusr is giving a misleading impression by not engaging with any such comments in any way other than by posting the same "Would you like to register a counter-prediction?" response to all of them.
I think the most virtuous solution to your hypothetical is to say "I don't know anything about existential risk, but I'd bet at 75% confidence that a mathematician will prove that 2+2≠5" (or something along those lines).