I feel like you can summarize most of this post in one paragraph:
It is not the case that an observation of things happening in the past automatically translates into a high probability of them continuing to happen. Solomonoff Induction actually operates over possible programs that generate our observation set (and in extension, the observable universe), and it may or not may not be the case that the simplest universe is such that any given trend persists into the future. There are no also easy rules that tell you when this happens; you just have to do the hard work of comparing world models.
I'm not sure the post says sufficiently many other things to justify its length.
If you already have the concept, you only need a pointer. If you don't have the concept, you need the whole construction. [1]
Related: Sazen and Wisdom Cannot Be Unzipped
I sometimes like things being said in a long way. Mostly that's just because it helps me stew on the ideas and look at them from different angles. But also, specifically, I liked the engagement with a bunch of epistemological intuitions and figuring out what can be recovered from them. I like in particular connecting the "trend continues" trend to the redoubtable "electron will weigh the same tomorrow" intuition.
(I realise you didn't claim there was nothing else in the dialogue, just not enough to justify the length)
I strongly emphasize with "I sometimes like things being said in a long way.", and am in general doubtful of comments like "I think this post can be summarized as [one paragraph]".
(The extreme caricature of this is "isn't your post just [one sentence description that strips off all nuance and rounds the post to the closest nearby cliche, completely missing the point, perhaps also mocking the author about complicating such a simple matter]", which I have encountered sometimes.)
Some of the most valuable blog posts I have read have been exactly of the form "write a long essay about a common-wisdom-ish thing, but really drill down on the details and look at the thing from multiple perspectives".
Some years back I read Scott Alexander's I Can Tolerate Anything Except The Outgroup. For context, I'm not from the US. I was very excited about the post and upon reading it hastily tried to explain it to my friends. I said something like "your outgroups are not who you think they are, in the US partisan biases are stronger than racial biases". The response I got?
"Yeah I mean the US partisan biases are really extreme.", in a tone implying that surely nothing like that affects us in [country I li...
I'll add that sometimes, there is a big difference between verbally agreeing with a short summary, even if it is accurate, and really understanding and appreciating it and its implications. That often requires long explanations with many examples and looking at the same issue from various angles. The two Scott Alexander posts you mentioned are a good example.
For even more brevity with no loss of substance:
A turkey gets fed every day, right up until it's slaughtered before Thanksgiving.
Nobody would understand that.
This sort of saying-things-directly doesn't usually work unless the other person feels the social obligation to parse what you're saying to the extent they can't run away from it.
Yeah, but I do actually think this paragraph is wrong on the existence of easy rules. It is a bit like saying: There are only the laws of fundamental physics, don't bother with trying to find high level laws, you just have to do the hard work of learning to apply fundamental physics when you are trying to understand a pendulum or a hot gas. Or biology.
Similarly, for induction there are actually easy rules applicable to certain domains of interest. Like Laplace's rule of succession, which assumes random i.i.d. sampling. Which implies the sample distribution tends to resemble the population distribution. The same assumption is made by supervised learning about the training distribution, which works very well in many cases. There are other examples like the Lindy effect (mentioned in another comment) and various popular models in statistics. Induction heads also come to mind.
Even if there is just one, complex, fully general method applicable to science or induction, there may still exist "easy" specialized methods, with applicability restricted to a certain domain.
I find this essay interesting as a case study in discourse and argumentation norms. Particularly as a case study of issues with discourse around AI risk.
When I first skimmed this essay when it came out, I thought it was ok, but mostly uninteresting or obvious. Then, on reading the comments and looking back at the body, I thought it did some pretty bad strawmanning.
I reread the essay yesterday and now I feel quite differently. Parts (i), (ii), and (iv) which don't directly talk about AI are actually great and many of the more subtle points are pretty well executed. The connection to AI risk in part (iii) is quite bad and notably degrades the essay as a whole. I think a well-executed connection to AI risk would have been good. Part (iii) seems likely to contribute to AI risk being problematically politicized and negatively polarized (e.g. low quality dunks and animosity). Further, I think this is characteristic of problems I have with the current AI risk discourse.
In parts (i), (ii), and (iv), it is mostly clear that the Spokesperson is an exaggerated straw person who doesn't correspond to any particular side of an issue. This seems like a reasonable rhetorical move to better explain...
Another way to put this is that posts should often discuss their limitations, particular when debunking bad arguments that are similar to more reasonable arguments.
I think discussing limitations clearly is a reasonable norm for scientific papers that reduces the extent to which people intentionally or unintentionally get away with implying their results prove more than they do.
What are the close-by arguments that are actually reasonable? Here is a list of close-by arguments (not necessarily endorsed by me!):
This part resonates with me; my experience in philosophy of science + talking to people unfamiliar with philosophy of science also led me to the same conclusion:
"You talk it out on the object level," said the Epistemologist. "You debate out how the world probably is. And you don't let anybody come forth with a claim that Epistemology means the conversation instantly ends in their favor."
"Wait, so your whole lesson is simply 'Shut up about epistemology'?" said the Scientist.
"If only it were that easy!" said the Epistemologist. "Most people don't even know when they're talking about epistemology, see? That's why we need Epistemologists -- to notice when somebody has started trying to invoke epistemology, and tell them to shut up and get back to the object level."
The main benefit of learning about philosophy is to protect you from bad philosophy. And there's a ton of bad philosophy done in the name of Empiricism, philosophy masquerading as science.
This scans as less "here's a helpful parable for thinking more clearly" and more "here's who to sneer at" -- namely, at AI optimists. Or "hopesters", as Eliezer recently called them, which I think is a play on "huckster" (and which accords with this essay analogizing optimists to Ponzi scheme scammers).
I am saddened (but unsurprised) to see few others decrying the obvious strawmen:
what if [the optimists] cried 'Unfalsifiable!' when we couldn't predict whether a phase shift would occur within the next two years exactly?
...
"But now imagine if -- like this Spokesperson here -- the AI-allowers cried 'Empiricism!', to try to convince you to do the blindly naive extrapolation from the raw data of 'Has it destroyed the world yet?' or 'Has it threatened humans? no not that time with Bing Sydney we're not counting that threat as credible'."
Thinly-veiled insults:
Nobody could possibly be foolish enough to reason from the apparently good behavior of AI models too dumb to fool us or scheme, to AI models smart enough to kill everyone; it wouldn't fly even as a parable, and would just be confusing as a metaphor.
and insinuations of bad faith:
...What if, when you tried to reason about why the mo
I don't think this essay is commenting on AI optimists in-general. It is commenting on some specific arguments that I have seen around, but I don't really see how it relates to the recent stuff that Quintin, Nora or you have been writing (and I would be reasonably surprised if Eliezer intended it to apply to that).
You can also leave it up to the reader to decide whether and when the analogy discussed here applies or not. I could spend a few hours digging up people engaging in reasoning really very closely to what is discussed in this article, though by default I am not going to.
Ideally Yudkowsky would have linked to the arguments he is commenting on. This would demonstrate that he is responding to real, prominent, serious arguments, and that he is not distorting those arguments. It would also have saved me some time.
But now imagine if -- like this Spokesperson here -- the AI-allowers cried 'Empiricism!', to try to convince you to do the blindly naive extrapolation from the raw data of 'Has it destroyed the world yet?'
The first hit I got searching for "AI risk empiricism" was Ignore the Doomers: Why AI marks a resurgence of empiricism. The second hit was AI Doom and David Hume: A Defence of Empiricism in AI Safety, which linked Anthropic's Core Views on AI Safety. These are hardly analogous to the Spokesman's claims of 100% risk-free returns.
Next I sampled several Don't Worry about the Vase AI newsletters and "some people are not so worried". I didn't really see any cases of blindly naive extrapolation from the raw data of 'Has AI destroyed the world yet?'. I found Alex Tabarrok saying "I want to see that the AI baby is dangerous before we strangle it in the crib.". I found Jacob Buckman saying "I'm Not Worried About An AI Apocalypse". These things are...
saddened (but unsurprised) to see few others decrying the obvious strawmen
In general, the "market" for criticism just doesn't seem very efficient at all! You might have hoped that people would mostly agree about what constitutes a flaw, critics would compete to find flaws in order to win status, and authors would learn not to write posts with flaws in them (in order to not lose status to the critics competing to point out flaws).
I wonder which part of the criticism market is failing: is it more that people don't agree about what constitutes a flaw, or that authors don't have enough of an incentive to care, or something else? We seem to end up with a lot of critics who specialize in detecting a specific kind of flaw ("needs examples" guy, "reward is not the optimization target" guy, "categories aren't arbitrary" guy, &c.), with very limited reaction from authors or imitation by other potential critics.
Apparently Eliezer decided to not take the time to read e.g. Quintin Pope's actual critiques, but he does have time to write a long chain of strawmen and smears-by-analogy.
A lot of Quintin Pope's critiques are just obviously wrong and lots of commenters were offering to help correct them. In such a case, it seems legitimate to me for a busy person to request that Quintin sorts out the problems together with the commenters before spending time on it. Even from the perspective of correcting and informing Eliezer, people can more effectively be corrected and informed if their attention is guided to the right place, with junk/distractions removed.
(Note: I mainly say this because I think the main point of the message you and Quintin are raising does not stand up to scrutiny, and so I mainly think the value the message can provide is in certain technical corrections that you don't emphasize as much, even if strictly speaking they are part of your message. If I thought the main point of your message stood up to scrutiny, I'd also think it would be Eliezer's job to realize it despite the inconvenience.)
I stand by pretty much everything I wrote in Objections, with the partial exception of the stuff about strawberry alignment, which I should probably rewrite at some point.
Also, Yudkowsky explained exactly how he'd prefer someone to engage with his position "To grapple with the intellectual content of my ideas, consider picking one item from "A List of Lethalities" and engaging with that.", which I pointed out I'd previously done in a post that literally quotes exactly one point from LoL and explains why it's wrong. I've gotten no response from him on that post, so it seems clear that Yudkowsky isn't running an optimal 'good discourse promoting' engagement policy.
I don't hold that against him, though. I personally hate arguing with people on this site.
Unless I'm greatly misremembering, you did pick out what you said was your strongest item from Lethalities, separately from this, and I responded to it. You'd just straightforwardly misunderstood my argument in that case, so it wasn't a long response, but I responded. Asking for a second try is one thing, but I don't think it's cool to act like you never picked out any one item or I never responded to it.
EDIT: I'm misremembering, it was Quintin's strongest point about the Bankless podcast. https://www.lesswrong.com/posts/wAczufCpMdaamF9fy/my-objections-to-we-re-all-gonna-die-with-eliezer-yudkowsky?commentId=cr54ivfjndn6dxraD
The basic issue though is that evolution doesn't have a purpose or goal
FWIW, I don't think this is the main issue with the evolution analogy. The main issue is that evolution faced a series of basically insurmountable, yet evolution-specific, challenges in successfully generalizing human 'value alignment' to the modern environment, such as the fact that optimization over the genome can only influence within lifetime value formation theough insanely unstable Rube Goldberg-esque mechanisms that rely on steps like "successfully zero-shot directing an organism's online learning processes through novel environments via reward shaping", or the fact that accumulated lifetime value learning is mostly reset with each successive generation without massive fixed corpuses of human text / RLHF supervisors to act as an anchor against value drift, or evolution having a massive optimization power overhang in the inner loop of its optimization process.
These issues fully explain away the 'misalignment' humans have with IGF and other intergenerational value instability. If we imagine a deep learning optimization process with an equivalent structure to evolution, then we could easily predi...
If Quintin hasn't yelled "Empiricism!" then it's not about him. This is more about (some) e/accs.
"Well, since it's too late there," said the Scientist, "would you maybe agree with me that 'eternal returns' is a prediction derived by looking at observations in a simple way, and then doing some pretty simple reasoning on it; and that's, like, cool? Even if that coolness is not the single overwhelming decisive factor in what to believe?"
"Depends exactly what you mean by 'cool'," said the Epistemologist.
"Okay, let me give it a shot," said the Scientist. "Suppose you model me as having a bunch of subagents who make trades on some kind of internal prediction market. The whole time I've been watching Ponzi Pyramid Incorporated, I've had a very simple and dumb internal trader who has been making a bunch of money betting that they will keep going up by 20%. Of course, my mind contains a whole range of other traders too, so this one isn't able to swing the market by itself, but what I mean by 'cool' is that this trader does have a bunch of money now! (More than others do, because in my internal prediction markets, simpler traders start off with more money.)"
"The problem," said the Epistemologist, "is that you're in an adversarial context, where the observations you're seeing have ...
As a direct result of reading this, I have changed my mind on an important, but private, decision.
The basic premise of this post is wrong, based on the strawman that an empiricist/scientist would only look at a single piece of information. You have the empiricist and scientists just looking at the returns on investment on bankmans scheme, and extrapolating blindly from there.
But an actual empiricist looks at all the empirical evidence. They can look the average rate of return of a typical investment, noting that this one is unusually high.They can learn how the economy works and figure out if there are any plausible mechanisms for this kind of economic returns. They can look up economic history, and note that Ponzi schemes are a thing that exists and happen reasonably often. From all the empirical evidence, the conclusion "this is a Ponzi scheme" is not particularly hard to arrive at.
Your "scientist" and "empricist" characters are neither scientists nor empiricists: they are blathering morons.
As for AI risk, you've successfully knocked down the very basic argument that AI must be safe because it hasn't destroyed us yet. But that is not the core of any skeptics argument that I know.
Instead, an actual empiricist skeptic might look at the actual empirical e...
Or:
“In the past, people who have offered such apparently-very-lucrative deals have usually been scammers, cheaters, and liars. And, in general, we have on many occasions observed people lying, scamming, cheating, etc. On the other hand, we have only very rarely seen such an apparently-very-lucrative deal turn out to actually be a good idea. Therefore, on the general principle that the future will be similar to the past, we predict a very high chance that Bernie is a cheating, lying scammer, and that this so-called ‘investment opportunity’ is fake.”
We thus defeat the Spokesperson’s argument on his own terms, without needing to get into abstractions or theory—and we do it in one paragraph.
This happens to also be precisely the correct approach to take in real life when faced with apparently-very-lucrative deals and investment opportunities (unless you have the time to carefully investigate, in great detail and with considerable diligence, all such deals that are offered to you).
Ah, but there is some non-empirical cognitive work done here that is really relevant, namely the choice of what equivalence class to put Bernie Bankman into when trying to forecast. In the dialogue, the empiricists use the equivalence class of Bankman in the past, while you propose using the equivalence class of all people that have offered apparently-very-lucrative deals.
And this choice is in general non-trivial, and requires abstractions and/or theory. (And the dismissal of this choice as trivial is my biggest gripe with folk-frequentism—what counts as a sample, and what doesn't?)
Only praise yourself as taking 'the outside view' if (1) there's only one defensible choice of reference class;
I think this point is underrated. The word "the" in "the outside view" is sometimes doing too much work, and it is often better to appeal to an outside view, or multiple outside views.
+4. In some regards it's sad to have to rehash this argument, but I feel that this argument has been going around in the public discourse, and so it's worthwhile to write up a thorough account of what's naive about it and how to move past it. My sense is that it has become less prevalent since the essay; perhaps the essay helped.
Many folks have distaste for Eliezer's style, or for perhaps implying that a weak-man argument is fully representative of positions he disagrees with; I think some of these criticisms are valid but do not mean the essay isn't (a) p...
Thank you for the concrete example of Unfalsifiable Stories of Doom from Barnett et al in November 2025 I think there are several important differences between the two arguments. To avoid taking up too much of our time, I'm going to dwell on one in particular.
The Spokesperson in Empiricism! is dismissive of the entire concept of predicting the future using "words words words and thinking". Barnett et al are not. I think this is clearest in their engagement with IABIED's claim that AIs steer in alien directions that only mostly coincide with helpfulness. Here's the claim:
Modern AIs are pretty helpful (or at least not harmful) to most users, most of the time. But as we noted above, a critical question is how to distinguish an AI that deeply wants to be helpful and do the right thing, from an AI with weirder and more complex drives that happen to line up with helpfulness under typical conditions, but which would prefer other conditions and outcomes even more. ... This long list of cases look just like what the “alien drives” theory predicts, in sharp contrast with the “it’s easy to make AIs nice” theory that labs are eager to put forward.
Th...
I suspect there is some merit to the Scientist's intuition (and the idea that constant returns are more "empirical") which nobody has managed to explain well. I'll try to explain it here.[1]
The Epistemologist's notion of simplicity is about short programs with unbounded runtime which perfectly explain all evidence. The [non-straw] empiricist notion of simplicity is about short programs with heavily-bounded runtime which approximately explain a subset of the evidence. The Epistemologist is right that there is nothing of value in the empiri...
You've got to think about what might be going on behind the scenes, in both cases.
But a tricky bit with AI is that it involves innovating fundamentally new ways of doing things. The methods we already have are not sufficient to create ASI, and also if you extrapolate out the SOTA methods at larger scale, it's genuinely not that dangerous. Rather with AI, we imagine that people will make up new things behind the scenes which is radically different from what we have so far, or that what we have so far will turn out to be much more powerful due to being radically different from how we understand it today.
I agree that this should be said, but there is also actual disagreement about which theory is better.
Getting reliable 20% returns every year is really quite amazingly hard.
Foundations for analogous arguments about future AI systems are not sufficiently understood - I mean, maybe we can get very capable system that optimise softly like current systems.
...And then the AI companies, if they’re allowed to keep selling those—we have now observed—just brute-RLHF their models into not talking about that. Which means we can’t get any trustworthy observations of
There does actually seem to be a simple and general rule of extrapolation that can be used when no other data is available: If a trend has so far held for some timespan t, it will continue to hold, in expectation, for another timespan t, and then break down.
In other words, if we ask ourselves how long an observed trend will continue to hold, it does seem, absent further data, a good indifference assumption to think that we are currently in the middle of the trend; that we have so far seen half of it.
Of course it is possible that we are currently near the b...
Fun read! I was surprised that the spokesperson kept up with the conversation as well as he did 🙂
Of course there's an Art of when to trust more in less complicated reasoning -- an Art of when to pay attention to data more narrowly in a domain and less to inferences from generalizations on data from wider domains
I would like to try and expand on what that Art is. The Spokesperson is offering up an inductive argument. For an inductive argument to be any good I believe it requires something like the following.
Which concept they might obtain by reading my book on Highly Advanced Epistemology 101 For Beginners, or maybe just my essay on Local Validity as a Key to Sanity and Civilization, I guess?"
Perhaps, there should be two links here?
The LessWrong Review runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2025. The top fifty or so posts are featured prominently on the site throughout the year.
Hopefully, the review is better than karma at judging enduring value. If we have accurate prediction markets on the review results, maybe we can have better incentives on LessWrong today. Will this post make the top fifty?