(sorry, haven't read most of the stuff you link, but thought this would be positive EV to say anyway)
Rather, we need to argue that A has higher “expected value” broadly speaking, meaning: In some sense we “expect” that, if we were idealized agents who could aggregate all of A’s and B’s possible consequences into literal EVs, then we’d say A has higher EV.
I don't think we should accept this meaning of the EV of an action / this operationalization of what it is to do EV-thinking. it's like how when I calculate 2+2, I'm not calculating what I would conclude if I were to try to figure out what 2+2 is, I'm just calculating what 2+2 is! i think you're confusing truth and provability
like, when i make probability or expected value claims in general, from the outside one could look at me and say i'm just playing some game involving relating numbers to other numbers and betting attitudes and whatever, and maybe explain why i'm playing this game via eg jeffrey-bolker. but if from inside the system, i were to view it as legitimate to translate probability claims into some concrete claims about what sort of game i'm playing, then that would introduce all sorts of crazy stuff such as thinking that if i were to bet at 0.5 on P then that would make the probability of P 0.5. this is crazy — probabilities are supposed to be objective things from the inside, not things that can be changed by your attitudes.
(in particular, there is an important sense in which moral antirealism is false, just like truth being provability is false.)
see yudkowsky's metaethics sequence for a more detailed version of this argument. i also recommend the book gödel without tears in case you're not already familiar with the incompleteness phenomenon (including löb's thm)
(i guess one could describe this as me rejecting your P1, but it feels more like i think you are saying stuff in a confused frame)
setting aside the above objection to this genre of imo confused antirealism, the following example still seems extremely bad for the specific version of the view behind your premises, though maybe i'm misunderstanding the view:
(You can try to fix this by speaking of your expected value of your ideal guy's expected value of the options, not of what you expect the guy to decide, but at that point maybe it's clear that you're allowed to just be doing stuff with the expected value of the options directly?)
You can try to fix this by speaking of your expected value of your ideal guy's expected value of the options, not of what you expect the guy to decide
Ah, that's exactly what I meant — if we ourselves have precise, literal expected values about the ideal guy's expected values. But in P1 I don't want to assume we do. That's why I talk about scare-quote "expectations". I want to capture "whatever kind of aggregation across possible outcomes is EV-ish but is actually accessible to bounded agents". (This is vague, but as I say in the footnote, it's what EA consequentialists seem to actually appeal to in practice.)
And then, P1 says that in order for a c-preference to be justified, you need to "expect" that the literal EV you'd calculate if you were capable of doing so is positive. Does that clarify things?
Pretty interesting.
I think it'll be useful to compare to other cases of arbitrariness. E.g. what prior you start with is arbitrary, so all odds you give are scaled by some arbitrary factor. But this doesn't necessarily make them meaningless, or mean that you shouldn't make any decisions based on probabilistic considerations.
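To make the "arbitrary factor" concrete, here is a minimal sketch using the odds form of Bayes' rule (posterior odds = prior odds × likelihood ratio). The function name and all the numbers are made up for illustration; nothing here comes from the post.

```python
def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Odds form of Bayes' rule: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


# Two agents see the same evidence (hypothetical likelihood ratio of 8 for H)
# but start from different, arbitrarily chosen prior odds on H.
likelihood_ratio = 8.0
odds_a = posterior_odds(0.5, likelihood_ratio)  # prior odds 1:2
odds_b = posterior_odds(2.0, likelihood_ratio)  # prior odds 2:1

# The agents' posterior odds differ by exactly the factor their priors differed by,
# so the evidence shifts both of them in the same direction by the same multiple.
scale = odds_b / odds_a
```

The disagreement between the two agents is a constant factor that the evidence never touches, which is the sense in which the prior's arbitrariness is "contained".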
I'd break down the arguments about priors into ones based on properties, performance, and emotional recalibration.
Properties would be something like Savage's axioms - if you find them appealing, then you want to make decisions in a way compatible with probabilistic reasoning. So what if there's a remaining degree of freedom for the prior - Savage's theorem still holds, so even as you complain that you have no justification for one prior over another, you should still be acting as if you have one.
Performance is something like Dutch Book arguments, or, more powerfully, Solomonoff's arguments that a simplicity prior will make only a finite number of mistakes in an approximately computable universe. You build an abstract model of how a reasoning style will perform, and then you justify using that reasoning style by appealing to good modeled performance.
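As a toy illustration of the Dutch Book style of argument, here is a sketch assuming an agent who will buy a bet on any event at a price equal to their credence times the stake. The function name and the credences are hypothetical.

```python
def dutch_book_loss(credence_a: float, credence_not_a: float, stake: float = 1.0) -> float:
    """Net loss for an agent who buys one bet on A and one bet on not-A.

    Each bet pays `stake` if it wins and costs credence * stake
    (an assumed betting policy, for illustration only).
    """
    total_price = (credence_a + credence_not_a) * stake
    # Exactly one of A / not-A occurs, so the agent collects `stake` either way.
    payout = stake
    return total_price - payout


# Incoherent credences (they sum to 1.2): the agent loses about 0.2 whatever happens.
loss = dutch_book_loss(0.6, 0.6)

# Coherent credences (they sum to 1): this particular book extracts nothing.
coherent_loss = dutch_book_loss(0.6, 0.4)
```

This only covers credences that sum to more than 1. If they sum to less than 1, the function returns a negative number and the bookie would instead buy the bets from the agent.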
The emotional category is based on the idea that we weight arbitrariness "too heavily" in some sense, and that we need to change our emotional outlook on it. In the case of probabilities, it's important that the arbitrary component is confined to the prior - few people want to accept total arbitrariness. But when the arbitrariness is contained, it's more appealing to say something like "This arbitrariness is unavoidable, and that's okay. To worry that we're making a 'wrong choice' or that any choice here needs a further step of justification is to misunderstand what's going on. This is about expressing ourselves and doing our best, and it's genuinely okay to be arbitrary in this way."
If I were giving my own framing of the unawareness problem (I'm not a big fan of setting up P2 in terms of a black-and-white transition from "has an argument" to "doesn't have an argument"), I'd probably set it up in terms of the choices and simplifying assumptions we must use to arrive at a model of the far future that our limited minds can actually use to make decisions. How do our modeling choices have to be arbitrary or otherwise unjustified?
(I wrote this post partly to help orient those interested in participating in the EA Forum’s Cluelessness Critiques Competition. The competition closes August 14th.)
I’d like to elicit direct, productive critiques of the argument for cluelessness from my sequence on “unawareness”, which I’ll call the unawareness argument.
To that end, this post will:
Argument breakdown
Here’s a new framing of the unawareness argument (compared to how I present it in the sequence). I expect this framing to help readers disentangle different types of disagreements they might have, corresponding to three different premises of the argument.
Roughly: What would justify preferring action A over B on impartial altruistic grounds? We’d need to “expect” that according to our epistemically idealized self, A has better expected total consequences across the cosmos (normative premise). But if our understanding of these actions’ consequences is too coarse, then we can’t say how our idealized self would compare their expected values (conceptual premise). And our understanding of any given action’s cosmos-wide consequences is in fact that coarse (empirical premise). So there’s no impartial altruistic justification for preferring any action over another.
More precisely:
(See Appendix for how each section of the sequence maps onto this form of the argument.)
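For readers who find a toy model helpful, here is one way to picture how coarse beliefs could leave two actions incomparable. The sketch assumes that each action's "EV" is represented by a list of precise EVs, one per admissible probability distribution, and that A is preferred only if it comes out ahead under every one of them. That decision rule, the function name, and the numbers are assumptions for illustration; the argument itself doesn't rely on this particular formal model.

```python
from typing import Sequence


def compare(ev_a: Sequence[float], ev_b: Sequence[float]) -> str:
    """Compare actions A and B given their EVs under each admissible distribution.

    ev_a[i] and ev_b[i] are the EVs of A and B under the i-th distribution
    (a hypothetical representation of imprecise beliefs).
    """
    if len(ev_a) != len(ev_b):
        raise ValueError("ev_a and ev_b must cover the same distributions")
    diffs = [a - b for a, b in zip(ev_a, ev_b)]
    if all(d > 0 for d in diffs):
        return "A preferred"
    if all(d < 0 for d in diffs):
        return "B preferred"
    if any(d > 0 for d in diffs) and any(d < 0 for d in diffs):
        # The admissible distributions disagree about which action is better.
        return "incomparable"
    return "tie or weak preference"


# Narrow imprecision: every admissible distribution agrees that A beats B.
narrow = compare([10.0, 5.0], [1.0, 2.0])

# Severe imprecision: the sign of the EV difference flips across distributions.
severe = compare([10.0, -50.0], [1.0, 2.0])
```

In this toy picture, the empirical premise amounts to saying that for cosmos-wide consequences we are always in something like the second case.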
Why cluelessness matters
Here’s a natural reaction to this argument: “If we’re clueless, nothing we do matters anyway. So there’s no point looking into the argument, and we should just act as if we aren’t clueless.”
But I don’t think it’s that simple. I’ll explain why, then say what kinds of “looking into the argument” I expect to have the highest value of information.
First, the unawareness argument doesn’t imply that “nothing we do matters” all things considered. It only implies that impartial altruism, or any very far-reaching value system, isn’t action-guiding. Other values and moral norms still matter to us, for example, rules like avoiding dishonesty or virtues like compassion. These can be action-guiding even if we’re clueless about total consequences.
Second, if you think the argument goes wrong somewhere, it makes a difference where it goes wrong: [5]
Third, relatedly, unawareness probably has some implications for impartial altruists, even if we don’t think it makes us clueless. There’s older work on crucial considerations, but I don’t think such work has rigorously fleshed out the implications much, relative to the scale of the problem. As an example of such an implication: Suppose we want to make forecasts about post-ASI civilization, but we’re worried that forecasting methods that worked well in better-understood systems won’t generalize to this case. We could study various methods in domains with (e.g.) different frequencies of past crucial consideration discovery, and see which methods are relatively robust as the frequency of crucial considerations increases. [6]
Finally, if nothing else, it seems epistemically virtuous to be clear about the reasons for our decisions. Sure, perhaps there’s no behavioral difference between “I’m working on AI risk because I’ve really weighed up all the possible consequences, and it doesn’t seem arbitrary to say this work is impartially good ‘in expectation’”, and “I have no clue if my idealized self would favor working on AI risk, but I’m doing it because no one has offered something better”. But I think if we’re honest with ourselves that our reasoning is the latter, we’ll have more open minds if and when “something better” comes along.
Critiques: What I expect to be productive, and what’s been said so far
So, if the unawareness argument is worth engaging with, what should we focus on when scrutinizing it? Going forward, I expect newer or sharper critiques of the normative premise, and part (b) of the conceptual premise, to be most productive. This is because:
As far as I’m aware, almost all existing critiques of the unawareness argument are addressed by the sequence, or by other references in this resource guide — see the following table. (The exception, included in the table as well, is the critique that incomparability violates decision-theoretic deference principles.)
(Edit June 27, 2026:) Several objections in the table are classified as objections to P1, because they’re of the form: “We should always have a c-preference (rather than say the options are incomparable w.r.t. c-preferability) because of [some reason other than a comparison of the ‘expected’ consequences].”
Acknowledgments
Thanks to Toby Tremlett, Clare Harris, and Konrad Kozaczek for comments.
Appendix: Sequence summary annotated with the corresponding premises
Here’s a copy of the unawareness sequence summary from post #1, where each section is tagged with the premises of the argument supported by that section.
1. The challenge of unawareness for impartial altruist action guidance: Introduction:
2. Why intuitive comparisons of large-scale impact are unjustified:
3. Why impartial altruists should suspend judgment under unawareness:
4. Why existing approaches to cause prioritization are not robust to unawareness:
Cf. “c-betterness” from Greaves (2016). ↩︎
In other words, we need to appeal to something like EV — our beliefs about the (perhaps imprecise) EV our idealized self would compute — to give an impartial altruistic justification for some choice. In his post on “Ideal Reflection”, Clifton discusses a similar idea: “‘The’ expected value is the expected value that would be assigned by an agent that has the same evidence as us, but is a vastly more powerful reasoner. Something like a perfect Bayesian who can reason over a ~maximally granular and exhaustive set of hypotheses, and has seen everything we’ve seen.” Clifton also notes, and I agree, that it’s not clear what exactly it means for us to have “expectations”-in-scare-quotes about our idealized self’s EVs; see his footnote 1. But as far as I can tell, this vague notion of “expectation” captures the kind of aggregation of possible consequences that impartial altruists aspire to, and that EAs typically appeal to. ↩︎
In the unawareness sequence, this claim largely maps onto the claim that we should represent actions’ “EV” imprecisely, to some degree. (And then, the remaining question is whether the degree of imprecision is so severe that all actions are incomparable.) But the argument doesn’t rely on the particular formal model of imprecise probabilities, or sets of expected values. ↩︎
“Empirical” in the sense that P3 is largely about contingent facts of our actual epistemic situation. But it isn’t purely empirical, since whether you accept P3 depends on your views on (e.g.) what these facts imply about the degree of imprecision of actions’ “EVs”. ↩︎
There’s an analogy to the philosophy literature on skepticism, and Agrippa’s trilemma: Even if you think the global skeptic is obviously wrong, it matters a lot whether your alternative to global skepticism is foundationalism, coherentism, or infinitism. ↩︎
See also Violet Hour’s call for forecasting generalizability research. ↩︎
Quote: “But from another perspective, every decision in life involves a “bet” of sorts on which action to take. The best available action may involve keeping your options open, delaying decisions, and gathering more information. But even those choices are still “part of the bet”. At the end of the day, you still have to choose an action. Humans can’t generate precise credences. … But when it comes time to act, we still have to cash out our uncertainty.” ↩︎
Quote: “Rejecting premise 1, completeness is essentially a nonstarter in the context of morality, where the whole project is premised on figuring out which worlds, actions, beliefs, rules, etc., are better than or equivalent to others. You can deny this your heart of hearts - I won’t say that you literally cannot believe that two things are fundamentally incomparable - but I will say that the world never accommodates your sincerely held belief or conscientious objector petition when it confronts you with the choice to take option A, option B, or perhaps coin flip between them.” ↩︎
Someone could agree that an appeal to heuristics by itself can’t justify a c-preference, but argue that some heuristic does indeed track the “expected” consequences. The sequence addresses this argument in Sec. 4.1.4. ↩︎
Quote: “My principal interest is the pragmatic one: that agents like ourselves make better decisions by attempting to EV-maximization with precisification than they would with imprecise approaches.” ↩︎
Quote: “If the argument from cluelessness depends on giving that kind of special status to imprecise credences, then I just reject them for the general reason that coarsening credences leads to worse decisions and predictions.” ↩︎
For example, suppose the standard is: “Compare how much utility we achieve on average over a set of decision problems, when we follow different procedures (one of which is ‘adopt some precise credences, then explicitly maximize EV’). The best decision is one that adheres to the best-performing procedure, by this metric.” This reduces to: “The best decision is one that adheres to a procedure that maximizes utility in expectation over some precise distribution over past decision problems.” One could give an independent motivation for privileging such a distribution over decision problems — in particular, argue that our beliefs about our current decision problem should precisely match the frequencies of problems in some reference class. But then we’re just back to debating the merits of precise beliefs themselves. ↩︎
As discussed in “How to not do decision theory backwards”, section “Objections and responses”, this view doesn’t assume a foundationalist view of justification. Nor does it deny that all intuitions can provide defeasible justification. ↩︎
Quote: “I think most of us feel like we’re really just making up arbitrary numbers, but that’s really uncomfortable because precisely which arbitrary numbers we make up seems to make a difference to what we ended up doing.” See also Greaves’s discussion of the “decision discomfort” involved in complex cluelessness. ↩︎
Quote: “Now, I agree that this scenario is ridiculous. And that it sucks. And I agree that picking a precise minute feels uncomfortable. And I agree that this is demanding way more precision than you are able to generate. But if you find yourself in the game, you’d best pick the minute as well as you can. When the gun is pressed against your temple, you cash out your credences.” ↩︎
If this claim is about normative standards, why do I classify it as a critique of the conceptual premise? Because I think the root of the critique is a conceptual misunderstanding, namely, of the structure of the arguments against precision. ↩︎
I haven’t seen anyone make “better than chance” precise, but this seems to me to be what people have in mind when they say this. ↩︎
Quote: “In the same way our track record of better-than-chance performance warrants us to believe our guesses on hard geopolitical forecasts, it also warrants us to believe a similar cognitive process will give ‘better than nothing’ guesses on which actions tend to be better than others, as the challenges are similar between both.” ↩︎
If we claim that we have better-than-chance intuitions about how to weigh up (i) vs. (ii), the same problem recurs. In particular, it remains ambiguous how to weigh up (i) vs. (ii) after updating on our higher-order intuition. ↩︎
Quote: “Then there's really philosophical cluelessness, where you change who gets born in the coming thousands of years. On that, I'm pretty happy with the standard Bayesian response, which is: yes, any of your actions have some large chance of doing harm through these weird butterfly effects, but the chance of harm cancels out against the chance of actually doing even more good than you expected. So you end up going back to looking at the things you actually can estimate.” ↩︎
Quote: “And if I expect that I have absolutely no idea what the black swans will look like but also have no reason to believe black swans will make this event any more or less likely, then even though I won't adjust my credence further, I can still increase the variance of my distribution over my future credence for this event.” ↩︎