Congrats to the winner TailCalled with their post Some thoughts on "The Nature of Counterfactuals". See the winner announcement post.
I've previously argued that the concept of counterfactuals can only be understood from within the counterfactual perspective.
I will be awarding a $1000 prize for the best post that engages with the idea that counterfactuals may be circular in this sense. The winning entry may be one of the following (these categories aren't intended to be exclusive):
a) A post that attempts to draw out the consequences of this principle for decision theory
b) A post that attempts to evaluate the arguments for and against adopting the principle that counterfactuals only make sense from within the counterfactual perspective
c) A review of relevant literature in philosophy or decision theory
d) A post that restates already existing ideas in a clearer or more accessible manner (I don't think this topic has been explored much on LW, but it may have been explored in the literature on decision theory or philosophy)
Feel free to ask me for clarification about what would be on or off-topic. Probably the main thing I'd like to see is substantial engagement with this principle. The bounty is for posts that engage with the notion that counterfactuals might only make sense from within a counterfactual perspective. I have written on this topic, but the competition isn't limited to posts that engage with my views on this topic. It's perfectly fine to engage with other arguments for this proposition if, for example, you find someone arguing in favour of this in the philosophical/mathematical literature or Less Wrong.
If someone submits a high-quality post that only touches on this issue tangentially, but someone else submits an only okayish post that tries to deeply engage with this issue, then I would likely award it to the latter as I'm trying to incentivise more engagement with this issue rather than just high-quality posts generally. If the bounty is awarded to an unexpected submission, I expect this to be the main cause.
I will be awarding an additional $100 for the best short-form post on this topic. This may be a LW Shortform post, a public Facebook post, a Twitter thread, etc. (I'm not going to include Discord/Slack messages as they aren't accessible).
Why do I believe in this principle?
Roughly, my reasons are as follows:
- Rejecting David Lewis' Counterfactual Realism as absurd and therefore concluding that counterfactuals must be at least partially a human construction: either a) in the sense of them being an inevitable and essential part of how we make sense of the world by our very nature or b) in the sense of being a semi-arbitrary and contingent system that we've adopted in order to navigate the world
- Insofar as counterfactuals are inherently a part of how we interpret the world, the only way that we can understand them is to "look out through them", notice what we see, and attempt to characterise this as precisely as possible
- Insofar as counterfactuals are a somewhat arbitrary and contingent system constructed in order to navigate the world, the way that the system is justified is by imagining adopting various mental frameworks and noticing that a particular framework seems like it would be useful over a wide variety of circumstances. However, we've just invoked counterfactuals twice: a) by imagining adopting different mental frameworks b) by imagining different circumstances over which to evaluate these frameworks.
- In either case, we seem to be unable to characterise counterfactuals without depending on already having the concept of counterfactuals. Or at least, I find this argument persuasive.
Why do I believe this is important?
I've argued for the importance of agent meta-foundations before. Roughly, there seems to be a lot of confusion about what counterfactuals are and how to construct them. I believe that much of this confusion would be cleared up if we can sort out some of these foundational issues. And the claim that counterfactuals can only be understood from an interior perspective is one such issue.
Why am I posting this bounty?
I believe in this idea, but:
- I haven't been able to dedicate nearly as much time to exploring this as I would like in between all of my other commitments
- Working on this approach just by myself is kind of lonely and extremely challenging (for example, it's hard to get good quality feedback)
- I suspect that more people would be persuaded that this was a fruitful approach if this principle was presented to them in a different light.
How do I submit my entry?
Make a post on LW or the Alignment forum, then add a link in the comments below. I guess I'm also open to private submissions. Ideally, you should mention that you're submitting your post for the bounty just to make sure that I'm aware of it.
When do I need to submit by?
I'm currently planning to set the submission window to 3 months from the date of this post (that would be the 1st of April, but let's make it April 2nd so people don't think this competition is some kind of prank). Submissions after this date may be refused.
How will this be judged?
I've written on this topic myself, so this probably biases me in some ways, but $1000 is a small enough amount of money that it's probably not worthwhile looking for external judges.
Some Background Info
I guess I started to believe that counterfactuals were circular when I started to ask questions like, "What actually are these things we call counterfactuals?". I noticed that they didn't seem to exist in a literal sense, but that we also seem to be unable to do without them.
Some people have asked why the Bayesian Network approach suggested by Judea Pearl is insufficient (including in the comments below). This approach is firmly rooted in Causal Decision Theory (CDT). Most people on LW have rejected CDT because of its failure to handle Newcomb's Problem.
MIRI has proposed Functional Decision Theory (FDT) as an alternative, but this theory is dependent on logical counterfactuals and they haven't figured out exactly how to construct these. While I don't exactly agree with the logical counterfactual framing, I agree that these kinds of exotic decision theory problems require us to create a new notion of counterfactuals. And this naturally leads to questions about what counterfactuals really are which I see as further leading to the conclusion that they are circular.
I can see why many people are sufficiently skeptical of the notion of counterfactuals being circular that they dismiss it out of hand. It's entirely possible that I could be mistaken about this thesis, but for these people, I'd suggest reading Eliezer's post Where Recursive Justification Hits Bottom, which argues for a circular epistemology; if you are persuaded by that post, counterfactuals being circular may be less of a jump.
I'll award the prize assuming that there's at least one semi-decent submission (according to the standards of posts on Less Wrong). If this isn't the case, then I'll donate the money to an AI Safety organization instead. I'd be open to having this money be held in escrow.
I'm intending to award the prize to the top entry, but there's a chance that I split it if I can't make a decision.
Counterpoint: requiring counterfactuals to justify their own use isn't the same as counterfactuals only making sense from within themselves. Response: It's possible to engage in the appropriate symbol manipulation without a concept of counterfactuals, but we can't have a semantic understanding of what we're doing. We can't even describe this process without being able to say things like "if given string of symbols s, do y". Similarly, counterfactuals aren't just justified by imagining the consequences of applying different mental frameworks over different circumstances; in this case, they are a system for performing well over a variety of circumstances.
I previously wrote a post about reconciling free will with determinism. The metaphysics implicit in Pearlian causality is free will (In Drescher's words: "Pearl's formalism models free will rather than mechanical choice."). The challenge is reconciling this metaphysics with the belief that one is physically embodied. That is what the post attempts to do; these perspectives aren't inherently irreconcilable, we just have to be really careful about e.g. distinguishing "my action" vs "the action of the computer embodying me" in the Bayes net and distingu... (read more)
I mostly agree with Zack_M_Davis that this is a solved problem, although rather than talking about a formalization of causality I'd say this is a special case of epistemic circularity and thus an instance of the problem of the criterion. There's nothing unusual going on with counterfactuals other than that people sometimes get confused about what propositions are (e.g. they believe propositions have some sort of absolute truth beyond causality because they fail to realize epistemology is grounded in purpose rather than something eternal and external to the... (read more)
I'm still puzzled by your puzzlement.
You are treating https://www.greaterwrong.com/posts/T4Mef9ZkL4WftQBqw/the-nature-of-counterfactuals as though it is still an open question, but as far as I can see, all the issues raised were answered in the comments.
I think this is a solved problem. Are you familiar with the formalization of causality in terms of Bayesian networks? (You have enough history on this website that you've probably heard of it!)
Make observations using sensors. Abstract your sensory data into variables: maybe you have a weather variable, a sprinkler variable with possible values ON and OFF, and a sidewalk variable with possible values WET and DRY. As you make more observations, you can begin to learn statistical relationships between your variables: maybe... (read more)
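The setup this comment describes can be sketched in a few lines of plain Python. The probability numbers below are hypothetical, chosen only to illustrate the key distinction in Pearl's formalism: observing a variable is not the same as intervening on it with the do-operator.

```python
# Hypothetical CPTs for a three-variable network: weather -> sprinkler -> sidewalk,
# with weather also directly influencing sidewalk.
p_weather = {"SUNNY": 0.7, "RAINY": 0.3}
p_sprinkler = {"SUNNY": {"ON": 0.4, "OFF": 0.6},       # P(sprinkler | weather)
               "RAINY": {"ON": 0.01, "OFF": 0.99}}
p_sidewalk = {("SUNNY", "ON"): {"WET": 0.9, "DRY": 0.1},  # P(sidewalk | weather, sprinkler)
              ("SUNNY", "OFF"): {"WET": 0.05, "DRY": 0.95},
              ("RAINY", "ON"): {"WET": 0.99, "DRY": 0.01},
              ("RAINY", "OFF"): {"WET": 0.8, "DRY": 0.2}}

def joint(w, s, k):
    """Joint probability factorised according to the network structure."""
    return p_weather[w] * p_sprinkler[w][s] * p_sidewalk[(w, s)][k]

def p_wet_given_sprinkler_on():
    """Observational: P(sidewalk=WET | sprinkler=ON), by conditioning."""
    num = sum(joint(w, "ON", "WET") for w in p_weather)
    den = sum(joint(w, "ON", k) for w in p_weather for k in ("WET", "DRY"))
    return num / den

def p_wet_do_sprinkler_on():
    """Interventional: P(sidewalk=WET | do(sprinkler=ON)).
    Cut the weather -> sprinkler edge: weather keeps its prior distribution."""
    return sum(p_weather[w] * p_sidewalk[(w, "ON")]["WET"] for w in p_weather)
```

The two quantities differ because observing the sprinkler on is evidence about the weather, whereas setting it by intervention conveys no such evidence.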
Partly it's explaining it badly. In addition to the points listed above, there are also issues like focusing entirely on rung 2 causality and disregarding rung 3 causality, which is arguably the truer kind of causality.
I assume that here we are understanding the graph formalism sufficiently broadly as to include e.g. differential equations, as otherwise there's definitely a problem already there. And in the same vein, for most problems both DAGs and differential equations are too rigid/vector-spacey to work, and we probably need new formalisms that can better handle systems with varying structure of variables.
Regardless, I don't think the question of how one would ... (read more)
How much are you interested in a positive vs normative theory of counterfactuals? For example, do you feel like you understand how humans do counterfactual reasoning, and how and why it works for them (insofar as it works for them)? If not, is such an understanding what you're looking for? Or do you think humans are not perfect at counterfactual reasoning (e.g. maybe because people disagree with each other about Newcomb's problem etc.) and there's some deep notion of "correct counterfactual reasoning" that humans are merely approximating, and the deeper "c... (read more)
I think there is "machinery that underlies counterfactual reasoning" (which incidentally happens to be the same as "the machinery that underlies imagination"). My quote above was saying that every human deploys this machinery when you ask them a question about pretty much any topic.
I was initially assuming (by default) that if you're trying to understand counterfactuals, you're mainly trying to understand how this machinery works. But I'm increasingly confident that I was wrong, and that's not in fact what you're interested in. Instead it seems that your interests are more like "how would an AI, equipped with this kind of machinery, reach correct conclusions about the world?" (After all, the machinery by itself can lead to both correct and incorrect conclusions—just as "thinki... (read more)
My entry. Ultimately I'm not sure whether I agree or disagree with your point, but I hope I've brought up some valuable things.
I'm not sure how strong you are in physics; the "Causality is real, counterfactuals are not" section is a brief summary of some fairly abstract and general properties of physics, so we might need to discuss it further in the comments if they do not immediately ring true to you.
I've spent some hours yesterday writing an entry for this competition, but before publishing it I thought it might be best for me to try to talk briefly about my thoughts in a comment here. I think my post would go under this heading:
Specifically, I think I disagree with your thesis that counterfactuals only make sense from a counterfactual perspective. Here's a sketch of my reason (which... (read more)
Counterfactuals (in the potential outcome sense used in statistics) and Pearl's structural equation causality semantics are equivalent.
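As a sketch of the claimed equivalence (standard notation, not taken from the comment itself): in a structural causal model $M$ with exogenous setting $u$, the potential outcome $Y_x(u)$ is defined as the value of $Y$ in the submodel $M_x$ in which the equation for $X$ is replaced by the constant $x$, and the consistency rule ties potential outcomes back to factual values:

```latex
% Potential outcome defined via the intervened submodel M_x:
\[ Y_x(u) = Y_{M_x}(u) \]
% Consistency: if X actually took the value x, the counterfactual
% coincides with the factual outcome:
\[ X(u) = x \;\Rightarrow\; Y_x(u) = Y(u) \]
```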
Oh hey, I already have slides for this.
Here you go: https://www.lesswrong.com/posts/vuvS2nkxn3ftyZSjz/what-is-a-counterfactual-an-elementary-introduction-to-the
I took the approach: if I very clearly explain what counterfactuals are and how to compute them, then it will be plain that there is no circularity. I attack the question more directly in a later paragraph, when I explain how counterfactuals can be implemented in terms of two simpler operations: prediction and intervention. And that's exactly how it is implemented in our causal probabilis... (read more)
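The recipe this comment gestures at (in Pearl's terms: abduction, then action, then prediction) can be sketched on a toy structural model. The equations and numbers below are hypothetical, chosen only to make each step concrete.

```python
# Toy structural causal model (hypothetical equations for illustration):
#   X := U_x
#   Y := 2*X + U_y
# Counterfactual query: we observed (X=2, Y=5); what would Y have been had X been 3?

def abduce(x_obs, y_obs):
    """Step 1 (abduction): infer the exogenous noise consistent with the observation."""
    u_x = x_obs              # invert X := U_x
    u_y = y_obs - 2 * x_obs  # invert Y := 2*X + U_y
    return u_x, u_y

def intervene_and_predict(u_y, x_new):
    """Steps 2-3 (action + prediction): set X := x_new by intervention,
    hold the inferred noise fixed, and recompute Y."""
    return 2 * x_new + u_y

u_x, u_y = abduce(2, 5)               # inferred noise: u_x = 2, u_y = 1
y_cf = intervene_and_predict(u_y, 3)  # counterfactual Y had X been 3
```

The point of the construction is that the counterfactual step adds nothing beyond the two simpler operations: abduction is ordinary inference, and the rest is an intervention followed by a forward prediction.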
My entry. Focuses on the metaphysics of counterfactuals arguing that there are two types based upon two different possible states of a person's mental model of causal relationships. This agrees with circularity. In general, I concur with principles 1-4 which you outline. My post hits on a bit of criteria a) b) and d).
I'll make a counter-claim and say that most people on LW in fact have rejected the use of Newcomb's Problem as a test that will say something useful about decision theories.
That being said, there is definitely a sub-community which believes deeply in the relevance of Newcomb... (read more)
I think this goes too far. We can give an account of counterfactuals from assumptions of symmetry. This account is unsatisfactory in many ways - for one thing, it implies that counterfactuals exist much more rarely than we want them to. Nonetheless, it seems to account for some properties of a counterfactual and is able to stand up without counterfactual assumptions to support it. I think it also provides an interesting lens for exam... (read more)
What are you trying to get/do? I'm asking very seriously, as I can't quite tell where we land between philosophy of language, human behaviour and cognition, AI architecture or some unification problem of them.
From philosophy of language perspective, I personally like to argue that hypotheticals in past tense are just wrong, but are used in the same way present and future tense versions are: expressing internal belief about how causality will play out for the sake of aligning them in a group.
I'm aware of other approaches, but that has a convenient pro... (read more)
There are at least three possibilities. David Lewis level realism, where counterfactual worlds seem fully real to their inhabitants, is an extreme. Moderate realism abou... (read more)
So, this post only deals with agent counterfactuals (not environmental counterfactuals), but I believe I have solved the technical issue you mention about the construction of logical counterfactuals as it concerns TDT. See: https://www.alignmentforum.org/posts/TnkDtTAqCGetvLsgr/a-possible-resolution-to-spurious-counterfactuals
I have fewer thoughts about environmental counterfactuals but think a similar approach could be used to make statements along those lines, i.e. construct alternate agents receiving a different observation about the world. I'm not sure... (read more)