adding such complexity to a theory makes it far less useful to actually model human behavior, both on normative and descriptive levels.
Where did Yudkowsky or anyone else say that the FDT was supposed to model human behavior? It is to prescribe behaviors which I expect to be similar to ethical ones, like "Don't loot the other universe even if it's inhabited only by a paperclip optimizer".
Also (apologies for the double comment), while directionally correct, I think this may be a bit of a misrepresentation of the aim of their decision theory (Yudkowsky and Soares would have to clarify their exact intent).
Decision theories, including FDT, take utility functions as external to the theories themselves. If an agent gained utility by anti-social, destructive acts, every decision theory would prescribe positive values to those acts by definition. I don't think FDT is intended to contradict this definition (indeed, their paper implicitly seems to accept it). Where I think you are directionally correct is that it aims to recommend the decision that would optimize utility from cooperating where greater (the first dilemma they describe in their paper is a clear example of this). If one's ethical values are encapsulated within a utility function, then whatever optimizes utility should also be the most ethical (and if you are a utilitarian, the optimal ethical and optimal utility decisions are definitionally identical).
I should have been clearer there; I wasn't trying to imply that descriptive usefulness was a claim they made. I meant it in the sense that "it fails to serve a purpose which traditional decision theories are intended for (and are useful at)," not that "it fails to serve the purpose it is designed for."
I would argue it fails on its own terms (at least sometimes) as a normative theory (which is a stated goal for the theory given by Yudkowsky & Soares) but as you say it isn't trying to be a descriptive theory. It not being useful as a descriptive theory is just a point of comparison of its merits against traditional models.
Edit: Also, in case it isn't obvious, I have an economic background so my primary interest in it would be if it outperforms the uses of decision theory (both normative and descriptive) in economics.
I apologize for the (mild) clickbait; I will do my best to justify it later. As an introductory note, this discussion is principally motivated by a previous discussion of decision theory given by Yudkowsky and Soares in their various writings, including their paper, and here on the wiki. I am going to discuss this in the context of the three decision theories outlined in the wiki: causal decision theory (CDT), evidential decision theory (EDT), and logical decision theory (LDT). I will try to cover context where relevant. I am also, in large part, responding to the case example of voting which Yudkowsky has discussed here. Beyond this disclaimer, I am going to focus on other, general topics, before circling back to a more particular critique.
Introduction to Utility: The Classic Economic View of Decisions Summarized
Note: the first part here is mostly summary. If you are already passingly familiar with behavioral economics, you should be able to skim ahead.
First, an obvious question: what are we trying to model? As Angner noted, economists typically are concerned with taking some set of goals as given and modeling how people do behave pursuant to those goals and/or how they should behave if they want to achieve those goals; that is, we are concerned both with a descriptive theory of human behavior and with a normative theory (i.e. a theory of what is rational behavior). One might hope these come together.
So, how do we actually model utility? Formally, we say a function $u(\cdot)$ from a set of alternatives into the set of real numbers is a utility function representing preference relation ⪰ just in case $x \succeq y \Leftrightarrow u(x) \geq u(y)$ for all alternatives $x$ and $y$.[1] More simply, I weakly prefer $x$ to $y$ if and only if $u(x) \geq u(y)$. In more practical terms, my utility function for $b$ bananas might be $u(b) = \sqrt{b}$, which represents diminishing returns to additional bananas (I would have a utility of 1 for one banana, √2 for two bananas, 2 for four bananas and so forth). Even in this simplest case we see a basic nuance in preferences: I do not value every additional banana equally.
Now, for decision making, let's say we are considering some act $A$. If the outcomes are certain, we evaluate the utility straightforwardly as $EU(A) = u(c_A)$, where $A$ represents the particular action and $c_A$ represents its expected consequence. More realistically, our expectations for the consequences are probabilistic rather than certain. Traditionally, as you will see in most economics textbooks, the equation to calculate expected utility under uncertainty is given as $EU(A) = \sum_i P(c_i \mid A)\, u(c_i)$.[2] That is, our expected utility of some action ($A$) is the sum of the probabilities of each consequence given the action times the utility of those consequences.
Let's take an example. Imagine I have 2 bananas, someone offers me an opportunity to gamble on a fair coinflip. If it is heads, they will give me a banana. If it is tails, I give them a banana.
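If we take the simple case, we get: $EU(\text{bet}) = 0.5\sqrt{3} + 0.5\sqrt{1} \approx 1.366$, compared with $EU(\text{no bet}) = \sqrt{2} \approx 1.414$.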
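An alternative way to think of it is adding up the probability-weighted utility for the branches of a decision tree: betting branches into heads (probability 0.5, ending with 3 bananas) and tails (probability 0.5, ending with 1 banana), while not betting has a single branch that leaves me with 2 bananas.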
So, if we are rationally trying to maximize utility, we shouldn't bet, even though the expected number of bananas is 2 either way. This is also a common way of modeling risk aversion at its most basic level.
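The banana gamble can be checked numerically. The following is only a minimal sketch, assuming the square-root utility $u(b) = \sqrt{b}$ from the earlier example; the function names and the representation of a gamble as (probability, bananas) pairs are my own illustrative choices, not anything from the cited textbook.

```python
import math

def utility(bananas: float) -> float:
    """Square-root utility assumed in the example: u(b) = sqrt(b)."""
    return math.sqrt(bananas)

def expected_utility(lottery) -> float:
    """Sum of probability * utility over (probability, bananas) pairs."""
    return sum(p * utility(b) for p, b in lottery)

# Hypothetical encoding of the two options, starting from 2 bananas.
decline = [(1.0, 2)]           # keep my 2 bananas for certain
bet = [(0.5, 3), (0.5, 1)]     # heads: win a banana, tails: lose a banana

eu_decline = expected_utility(decline)
eu_bet = expected_utility(bet)
expected_bananas_bet = sum(p * b for p, b in bet)

print(f"EU(decline)      = {eu_decline:.4f}")
print(f"EU(bet)          = {eu_bet:.4f}")
print(f"E[bananas | bet] = {expected_bananas_bet}")
```

The expected number of bananas is the same either way, but the concave utility function makes the sure thing worth slightly more than the gamble, which is the risk-aversion point in miniature.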
We understand most decisions are not so simple nor are the utilities involved. People's values, goals and the benefits they get from things are varied in complex ways, but even so as a simple tool we can make simplifying assumptions that can be very powerful in making statements about behavior and modeling things.
Should I vote? Applying Expected Utility to the Case of Voting:
Suppose, to steal an example, you are among 1,001 people voting in a regional election for candidates Kang and Kodos. Based on an examination of polling results, past election data and so forth, you estimate that, not including your vote, there is a 59.999% chance Kang wins, a 40% chance Kodos wins and a 0.001% chance they tie. That is, your vote has a 1 in 100,000 chance of influencing the results. If there is a tie and you don't vote, it can go either way, 50/50. There is also some transaction cost of voting, say $T$. For simplicity, let's say you prefer Kang, so your utility if he wins is higher. That is, $u(\text{Kang}) > u(\text{Kodos})$, where $u(\cdot)$ is the utility you get from that candidate winning.
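The most naive approach would be to model my expected utility as follows (ignoring the case where one votes for Kodos, since that is strictly inferior):
$EU(\text{vote}) = 0.6\,u(\text{Kang}) + 0.4\,u(\text{Kodos}) - T$
$EU(\text{abstain}) = 0.59999\,u(\text{Kang}) + 0.4\,u(\text{Kodos}) + 0.00001\,[0.5\,u(\text{Kang}) + 0.5\,u(\text{Kodos})]$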
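Leaving the algebra as an exercise for the reader, we can conclude that voting beats abstaining only if $0.000005\,[u(\text{Kang}) - u(\text{Kodos})] > T$, or equivalently $u(\text{Kang}) - u(\text{Kodos}) > 200{,}000\,T$.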
That is, if we are optimizing our utility, we should only vote for Kang if the extra utility we get from Kang winning over Kodos is more than 200,000 times the utility cost of voting.
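For anyone who would rather not do the algebra by hand, here is a rough sketch of the same comparison. The probabilities come from the example above; the utility numbers passed in at the bottom are purely hypothetical values I picked to land on either side of the break-even point, and the function names are my own.

```python
# Probabilities of each result if I do NOT vote (from the example).
P_KANG = 0.59999   # Kang wins outright
P_KODOS = 0.40     # Kodos wins outright
P_TIE = 0.00001    # exact tie (0.001%), then decided 50/50

def eu_abstain(u_kang: float, u_kodos: float) -> float:
    """Expected utility of staying home: a tie is a coin flip."""
    return (P_KANG * u_kang + P_KODOS * u_kodos
            + P_TIE * 0.5 * (u_kang + u_kodos))

def eu_vote_kang(u_kang: float, u_kodos: float, cost: float) -> float:
    """Expected utility of voting Kang: my vote turns a tie into a Kang win."""
    return (P_KANG + P_TIE) * u_kang + P_KODOS * u_kodos - cost

def net_gain_from_voting(u_kang: float, u_kodos: float, cost: float) -> float:
    """Positive means voting beats abstaining under this naive model."""
    return eu_vote_kang(u_kang, u_kodos, cost) - eu_abstain(u_kang, u_kodos)

# Hypothetical utilities: cost of voting fixed at 1, Kodos winning worth 0.
for u_kang in (100_000, 200_000, 400_000):
    gain = net_gain_from_voting(u_kang, 0.0, 1.0)
    print(f"u(Kang) = {u_kang:>7}: net gain from voting = {gain:+.3f}")
```

Under these assumed numbers the net gain changes sign at a utility gap of about 200,000 times the cost, matching the threshold above.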
This level of analysis may be uncomfortable for some considering voting. Unless your preference for one candidate over another is massive and the cost of voting is tiny, the vanishingly low chance of your vote being pivotal means the expected utility of voting, in this simple approach, is likely to be negative.
So, is our calculation wrong? Why do people still vote even when the odds of their vote being pivotal are trivial? Is there a reason why a rational agent might still vote? People's voting behavior doesn't seem to be based on this calculation; indeed, voter turnout is often still quite significant in places where votes are less likely to be pivotal (think red and blue states during presidential elections).
The simple answer a behavioral economist would offer is just: "our simplified utility function is excluding important factors relevant to human behavior." True, there is some cost to voting, but that is not the only effect. Real people have messy utility functions: people get some value out of voting. In reality, we have positive signaling effects from voting (see the 'I Voted' stickers) and people place an inherent value on civic participation. Let's group these positive values of voting together as $E$, where $E$ includes all of the additional value we personally place on voting and our cumulative estimate of the signaling effects and any other externalities of voting in a particular election. Repeating the exercise, we can reach the conclusion that I should vote if $E + 0.000005\,[u(\text{Kang}) - u(\text{Kodos})] > T$.
That is, I should vote if the value I get from voting (including signaling benefits, personal valuations of being civically involved, etc.) plus 0.000005 times the extra utility I get from my preferred outcome is greater than the utility cost of voting.
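As a quick illustration of how much work the extra term does, here is a small sketch of that decision rule. The 0.000005 weight is the 0.001% tie chance times the 50% swing from the example; every input value below is a hypothetical number chosen only for illustration, and `should_vote` is my own name for the rule.

```python
# Weight on the outcome: 0.00001 chance of a tie * 0.5 swing from my vote.
PIVOTAL_WEIGHT = 0.000005

def should_vote(expressive_value: float, utility_gap: float, cost: float) -> bool:
    """Vote if E + 0.000005 * (utility gap between candidates) exceeds the cost."""
    return expressive_value + PIVOTAL_WEIGHT * utility_gap > cost

# Hypothetical voter: the outcome matters 1,000 units to them, voting costs 1.
for expressive_value in (0.0, 0.5, 1.5):
    decision = should_vote(expressive_value, utility_gap=1_000, cost=1.0)
    print(f"E = {expressive_value}: vote? {decision}")
```

With numbers like these, the pivotal term contributes only 0.005, so the decision is driven almost entirely by whether the personal and signaling value of voting exceeds its cost.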
Lessons from the Standard (CDT) Model:
This seems to pretty closely model what we observe. People who value voting highly are more likely to vote; where there are greater signaling effects, people vote more; and where the cost of voting is lower, people vote more.
To individuals, it offers a pretty simple recommendation: "If you value voting and the benefits from voting, plus the (likely minuscule) benefit your vote provides towards your preferred outcome, then you should vote if you estimate those have greater utility than whatever it costs you, in time and effort, to actually vote." This, intuitively, feels sensible and does seem to match what we see. Additionally, it offers some obvious recommendations to political campaigns and policy makers that want to increase turnout. If you want to increase voter turnout, you should emphasize the impact of individual votes, make voters feel their vote is likely to have a substantial impact, instill in your voter base stronger civic commitments, openly promote signaling effects for voting (e.g., handing out 'I Voted' stickers), and lower the costs associated with voting as much as possible (e.g., provide free transport to voting centers). These (and, maliciously, their opposites) are behaviors we see pretty clearly in the real world. Politicians fighting to make voting easier for their supporters and more difficult for their competitors' supporters is another behavior that we would expect and do observe in reality.
A Better Theory?
Rather than asking the question from the prior discussion of the utility of voting (i.e. "what do I expect the utility consequences of my decision to vote or not to be"), Logical and Functional Decision Theories say you should instead include logical counterfactuals and think in terms of optimizing the utility of agents 'like' you. In other words, the question you should ask, according to LDT, is "what would happen if people like you voted?" This, proponents argue, is preferable and leads to more voting. But does it?
On the side of just theory: it gives a first and very obvious recommendation absent under our previous analysis: "the more people that are similar to you, the more value you should place on voting." This recommendation does not match neatly with intuition (at least not with my intuition) and, in fact, implicitly seems to run counter to proponents' statements that "if you don't expect any of the elections to be close." Qualitatively, it seems to endorse a factional approach to politics, which goes against my own moral intuitions, and to argue against the value of voting for people who are unique.
It additionally leads to some weirdness in the scenarios we discussed. If I know the odds I estimated are accurate (as of the day before voting, I cannot change others' voting behavior), I should assess whether to vote or not without reference to the odds I know to be true, even acting as though a counterfactual is true. This makes intuitive sense in extreme thought experiments, such as the transparent Newcomb's box, but actually assessing my behavior by considering possibilities I know to be false feels (intuitively) a lot more difficult. If you are the kind of person who readily endorses protest candidates, you may find the reasoning more sympathetic, but I am personally less able to disentangle myself from my expectations of conditional probability.
This weirdness also leads directly into the practical consideration. My confidence in the empirical evidence is much greater: there are various ways I can, with some degree of confidence, make estimates about voting behavior and conclude what the probability of my vote being pivotal is. How do I empirically answer the question of whose behavior is correlated with my own? It requires vastly more assumptions or data to justify, and the reality seems to be that anyone making those estimates is in a small minority (and thus should be relatively less inclined to vote!). There doesn't seem to be any empirical way to ascertain this relationship, and even proponents admit their attempts to offer methods to answer the question (no one to my knowledge has actually tried to go about estimating it) are "not great".
To end, let's at least try to think through what an FDT agent should do. Say I am in the former situation and am an FDT agent who otherwise makes all the same estimates. I know a handful of people who also think of themselves as FDT agents, and most of them have told me they decided not to vote; on the whole I estimate that FDT agents represent a vanishingly small portion of the voter base. How can I decide whether to vote? As said, I don't think we have enough information to quantify this, but we can still consider a few possibilities qualitatively; should I:
I don't think there is any clear way of judging these scenarios. We would have to add a lot of assumptions to even begin to formally calculate the expected utility.
Perhaps there is some deep insight I am missing, but straightforwardly it seems that adding such complexity to a theory makes it far less useful to actually model human behavior, both on normative and descriptive levels.
[1] See e.g. Angner, Erik. A Course in Behavioral Economics. Bloomsbury Publishing, 2020.
[2] Ibid.