EDIT: this post, like many other posts of mine, is wrong. See the comments by Yvain below. Maybe "Regret Theory" would've been a better title. But I'm keeping this as it is because I like having reminders of my mistakes.

Platitudes notwithstanding, "personal responsibility" doesn't adequately summarize my relationship with the universe. Other people are to blame for some of my troubles because, as Shepherd from MW2 put it, "what happens over here matters over there". Let's talk about it.

When other people's actions can affect you and vice versa, individual utility maximization stops working and you must use game theory. The Prisoner's Dilemma stresses our intuitions by making a mockery of personal responsibility: each player holds the power to screw up the other player's welfare, which they don't care about. Economists call such things "externalities"; political scientists talk of "special interests". You can press a button that gives you $10 but makes 10000 random people lose $0.01 each (due to environmental pollution or something like that). You care enough to vote for such a proposal, other people don't care enough to vote against it, haha democracy fail.
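
For the pedantic, here's the arithmetic of the button example as a throwaway sketch (variable names are just for illustration, numbers are the ones above):

```python
# Toy arithmetic for the button example above.
my_gain = 10.00          # what I pocket for pressing the button
per_person_loss = 0.01   # externality dumped on each bystander
num_bystanders = 10_000

total_loss = per_person_loss * num_bystanders  # $100
net_social_value = my_gain - total_loss        # -$90: society loses overall
print(f"I gain ${my_gain:.2f}, society nets ${net_social_value:.2f}")
```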

When the shit hits the fan in a multiplayer setting, we clearly need a theory for assigning blame in correct proportion. For example, what should we make of the democratic credo that people are responsible for the leaders they have? Exactly how responsible? How many people did I personally kill by voting for Hitler, and how is this responsibility shared between Hitler and me? The "naive counterfactual" answer goes like this: if I hadn't voted for Hitler, he'd still have been elected (since I wasn't the marginal deciding voter), therefore I'm blameless. Clearly this answer is not satisfactory. We need more sophisticated game theory concepts.

First of all, we would do well to assume transferable utility. To understand why, consider Clippy. Clippy is perfectly willing to kill a million Armenians to gain one paperclip. We can blame him (her?) for it all day, but it's probably safe to say that Clippy's internal sense of guilt isn't denominated in Armenians. We must reach a mutual understanding by employing a common currency of guilt, which is just another way of saying "convertible utils". You feel guilty toward me = you owe me dough. Too bad, knowing you, you probably won't pay.

Our second assumption goes like this: rational actions cannot be immoral. Given our knowledge of game theory, this sounds completely callous and bloodthirsty, but in the transferable utility case it's actually quite tame. You have no incentive to screw Bob over in PD if you'll be sharing the proceeds anyway. The standard procedure for sharing will be, of course, the Shapley value.

This brings us to today's ultimate conclusion: Blame Theory. Imagine that instead of killing all those Gypsies, the evil leader and the stupid voters together sang kumbaya and built a city on a hill. The proceeds of that effort would be divided according to everyone's personal contributions using the standard Shapley construction (taking into account each group's counterfactual non-cooperation, of course). You, dear reader, would have gotten richer by two million dollars, instead of hiding in a bomb shelter while the ground shakes and your nephew is missing. And calculating the difference between your personal welfare in the perfect world where everyone cooperated, and the welfare you actually have right now given that everyone acted as they did, gives you the extent of your personal guilt. You can't push it away, it's yours. Sorry.
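
If you like, here is a minimal sketch of that computation for a toy three-player game. The coalition values and the "actual welfare" numbers are invented purely for illustration, not taken from anywhere:

```python
from itertools import permutations

players = ["leader", "voter_a", "voter_b"]

def coalition_value(coalition):
    """Hypothetical value of the city-on-a-hill each coalition could have built."""
    values = {
        frozenset(): 0,
        frozenset({"leader"}): 1,
        frozenset({"voter_a"}): 0,
        frozenset({"voter_b"}): 0,
        frozenset({"leader", "voter_a"}): 4,
        frozenset({"leader", "voter_b"}): 4,
        frozenset({"voter_a", "voter_b"}): 1,
        frozenset({"leader", "voter_a", "voter_b"}): 9,
    }
    return values[frozenset(coalition)]

def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    shapley = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = set()
        for p in order:
            marginal = value(coalition | {p}) - value(coalition)
            shapley[p] += marginal / len(orders)
            coalition.add(p)
    return shapley

# What everyone actually ended up with in the world where nobody cooperated.
actual_welfare = {"leader": 1, "voter_a": 0, "voter_b": 0}

perfect_world = shapley_values(players, coalition_value)
guilt = {p: perfect_world[p] - actual_welfare[p] for p in players}
# The guilts sum to 8: the 9 achievable by full cooperation minus the 1
# actually realized, i.e. the total opportunity society missed.
print(guilt)
```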

Don't know about you, but I'm genuinely scared by this result.

On one hand, it agrees with intuition in all the right ways: the "naive counterfactual" voter still carries non-zero guilt because (s)he was part of the collective that elected a monster, and the little personal guilts add up exactly to the total missed opportunity of all society, and... But on the other hand, this procedure gives every one of us a new and unexpectedly harsh yardstick to measure ourselves by. It probably makes me equivalent to a serial murderer already. I've long had a voice in the back of my head telling me it was possible to do better, but Blame Theory brings the truth into sharp and painful focus. I'm not sure I wanted that when I set out to solve this particular problem...


Say there's a perfect person who does everything e can to create a perfect society, and really does it well, to the limits of er ability, but no one else listens and so a perfect society is not created. In fact, everyone else is hopelessly evil and society doesn't change at all as a result of er efforts.

There's a second person who sits at home all day and watches TV. Society also doesn't change at all as a result of er efforts.

Would these people still end up with the same level of guilt, given that the difference between the perfect world welfare and their current world welfare is exactly the same? Or am I misunderstanding this post as badly as I feel like I must be?

I think you understood it correctly. If two persons have equal levels of ability - could make equal potential contributions to the brave new world - then yes, equal welfare today implies equal guilt. Playing C while everyone else plays D may look noble, but if it has no effect, do we really want to encourage it? Couldn't the first person just look around and find a better use for their time?

So if it's possible to do everything exactly perfectly, to the level of a superintelligence calculating how it could most increase world utility and then performing only those actions - and still end up with guilt in a sufficiently hard-to-fix situation - why are you calling this quantity "guilt" at all? It certainly doesn't fit my concept of what guilt is supposed to mean, and judging by the end of your post it doesn't fit yours.

Why not call it "variable X", and note that variable X has no particular correlation to any currently used English term or human emotion?

Also, the Shapley Value looks really interesting, but the wikipedia article you linked to sends me into Excessive Math Panic Mode. If you wanted to explain it in a more understandable/intuitive way, that would make a great topic for an LW post.

The Shapley value has been used on LW several times already: 1, 2. I understand it as follows: imagine a game with many players that can make "coalitions" with each other to win money from the universe, and two "coalitions" joined together can always win no less than they'd have won separately. Then the Shapley value is a way of distributing the maximum total winnings (where everyone cooperates) such that every player and every group of players get no less than they could've won for themselves by defecting (individually or as a group).

(I edited this away, but now Yvain replied to it, so I'm restoring it:) Should we reward a completely ineffectual action? Are you a deontologist?

No, but guilt is an inherently deontological concept.

Let me give an example. Actually, your example. Your Hitler voter model. Yeah, it successfully makes the person who voted for Hitler feel guilty. But it also makes the person who didn't vote for Hitler, and maybe did everything e could to stop Hitler before being locked up in a German prison, equally guilty. So it actually makes the exact mistake you're warning against - unless your single vote decides whether or not Hitler gets into power, people who vote for and against Hitler end up equally guilty (if your single vote decides it, then your present welfare is greater and you have less difference between present and perfect welfare).

Guilt is there to provide negative reinforcement for acting in an immoral way. So it's only useful if there's some more moral way you could act that it needs to reinforce you towards. Loading someone who's literally done everything e could with a huge burden of guilt is like chronic pain disorder: if the pain's not there to tell you to stop doing something painful, it's just getting in the way.

And if your brain gives you equal squirts of guilt for voting for Hitler vs. fighting Hitler, guilt fails in its purpose as a motivation not to vote for Hitler, and any AI with a morality engine built around this theory of guilt will vote Hitler if there's any reason to do so at all.

(as for Shapley, I see references to it but not a good explanation of how to derive it and why it works. Maybe that's one of those things that actually can't be explained simply and I ought to bite the bullet and try to parse the wiki article.)

I thought about it a while and your objections are correct. This thing seems to be measuring how much I could regret the current state of the world, not how much I should've done to change it. Added a "WRONG!" disclaimer to the post; hopefully people will still find it entertaining.

It might be helpful to also add your conclusion (i.e., exactly how you think it’s wrong) to the disclaimer. It seems an interesting fact, but I imagine many will miss it by not bothering to read a post marked as “wrong”.

The Shapley value averages over your marginal contribution to the utilities of sub-coalitions. The guy who votes against Hitler would be involved in some sub-coalitions in which he is the marginal vote that defeats Hitler, and thus would have a positive Shapley value, whereas the guy who voted for Hitler would be involved in some sub-coalitions where he is the marginal vote that elects Hitler, and thus would have a negative Shapley value.

I think Yvain is right and you're wrong. The Shapley value takes as input the whole game, not a certain play of the game, so it doesn't know that you actually voted for Hitler and the other guy didn't.

The formula for the Shapley value (from the wiki article):

\[ \varphi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!} \bigl( v(S \cup \{i\}) - v(S) \bigr) \]

What this means is that you take all sub-coalitions S of the total coalition N, excluding sub-coalitions that include yourself. Then you average over the difference in value between the sub-coalition S plus yourself and just the sub-coalition S. (The first term in the sum makes it a weighted average depending on the sizes of S and N.) These sub-coalitions S, and S plus yourself, did not actually happen; you are considering the counterfactual value of those being the actual coalitions.

The point is that the formula knows how your inclusion in a coalition changes its value.
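
For the code-minded, here is a small sketch of that same weighted sum; `value` is assumed to be a hypothetical function from frozensets of players to numbers:

```python
from itertools import combinations
from math import factorial

def shapley_value(i, players, value):
    """Weighted average of i's marginal contribution over all sub-coalitions S."""
    n = len(players)
    others = [p for p in players if p != i]
    total = 0.0
    for size in range(len(others) + 1):
        for S in combinations(others, size):
            S = frozenset(S)
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            total += weight * (value(S | {i}) - value(S))
    return total

# Example: two players who can only win $10 by cooperating split it evenly.
v = lambda S: 10 if len(S) == 2 else 0
print(shapley_value("alice", ["alice", "bob"], v))  # 5.0
```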

If the gain produced by cooperation is negative, the superadditivity condition fails to apply and thus so does the Shapley distribution. The "desirable property" number 1 of the wiki, labeled individual fairness, also does not apply. I suppose you could extend the mathematical formula to apply to negative gains, but the question would be whether that distribution satisfied some intuitively appealing set of axioms.

If the cooperative game that we compute the Shapley value from is derived from an adversarial game, superadditivity cannot fail. To get the sum of what they would've got separately, the players just have to play what they would've played separately.

What is this 'guilt' you speak of? Are you a Catholic?

Guilt is an added cost to making decisions that benefit you at the expense of others. (Ideally, anyways.) It encourages people to cooperate to everyone's benefit. Suppose we have a PD matrix where the payoffs are:

(defect, cooperate) = (3, 0)
(defect, defect) = (1, 1)
(cooperate, cooperate) = (2, 2)
(cooperate, defect) = (0, 3)

Normally we say that 'defect' is the dominant strategy since regardless of the other person's decision, your 'defect' option payoff is 1 higher than 'cooperate'.

Now suppose you (both) feel guilty about betrayal to the tune of 2 units:

(defect, cooperate) = (1, 0)
(defect, defect) = (-1, -1)
(cooperate, cooperate) = (2, 2)
(cooperate, defect) = (0, 1)

The situation is reversed - 'cooperate' is the dominant strategy. Total payoff in this situation is 4. Total payoff in the guiltless case is 2 since both will defect. In the OP $10-button example the total payoff is $-90, so people as a group lose out if anyone pushes the button. Guilt discourages you from pushing the button and society is better for it.
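
If it helps, here is a small sketch that checks the dominance claim with those numbers, assuming a flat 2-unit guilt cost paid by any player who defects:

```python
# payoffs[(my_move, their_move)] = (my_payoff, their_payoff)
base = {
    ("D", "C"): (3, 0), ("D", "D"): (1, 1),
    ("C", "C"): (2, 2), ("C", "D"): (0, 3),
}

GUILT = 2  # cost a player pays whenever they defect

def with_guilt(payoffs):
    """Subtract the guilt cost from every defecting player's payoff."""
    adjusted = {}
    for (me, them), (mine, theirs) in payoffs.items():
        adjusted[(me, them)] = (mine - GUILT * (me == "D"),
                                theirs - GUILT * (them == "D"))
    return adjusted

def dominant_move(payoffs):
    """Return a move that is strictly better against every opponent move, if any."""
    for mine, other in (("C", "D"), ("D", "C")):
        if all(payoffs[(mine, o)][0] > payoffs[(other, o)][0] for o in ("C", "D")):
            return mine
    return None

print(dominant_move(base))              # D
print(dominant_move(with_guilt(base)))  # C
```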

Guilt is an emotion which probably evolved for something like the purpose you describe. It is triggered by interpersonal interactions and is not under direct conscious control (it wouldn't do its job very well if it was). The OP's suggestion that guilt is something you 'should' feel in response to events outside of interpersonal interactions or your own direct actions is incoherent and reminiscent of the 'Catholic guilt' phenomenon. It appears that Catholicism found a way to train people to feel some kind of generalized guilt for all kinds of strange things beyond its 'natural' application. This does not appear to be a helpful development.

A person's values have only weak control over that person's actions, working through the intermediary of their stupid, deluded mind; the perfect world is not the one where everyone cooperated, far from it. If it's not the person's values that are to blame, handicapped as they are by their shoddy implementation in the human mind, and the person is merely a broken tool of those values, what's the point of assigning blame? What lesson does it teach?

Perhaps the alternative to consider is people who understand this contrast, feel its full extent, and are moved by its moral value to do better. But the contrast the argument presents must be matched to the argument's expected effect, and that effect is some orders of magnitude away from what you seek in the impossible counterfactual world, a world that isn't real and so isn't worth considering.