Blame Theory

[-]Scott Alexander16y30

Say there's a perfect person who does everything e can to create a perfect society, and really does it well, to the limits of er ability, but no one else listens and so a perfect society is not created. In fact, everyone else is hopelessly evil and society doesn't change at all as a result of er efforts.

There's a second person who sits at home all day and watches TV. Society also doesn't change at all as a result of er efforts.

Would these people still end up with the same level of guilt, given that the difference between the perfect world welfare and their current world welfare is exactly the same? Or am I misunderstanding this post as badly as I feel like I must be?

[-]cousin_it16y20

I think you understood it correctly. If two persons have equal levels of ability - could make equal potential contributions to the brave new world - then yes, equal welfare today implies equal guilt. Playing C while everyone else plays D may look noble, but if it has no effect, do we really want to encourage it? Couldn't the first person just look around and find a better use for their time?

[-]Scott Alexander16y30

So if it's possible to do everything exactly perfectly, to the level of a superintelligence calculating how it could most increase world utility and then performing only those actions - and still end up with guilt in a sufficiently hard-to-fix situation - why are you calling this quantity "guilt" at all? It certainly doesn't fit my concept of what guilt is supposed to mean, and judging by the end of your post it doesn't fit yours.

Why not call it "variable X", and note that variable X has no particular correlation to any currently used English term or human emotion?

Also, the Shapley value looks really interesting, but the Wikipedia article you linked to sends me into Excessive Math Panic Mode. If you wanted to explain it in a more understandable/intuitive way, that would make a great topic for an LW post.

[-]cousin_it16y40

The Shapley value has been used on LW several times already: 1, 2. I understand it as follows: imagine a game with many players that can make "coalitions" with each other to win money from the universe, and two "coalitions" joined together can always win no less than they'd have won separately. Then the Shapley value is a way of distributing the maximum total winnings (where everyone cooperates) such that every player and every group of players get no less than they could've won for themselves by defecting (individually or as a group).

(I edited this away, but now Yvain replied to it, so I'm restoring it:) Should we reward a completely ineffectual action? Are you a deontologist?

[-]Scott Alexander16y80

No, but guilt is an inherently deontological concept.

Let me give an example. Actually, your example. Your Hitler voter model. Yeah, it successfully makes the person who voted for Hitler feel guilty. But it also makes the person who didn't vote for Hitler, and maybe did everything e could to stop Hitler before being locked up in a German prison, equally guilty. So it actually makes the exact mistake you're warning against - unless your single vote decides whether or not Hitler gets into power, people who vote for and against Hitler end up equally guilty (if your single vote decides it, then your present welfare is greater and you have less difference between present and perfect welfare).

Guilt is there to provide negative reinforcement for acting in an immoral way. So it's only useful if there's some more moral way you could act that it needs to reinforce you towards. Loading someone who's literally done everything e could with a huge burden of guilt is like chronic pain disorder: if the pain's not there to tell you to stop doing something painful, it's just getting in the way.

And if your brain gives you equal squirts of guilt for voting for Hitler vs. fighting Hitler, guilt fails in its purpose as a motivation not to vote for Hitler, and any AI with a morality engine built around this theory of guilt will vote Hitler if there's any reason to do so at all.

(as for Shapley, I see references to it but not a good explanation of how to derive it and why it works. Maybe that's one of those things that actually can't be explained simply and I ought to bite the bullet and try to parse the wiki article.)

[-]cousin_it16y70

I thought about it a while and your objections are correct. This thing seems to be measuring how much I could regret the current state of the world, not how much I should've done to change it. Added a "WRONG!" disclaimer to the post; hopefully people will still find it entertaining.

[-]bogdanb16y00

It might be helpful to also add your conclusion (i.e., exactly how you think it’s wrong) to the disclaimer. It seems an interesting fact, but I imagine many will miss it by not bothering to read a post marked as “wrong”.

[-]JGWeissman16y00

The Shapley value averages over your marginal contribution to utilities of sub coalitions. The guy who votes against Hitler would be involved in some sub coalitions in which he is the marginal vote that defeats Hitler, and thus would have a positive Shapley value, where the guy who voted for Hitler would be involved in some sub coalitions where he is the marginal vote that elects Hitler, and thus would have a negative Shapley value.

[-]cousin_it16y00

I think Yvain is right and you're wrong. The Shapley value takes as input the whole game, not a certain play of the game, so it doesn't know that you actually voted for Hitler and the other guy didn't.

[-]JGWeissman16y20

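The formula for the Shapley value (from the wiki article):

$$\varphi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!} \left( v(S \cup \{i\}) - v(S) \right)$$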

What this means is that you take every sub coalition S of the total coalition N that does not include yourself, then average over the difference between the value of S plus yourself and the value of S alone. (The first term in the sum makes it a weighted average depending on the sizes of S and N.) These sub coalitions, S and S plus yourself, did not actually happen; you are considering the counterfactual value of those being the actual coalitions.

The point is that the formula knows how your inclusion in a coalition changes its value.
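For anyone who would rather read code than notation, here is a minimal sketch of that weighted average over sub coalitions. The function name `shapley_value` and the three-voter majority game are illustrative assumptions, not anything from the original post; the game is described only by its value function, so the computation looks at counterfactual coalitions rather than at what anyone actually did.

```python
from itertools import combinations
from math import factorial


def shapley_value(players, value, i):
    """Shapley value of player `i` in the game with value function `value`.

    `value` maps a frozenset of players to that coalition's worth.
    (Hypothetical helper written for this illustration.)
    """
    n = len(players)
    others = [p for p in players if p != i]
    total = 0.0
    for size in range(n):  # size of the sub coalition S that excludes i
        # Weight from the formula: |S|! (n - |S| - 1)! / n!
        weight = factorial(size) * factorial(n - size - 1) / factorial(n)
        for subset in combinations(others, size):
            s = frozenset(subset)
            # Marginal contribution of i when joining S
            total += weight * (value(s | {i}) - value(s))
    return total


# Assumed toy game: three voters, a coalition is worth 1 if it is a majority.
VOTERS = ["a", "b", "c"]


def majority(coalition):
    return 1 if len(coalition) >= 2 else 0


values = {p: shapley_value(VOTERS, majority, p) for p in VOTERS}
print(values)
```

In this symmetric toy game every voter gets the same share, and the shares add up to the worth of the full coalition.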

[-]torekp16y20

If the gain produced by cooperation is negative, the superadditivity condition fails to apply and thus so does the Shapley distribution. The "desirable property" number 1 of the wiki, labeled individual fairness, also does not apply. I suppose you could extend the mathematical formula to apply to negative gains, but the question would be whether that distribution satisfied some intuitively appealing set of axioms.

[-]cousin_it16y10

If the cooperative game that we compute the Shapley value from is derived from an adversarial game, superadditivity cannot fail. To get the sum of what they would've got separately, the players just have to play what they would've played separately.

[-]mattnewport16y00

What is this 'guilt' you speak of? Are you a Catholic?

[-]Tenek16y40

Guilt is an added cost to making decisions that benefit you at the expense of others. (Ideally, anyways.) It encourages people to cooperate to everyone's benefit. Suppose we have a PD matrix where the payoffs are:

(defect, cooperate) = (3, 0)
(defect, defect) = (1, 1)
(cooperate, cooperate) = (2, 2)
(cooperate, defect) = (0, 3)

Normally we say that 'defect' is the dominant strategy since regardless of the other person's decision, your 'defect' option payoff is 1 higher than 'cooperate'.

Now suppose you (both) feel guilty about betrayal to the tune of 2 units:

(defect, cooperate) = (1, 0)
(cooperate, cooperate) = (2, 2)
(defect, defect) = (-1, -1)
(cooperate, defect) = (0, 1)

The situation is reversed - 'cooperate' is the dominant strategy. Total payoff in this situation is 4. Total payoff in the guiltless case is 2 since both will defect. In the OP $10-button example the total payoff is $-90, so people as a group lose out if anyone pushes the button. Guilt discourages you from pushing the button and society is better for it.
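The arithmetic above can be checked with a short sketch. The helper names `payoffs_with_guilt` and `dominant_strategy` are made up for this illustration, and guilt is modelled, as an assumption, as a flat penalty subtracted from a player's payoff whenever that player defects.

```python
MOVES = ("cooperate", "defect")

# Payoffs from the comment: (my move, their move) -> (my payoff, their payoff)
BASE = {
    ("defect", "cooperate"): (3, 0),
    ("defect", "defect"): (1, 1),
    ("cooperate", "cooperate"): (2, 2),
    ("cooperate", "defect"): (0, 3),
}


def payoffs_with_guilt(base, guilt):
    """Subtract `guilt` from a player's payoff whenever that player defects."""
    adjusted = {}
    for (mine, theirs), (p1, p2) in base.items():
        adjusted[(mine, theirs)] = (
            p1 - guilt if mine == "defect" else p1,
            p2 - guilt if theirs == "defect" else p2,
        )
    return adjusted


def dominant_strategy(matrix):
    """Return the first player's strictly dominant move, or None if there is none."""
    for move in MOVES:
        other = "defect" if move == "cooperate" else "cooperate"
        if all(matrix[(move, t)][0] > matrix[(other, t)][0] for t in MOVES):
            return move
    return None


GUILTY = payoffs_with_guilt(BASE, 2)
print(dominant_strategy(BASE))    # defect
print(dominant_strategy(GUILTY))  # cooperate
```

With a guilt penalty of exactly 1 the two moves tie against either opponent move, so under this model the penalty has to exceed the 1-unit temptation to defect before cooperation becomes dominant.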

[-]mattnewport16y50

Guilt is an emotion which probably evolved for something like the purpose you describe. It is triggered by interpersonal interactions and is not under direct conscious control (it wouldn't do its job very well if it was). The OP's suggestion that guilt is something you 'should' feel in response to events outside of interpersonal interactions or your own direct actions is incoherent and reminiscent of the 'Catholic guilt' phenomenon. It appears that Catholicism found a way to train people to feel some kind of generalized guilt for all kinds of strange things beyond its 'natural' application. This does not appear to be a helpful development.

[-]Vladimir_Nesov16y00

A person's values have only weak control over that person's actions, through the intermediary of their stupid deluded mind; the perfect world is not the one where everyone cooperated, far from it. If it's not the person's values that are to blame, handicapped by the shoddy implementation in the human mind as they are, and the person is merely a broken tool of those values, what's the point of assigning blame? What lesson does it teach?

Perhaps, the alternative to consider is people who understand this contrast, feel its full extent, and are moved by its moral value to do better. But the contrast given by the argument must meet the expected effect of the argument, which is some orders of magnitude away from what you seek in the impossible counterfactual world, one that isn't real and so isn't worth considering.
