Related and required reading in life (ANOIEAEIB): The Copenhagen Interpretation of Ethics

Epistemic Status: Trying to be minimally judgmental

Spoiler Alert: Contains minor mostly harmless spoiler for The Good Place, which is the best show currently on television.

The Copenhagen Interpretation of Ethics (in parallel with the similarly named one in physics) is as follows:

The Copenhagen Interpretation of Ethics says that when you observe or interact with a problem in any way, you can be blamed for it. At the very least, you are to blame for not doing more. Even if you don’t make the problem worse, even if you make it slightly better, the ethical burden of the problem falls on you as soon as you observe it. In particular, if you interact with a problem and benefit from it, you are a complete monster. I don’t subscribe to this school of thought, but it seems pretty popular.

I don’t say this often, but seriously, read the whole thing.

I do not subscribe to this interpretation.

I believe that the majority of people effectively endorse this interpretation. I do not think they endorse it consciously or explicitly. But they act as if it is true.

Another aspect of this same phenomenon is how most people view justice.

Almost everyone agrees justice is a sacred value. That it is good and super important. Justice is one of the few universally agreed upon goals of government. Justice is one of the eight virtues of the avatar. Justice is up there with truth and the American way. No justice, no peace.

But what is justice? Or rather, to avoid going too deeply into an infinitely complex philosophical debate millennia or eons old, how do most people instinctively model justice in broad terms?

In a conversation last night, this was offered to me (I am probably paraphrasing due to bad memory, but it’s functionally what was said), and seems common: Justice is giving appropriate punishment to those who have taken bad action.

I asked whether, in this person’s model, the actions needed to be bad in order to be relevant to justice. This prompted pondering, after which the reply was that yes, that was how their model worked.

I then asked whether rewarding a good action counted as justice, or failing to do so counted as injustice, using the example of saving someone’s life going unrewarded.

We can consider three point-based justice systems.

In the asymmetric system, when bad action is taken, bad action points are accumulated. Justice punishes in proportion to those points to the extent possible. Each action is assigned a non-negative point total.

In the symmetric system, when any action is taken, good or bad, points are accumulated. This can be and often is zero, is negative for bad action, positive for good action. Justice consists of punishing negative point totals and rewarding positive point totals.

In what we will call the Good Place system (Spoiler Alert for Season 1), when any action is taken, good or bad, points are accumulated as in the symmetric system. But there’s a catch (which is where the spoiler comes in). If you take actions with good consequences, you only get those points if your motive was to do good. When a character attempts to score points by holding open doors for people, they fail to score any points because they are gaming the system. Gaming the system isn’t allowed.

Thus, if one takes action even under the best of motives, one fails to capture much of the gains from such action. Second or higher order benefits, or surprising benefits, that are real but unintended, will mostly not get captured.

The opposite is not true of actions with bad consequences. You lose points for bad actions whether or not you intended to be bad. It is your responsibility to check yourself before you wreck yourself.

When (Spoiler Alert for Season 3) an ordinary citizen buys a tomato from a supermarket, they are revealed to have lost twelve points because the owner of the tomato company was a bad guy and the company used unethical labor practices. Life has become too complicated to be a good person. Thus, since the thresholds never got updated, no one has made it into The Good Place for centuries.
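The three point systems above can be sketched as scoring functions. This is only an illustrative model, not anything from the show or a formal definition: the `Action` fields, the point values, and the decision to split good consequences into "intended" and "unintended" are all assumptions made for the sketch. Scores here are signed, so the asymmetric system's non-negative "bad action points" appear as a total that can never rise above zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One action with hypothetical point values for its consequences."""
    intended_good: float = 0.0    # good consequences the actor was aiming at
    unintended_good: float = 0.0  # real good consequences the actor did not aim at
    bad: float = 0.0              # bad consequences, intended or not


def asymmetric_score(actions):
    # Only bad consequences count, so the best possible total is zero.
    return -sum(a.bad for a in actions)


def symmetric_score(actions):
    # Full credit for every consequence, good or bad.
    return sum(a.intended_good + a.unintended_good - a.bad for a in actions)


def good_place_score(actions):
    # Good consequences count only when they were the motive;
    # bad consequences count regardless of intent.
    return sum(a.intended_good - a.bad for a in actions)


# Hypothetical numbers: buying a tomato has some unintended upside
# and some unintended downside.
buy_tomato = Action(unintended_good=3.0, bad=2.0)

for name, score in [
    ("asymmetric", asymmetric_score),
    ("symmetric", symmetric_score),
    ("good place", good_place_score),
]:
    print(name, "| act:", score([buy_tomato]), "| do nothing:", score([]))
```

With these made-up numbers, only the symmetric system scores the purchase above doing nothing; the other two score it below zero.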

The asymmetric system is against action. Action is bad. Inaction is good. Surprisingly large numbers of people actually believe this. It is good to be you, but bad to do anything. 

The asymmetric system is not against every action. This is true. But effectively, it is. Some actions are bad, some are neutral. Take enough actions, even with the best of intentions, even with fully correct knowledge of what is and is not bad, and mistakes will happen.

So any individual, any group, any company, any system, any anything, that takes action, is therefore bad.
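A rough way to see why this follows: suppose, purely as a simplifying assumption, that each action independently has some small chance of turning out bad. The function name and the 1% figure below are hypothetical, chosen only to illustrate how fast "at least one bad action" becomes near-certain.

```python
def chance_of_some_bad_action(p_bad: float, n_actions: int) -> float:
    """Probability that at least one of n independent actions turns out bad.

    Assumes every action has the same, independent chance p_bad of being bad.
    """
    return 1.0 - (1.0 - p_bad) ** n_actions


# Hypothetical 1% error rate per action.
for n in (1, 10, 100, 1000):
    print(n, chance_of_some_bad_action(0.01, n))
```

Under a system that judges only bad actions, the more you do, the closer this probability gets to one.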

The law by design works that way, too. There are increasingly long and complex lists of actions which are illegal. If you break the law, and anyone who does things will do so by accident at some point, you can be prosecuted. You are then prosecuted for the worst thing they can pin on you. No amount of other good deeds can do more than mitigate. Thus, any sufficiently rich investigation will judge any of us who regularly take meaningful action to be bad.

If you can be sued for the bad consequences of a medical procedure, potentially for ruinous amounts, but cannot collect most of the huge benefits of successful procedures, you will engage in defensive medicine. Thus, lots of defensive medicine. Because justice.

If, as was done in the past, the engineer and his family are forced to sleep under the bridge after it is built, so that they will be killed if it falls down, you can be damn sure they’re going to build a safe bridge. But you’d better want to pay for a fully bulletproof bridge before you do that.

Skin in the game is necessary. That means both being at risk, and collecting reward. Too often we assign risk without reward.

If one has a system whereby people are judged only by their bad actions, or by their worst single action, what you have is a system that condemns and is against all action.

Never tweet.

Also see privacy and blackmail.

The symmetric system is in favor of action. If no one ever took any action, we would not have nice things and would also all die. If people generally took fewer actions, we would have fewer nice things and be worse off. If one gets full credit for the good and bad consequences of one’s actions, we will provide correct incentives to encourage action.

This, to me, is also justice.

A symmetric system can still count bad consequences as larger than similar good consequences to a large extent (e.g. saving nine people from drowning does not give one enough credits to murder a tenth), and we can punish locally bad intent on top of direct consequences, without disturbing this. Action is on net a very good thing.
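One way to picture this is a symmetric score in which losses are multiplied by a constant before summing. The multiplier of 10 and the unit values below are assumptions for illustration only; nothing above fixes a particular weight.

```python
def weighted_symmetric_score(consequences, bad_weight=10.0):
    """Sum signed consequence values, scaling the negative ones by bad_weight."""
    return sum(v if v >= 0 else bad_weight * v for v in consequences)


# Hypothetical units: one life saved = +1, one murder = -1.
nine_rescues_one_murder = [1.0] * 9 + [-1.0]
ordinary_active_life = [1.0, 1.0, 0.5, -0.1]

print(weighted_symmetric_score(nine_rescues_one_murder))
print(weighted_symmetric_score(ordinary_active_life))
```

With these numbers the nine rescues do not pay for the murder, yet an ordinary mix of mostly good actions with a small mistake still comes out ahead of doing nothing.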

The Good Place system works well for simple actions with mostly direct consequences. One then, under normal circumstances, gets credit for the good and the bad. It also has a great feature, which is that it forces the action via a high required threshold. You need a lot of points to pass a binary evaluation when you die. Sitting around doing nothing is a very bad idea.

The problem comes in when there are complex indirect consequences that are hard to fully know or observe.

Some of the indirect consequences of buying a tomato are good. You don’t get credit for those unless you knew about them, because all you were trying to do was buy a tomato. Knowing about them is possible in theory, but expensive, and doesn’t make them better. It only makes you know about them, which only matters to the extent that it changes your decisions.

Some of the indirect consequences of buying a tomato are bad. You lose those points.

Thus, when you buy a tomato and another customer therefore can’t buy one, you get docked. But when your buying a tomato increases the store’s estimated demand for tomatoes, so they order more and don’t run out next week, and a customer gets to buy one (and the store stays in business to provide even more tomatoes), you don’t get rewarded.

Better to not take the shopping action.

No wonder people make seemingly absurdist statements like “there is no ethical consumption under capitalism.”

Under this philosophy, there is no ethical action under complexity. Period.

I get that complexity is bad. But this is ridiculous.

Compare to the Copenhagen Interpretation of Ethics. If one interacts with a compact, isolated problem, such as a child drowning in a pond, one can reasonably do all one could do, satisfying one’s requirements. If one interacts with or observes a non-compact, non-isolated problem, such as third world poverty, one is probably Mega-Hitler. You cannot both be a good person and have slack.

As a young child, I read the book Be a Perfect Person in Just Three Days. Spoiler alert, I guess? The protagonist is given a book with instructions on how to be a perfect person. The way to do so is to take progressively less action. First day you take symbolic action, wearing broccoli around your neck. Second day you take inaction, by fasting. Third day, you do nothing at all except drink weak tea and go to the bathroom. 

That makes you ‘perfect.’

Because perfect means a score of exactly zero points.

Asymmetric systems of judgment are systems for opposing all action.




Overall the disagreement underlying this post is obscured by a set of common names for very different protocols.

Under one protocol, praise and blame are tools for encouraging behavior the community wants and discouraging behavior the community does not want. If these categories are not manipulated for other motives, we have simulacra level 1 morality. This is the straightforward interpretation under which – if you hold it consistently and think it’s the predominant norm – the “Copenhagen interpretation” seems obviously perverse, legalizing blackmail seems obviously helpful, etc.

It gets more complicated if you think that the community may be mistaken about matters of praise or blame, and that someone might be manipulating these perceptions for their own ends. Now we’re in simulacra level 2 or 3, and people playing game 1 need a moral theory that helps them cooperate with each other, resist, evade, or recover from attacks by level-2 players, and avoid wasting their time interacting with level 3. This is the position of the Psalms.

Once manipulating the perception of praise or blame becomes the dominant game, we’re in simulacra level 4.

Level 4 focuses on blame rather than praise becaus...

(Replying to the last two paragraphs)

Agreed. Several things one could say here.

1. It is not common knowledge that the level-4 simulacrum of justice is a level-4 simulacrum. Or even that it is not a level-1. There are people honestly trying to do level-1 justice using a mostly level-4 simulacrum, or a mix of all levels, etc. I feel like this error was present and somewhat ubiquitous, for various reasons good and bad, long before L-4 took over the areas in question, and its origin often *was* usefully thought of as a technical error. Its final one-winged-angel form is something else.
2. Even if something is not a technical error in the sense that no one was trying to solve a given technical problem, it is still true in many cases, including this one, that it claims that it *is* trying to solve the problem. Pointing out that it’s doing a piss-poor job of that can create knowledge or ideally common knowledge that allows the remaining lower-level players to identify and coordinate against it, or at least avoid making the mistake in their own thinking and realize what they are up against.
3. It can lead to potential ways out. One can imagine forcing common knowledge of being L-4 accelerati...

1. I think level-4 simulacrum morality is VERY old and has existed for a long time in uncomfortable confused competition with the other kinds. I agree that this is not common knowledge, and never has been. I’d like to hear more about why you think the situation is new.

(It’s plausible to me that something’s changed recently, in response to the Enlightenment, and that something changed with the initial spread of Christianity, and that something else changed with the initial growth of cities and centralized cults.)

2. I agree. I think it’s more helpful if we additionally clarify that while there’s not really a good-faith reason to stay confused about this, many people have a strong perceived motive to stay confused, so the persistence of confusion is not strong evidence that our apparently decisive arguments are missing an important technical point. (Also, it’s better if noticing this doesn’t immediately lead to self-sabotage via indignantly pretending scapegoating norms don’t exist.)

Not much to add on 3 and 4, except that my response to 2 bears on 3 as well. Strongly agree with:

In general, I have the instinct that pointing out that things *would be* technical errors if they were part of a proposed technical solution to the problem they claim to be solving, is a useful thing to do to help create common knowledge / knowledge.

I cannot speak for Zvi, but I suggest that the new thing is communication pollution.

Reality is far away and expensive. Signs are immediate and basically free. I intuitively suspect the gap is so huge that it is cheaper and easier to do a kind of sign-hopping, like frequency hopping, in lieu of working on or confronting the reality of the matter directly.

To provide more intuition about what I mean, compare communication costs to the falling costs of light over time. When our only lights were firewood it cost a significant fraction of the time of illumination in labor, for gathering and chopping wood. Now light is so ubiquitous that we turn them on with virtually no thought, and light pollution is a thing.

Interesting in this context that the Biblical version of the tower of Babel (as distinguished from e.g. the Babylonian account) was specifically constructed as a signal tower to overcome coordination difficulties due to large distances.

One (potential?) disagreement is that I think it's quite plausible that level-4-simulacrums are in fact the original morality, or co-evolved with level-1 morality. I think it actually took work to get morality to a point where it made any "sense" in a principled way. (At least, with principles that LWers are likely to endorse.)

My current best guess is that morality is rooted in two things:

1) the need to coordinate political factions (who has enough friends that they could beat someone and take their stuff, or avoid having themselves beaten-up-and-stuff-taken). Notions of 'fairness' (which come from the anger module), getting filtered through "what can a group of people agree is fair?", as a coordination mechanism.
2) something something repurposing our disgust module (from diseased individuals) to dislike people that seemed dangerous to have around. (So low status, powerless people often produce a disgust reaction. If you hang around a diseased person you might get sick. If you hang around powerless people you might get stuck with a spear.)

The oldest simulacrum-level-1 morality I can imagine would have involved coordinating hunters and maybe building shelters (where it matters how skilled people are). But I'd expect the same time period to already involve maintaining your position within a political tribe, and I'd expect higher-level-simulacra morality to already be at work in that context.

(I'm not sure whether it makes sense to think of levels 1-through-4 as distinct stages.) I'd expect the explicit level 1-4 transition to become relevant after we moved to hierarchical agricultural societies, but for that to be happening alongside levels 2-4 already existing in some form.
Coevolution seems plausible to me, but preexisting doesn't. Forager-typical fairness norms seem like a coherent shared social agenda, which is I think all that's required to be at simulacra level 1. The anger "module" is fundamentally social and seems to be object-level. Plenty of social animals not smart enough to be Machiavellian experience anger, a sense of fairness, etc.
It’s not too hard to see why people would benefit from joining a majority expropriating from a blameworthy individual. But why would they join a majority transferring resources to a praiseworthy one? So, being singled out is much more bad than good here.

This makes intuitive sense, but it doesn't seem to be borne out by modern experience; when coalitions attack blameworthy individuals these days, they don't usually get any resources out of it, the resources just end up destroyed or taken by a government that wasn't part of the coalition.

Not true; each member of the coalition responsible for destroying the enemy gains recognition as “one of the good people”, and temporary security from being branded as an enemy themselves.

If that's what people are getting out of it, it's symmetric, and they might as well join praise-gangs, so this fails to explain the asymmetry. You are disagreeing with Benquo just as much as Jimrandomh is.

If you praise one who is praised by many others, you might be doing it only to get with the “in” crowd, and that is worthless; it costs you nothing and it therefore signals nothing. But if you help to destroy one who is targeted by many others, it does not matter that others are also destroying him: you incur the dual cost of ensuring the destruction of one of the enemy faction, and of marking yourself as being a foe of that enemy faction; these are costs, and thus make for a strong signal (that you are not one of Them).

OK, if praise-gangs don't actually do anything, while destruction gangs actually destroy, then praise-gangs are cheap talk. But that sounds to me like it's just pushing it back another level. Benquo claimed that there was an asymmetry in joining putatively effective gangs. If destruction is 10x as effective as creation, then maybe a pebble promoting creation should get 1/10 as much credit as a pebble promoting destruction.
Signaling conformity, counter to beliefs, is not costless. Praise that is popular is evidence AT LEAST that conformity on this topic is more important to the judgment-expresser than unpopular blame. So it is some mix of "actual praise" and "complaint less important than conformity".

Thanks for pushing towards clarity here! I'm a bit confused about what you're saying, in part because I find the references in Said's comment a bit unclear (e.g. what exactly is implied by "recognition as 'one of the good people'"?). I also don't see how the "temporary security" paradigm works symmetrically. Would you be willing to unpack this a bit?

Said Achmiz (5y):
In the battle between Us and Them, you must continually prove that you are one of Us, lest we suspect that you are secretly with Them. Taking part in the destruction of one of Them is evidence that you are not yourself one of Them, as failing to do so is evidence of the opposite; for who would not wish to destroy Them, but one of their own?
The double double double double cross, shows evidence of being one of Us, but actually being one of Them. Or, even better, Being both at once. The prestige, oh the prestige...

This is the sort of thing that seems increasingly unappealing, the less you're operating under the assumption that things are zero-sum within the relevant domain. I agree that this assumption is often false! And yet, many people seem to be acting on it in many contexts.

What do you mean by "modern experience"? If you mean things happening at new scales, like twitter mobs, probably game theory is not the right way to describe it, but accidental consequences of psychology adapted for smaller settings. Whereas I think Benquo is talking about smaller scales, like office politics, where the resources are near enough to seize. That may well explain irrational behavior at broader scales. (Although I think twitter mobs aren't that asymmetric.)

Endorse following that link above to simulacra level 1, for anyone following this.

One would think that it would also be powerful (at level 4) to create common knowledge of your *lack* of ability to interact with or help with a thing, which can be assisted by the creation of common knowledge blaming someone else. And in fact I do think we observe a lot of attempts to create common “knowledge” (air quotes because the information in question is often incomplete, misleading or outright false) about who is to blame for various things.

It is also reasonable in some sense, at that point, to put a large multiplier on bad things for which we establish common knowledge if we expect that most bad things do not become common knowledge, to the extent that one might be judged to be as bad as the worst established action.

Which in turn results in anything and anyone that has taken a bunch of action, once under sufficient hostile scrutiny, being seen as bad.

The Copenhagen Interpretation actually is perverse and is quite bad, whether or not it is a locally reasonable action in some cases for people on L-2 or higher.

One of the big advantages, to me, of TCI is that in addition to explaining specific behav...

Agree that the Copenhagen Interpretation of Ethics model is important in large part because it clarifies that most people are not computing a simulacrum level 1 morality. We’re going to need to be better about saying this explicitly, because the default outcome for posts like yours is to get interpreted as claiming that people really are just making an unmotivated technical error. I think that’s what happened with LessWrong, and we both know how that project failed. Tsuyoku Naritai!

I'm actually a bit confused about whether Copenhagen is automatically not Level 1 Simulacrum. (also, I'm noticing that we're using multiple layers of jargon here and this whole conversation could use a distillation down into plain English, but for now will stay knee-deep in the jargon)

Whether Copenhagen is perverse depends a bit on how reasonable it is to halfway solve a problem, or how suspicious it is to benefit from solving a problem. In today's world, problems are immense and complicated and you definitely want people making partial progress on them, and don't want to incentivize people to ignore problems. But this isn't obviously true to me among ancient hunter gatherers. (I don't currently have a clear model of what problems ancient hunter-gatherers actually faced, and how hard they were to fix, and so this isn't a place where I have a strong opinion much at all, just that the current arguments seem underjustified to me)

I recall when my dad would get mad at me for mowing half the lawn. I'm not sure how to think about this. Obviously mowing half the lawn is better than mowing zero. But, his point was "Actually, it is not that hard to mow the whole god damn lawn. It is virtuous to finish things that you start. You (Ray) seem to be working yourself up into a sense that you've worked so hard and should get to stop when you just haven't actually worked that hard and you could finish the rest of the lawn in another 30 minutes and then the whole thing would be done." Whether this is reasonable or not depends on whether you think it's more important to get lawns partially mowed, and whether you think my feeling of exhaustion after mowing half the lawn was legitimate, or a psychological defense mechanism for giving myself an excuse to stop and feel good about myself without having completed the entire job. (I don't actually know myself)
To answer the topline question I think that you can accept Copenhagen and still be on Level 1.  I like the lawn example because in many ways it is clean. There are a number of ways your dad can be right to get mad, and ways he can be wrong. 
Or, alternately: I'm not 100% sure what Level 1 Morality is supposed to mean here.

Noting that I also replied to Benquo's comments back at the original post (he posted them in both places): I will cross-post the 'first wave' of replies here but may or may not post subsequent waves should they exist.


I really like this post. I think it points out an important problem with intuitive credit-assignment algorithms which people often use. The incentive toward inaction is a real problem which is often encountered in practice. While I was somewhat aware of the problem before, this post explains it well.

I also think this post is wrong, in a significant way: asymmetric justice is not always a problem and is sometimes exactly what you want. In particular, it's how you want a justice system (in the sense of police, judges, etc) to work.

The book Law's Order explains it like this: you don't want theft to be punished in keeping with its cost. Rather, in order for the free market to function, you want theft to be punished harshly enough that theft basically doesn't happen.

Zvi speaks as if the purpose of the justice system is to reward positive externalities and punish negative externalities, to align everyone's incentives. While this is a noble goal, Law's Order sees it as a goal to be taken care of by other parts of society, in particular the free market. (Law's Order is a fairly libertarian book, so it puts a lot of faith in the free market.)

The purpose of the justice system is to enforce t...

Rereading this, one thought that comes to mind is that Copenhagen ethics and asymmetric justice may be another side of blackbox reinforcement learning driven by egalitarianism. Just as a CEO is held strictly responsible for everything that happens under them and is punished, regardless of whether we reasonably believe the bad results were not their fault, because we are insufficiently sure of judging fault and cannot observe all the actions the CEO did or did not do; or anyone who keeps a tiger in their backyard is held 100% responsible when that tiger eats someone no matter how much they swear they thought the fences were adequate; anyone who gets involved with a problem and doesn't meet some high bar is automatically assumed to be guilty, because we can't be sure they didn't do some skulduggery or gossip, so if they benefit in any way from the problem, we especially want to punish ...

I think you go too far by also postulating that (in the evolutionary past) it would be natural to assume that every game is zero-sum. There are clearly a lot of cooperative interactions in that kind of environment. Every interaction has a 'winner' and a 'loser' because of the focus on egalitarianism: the 'loser' is the one who got the worse end of the deal (according to the partly-understood, partly-hypothetical ideal of fairness). Ganging up on whoever keeps getting the best side of deals is a natural way to enforce fair splits. Which seems different from the involvement heuristic you mention. The involvement heuristic (EG, blame the CEO for anything the company does) has no obvious reason to be asymmetric. It seems dumb. If we're not sure how to assign credit, punishing everyone involved seems to go hand in hand with rewarding everyone involved. So I would still think the main reason for asymmetric justice is coordination around norms (such as fairness norms) that should almost always be followed. It doesn't make sense to reward people for fairness if almost everyone is supposed to be fair almost all of the time. It makes far more sense to punish the unfair. So, yeah, then when you couple that with the involvement heuristic... you get copenhagen-ethics. Sucks.

Top-level note that the last line of this post was previously "Let us at least strive to do better" and is now "Asymmetric systems of judgment are systems for opposing all action."

It was changed because people I respect took this as an indication that this was either in the call-to-action genre, or was a hybrid of the call-to-action and call-to-clarity genres, or was suggesting that this one action was a solution to the problem, or something. See Wei Dai's top-level comment and its thread for details.

It felt very Copenhagen Interpretation - I'd interacted with the problem of what to do about it and thus was to blame for not doing more or my solution being incomplete.

To avoid this distraction, it was replaced with a wrapping-up line that doesn't do that. I am very worried about the forces that caused me to have to do that, and also at least somewhat worried about the forces that made me feel the need to include the line in the first place, and hope to have a post on such issues some time soon.

I am grateful that this was pointed out because it feels like it is pointing to an important problem that is getting worse.

Wei Dai (5y):
I disagree with this framing. I think there's a difference between criticism (pointing out flaws in an idea or presentation or argument) and blame, and I was trying to engage in the former. I wrote a longer reply to one of your comments trying to explain this more but then deleted it because I feel like disengaging at this point. Initially I was just confused about what the conclusion of the post was trying to say and posted a comment about that, which drew me into a more substantive debate, and on reflection I don't think this is actually a debate that I need to be involved in.

I think this is actually extremely important, but in a subtle way that's very easy to get wrong, so I'm not sure I disagree with your choice to locally disengage.

I agree that Zvi made a technical error in the conclusion, in a way that reliably caused misinterpretation towards construing things as calls to action, and that it was good to point this out. Nothing amiss here.

But, the fact that this minor technical error was so important relative to the rest of the post is, itself, a huge red flag that something is wrong with our discourse, and we should be trying to figure that out if we think something like FAI might turn out to be important.

Wei Dai (5y):
This summary seems wrong or confused or confusing to me. 1. What is the actual error you have in mind? (I myself have made a couple of different criticisms about the post but I'm not sure any of them fits your description of "minor technical error" that "reliably caused misinterpretation towards construing things as calls to action".) 2. "Call to action" is apparently a loaded term with negative connotations among the mods and perhaps others here (which I wasn't previously aware of). Are you using it in this derogatory sense or some other sense? 3. Zvi himself has confirmed that his original conclusion was intended as a call to action, albeit an "incidental" one. Why do you keep saying that there wasn't a call to action, and that "call to action" is a misinterpretation? I believe there have been several different layers of confusion happening in this episode (and may continue to be happening), which has contributed to the large number of comments written about it and maybe a sense that it's more important than the rest of the post. Also, again, depending on exactly what you mean, I'm not sure I'd agree with "minor technical error". It seems like some of my own criticisms of the post were actually fairly substantial and combined with the aforementioned confusions and the fact that disagreements will naturally generate more discussion than agreements, I don't understand why you think there is a "huge red flag that something is wrong with our discourse" here. I wanted to disengage as I'm not sure continuing to participate in this debate (including retrying to fully resolve all the layers of confusion) is the best use of my time, but I'm happy to listen to you explain more if you still think this is actually important or has relevance to FAI.
I think the "call to action" issue is important for bigger reasons than LessWrong's governance, but I'll taboo the phrase for now. It seems to me like the default paradigm, including in Rationalist circles, has increasingly become the following: Words are not communicative unless they are commands. Anything that does not terminate in a command, a "pitch," or something in that class, is construed as therefore unclear. The relevance to FAI is that any group trying to design one (or really design anything substantively new from first principles) needs to be able to have internal communication that is really, really robustly not made out of telling each other to do specific things, and it seems like the default expectation, including in Rationalist circles, has increasingly become that words are not communicative unless they are commands. None of the work people were doing several years ago on decision theory was like this. Here's why I interpreted Zvi's rhetoric as a technical error. In another comment, when I asked you: You replied: I took this to mean that but for this sentence (which I took to be a superfluous conclusion-flavored end, and Zvi agrees wasn't part of the core content of the post), you wouldn't have focused on the question of what specific actions the post was asking the reader to perform. Was I misreading you? If so, what did I get wrong? I want to check on that before I say more.
8 Wei Dai 5y
I haven't seen this myself. If you want, I can point you to any number of posts on the Alignment Forum that are not made out of telling each other to do specific things. Can you give some examples of what you've seen that made you say this? Again, I'm not really seeing this now either. Probably, but I'm not totally sure. I guess unless the (counterfactual) conclusion said something to the effect of "This seems bad, and I'm not sure what to do about it" I might have asked something like "The models in this post don't seem to include enough gears for me to figure out what I can do to help with the situation. Do you have any further thoughts about that?" And then maybe he would have either said "I'm still trying to figure that out" in which case the conversation would have ended, or maybe he would have said "I think we should try not to use asymmetrical mental point systems unless structurally necessary" and then we would have had the same debate about whether that implication is justified or not. (I'm not sure where this line of questioning is leading... Also I still don't understand why you're calling it a "technical" error. If the mistake was writing a superfluous conclusion-flavored end, wouldn't "rhetorical" or "presentation" error be more appropriate? What is technical about the error?)
My understanding from personal experience and the reports of people at MIRI is that MIRI isn't even using very basic decision theory or AI alignment results in practice. I'm not doubting that people are still participating in a kind of dissociated discourse that doesn't affect actions, and separately that they do a thing where they try to use words to compel actions from others. The problem is that the former seems to be increasingly just for show, and not predictive of behavior the way you'd expect if stated preferences and models were accurate.
Fair enough, I don't think this needs to go deeper. I agree this was criticism rather than blame. I got more frustrated than I should have been in this spot as I explained exactly what I was thinking at the time, and this seemed to be making things worse by creating a clearer target, or something. I dunno.
The parent of this comment says that the last line of this post which (as of 2019-06-05) is right now the last line of the post. There's discussion elsewhere about a last line saying something like "Let us all be wise enough to aim higher". Would I be right in guessing that that is the last line that was removed, and that the comment above has merely transcribed the wrong text? Or am I more deeply confused than I think?
Huh. Funny no one caught that until now. Edited.

Excellent post overall. I want to comment on one interesting bit. Zvi describes the “asymmetric” system as having this feature, among others:

Some of the indirect consequences of buying a tomato are good. You don’t get credit for those unless you knew about them, because all you were trying to do was buy a tomato. Knowing about them is possible in theory, but expensive, and doesn’t make them better. It only makes you know about them, which only matters to the extent that it changes your decisions.

It’s worth asking: should we (who would like to improve the system) reject this aspect, in particular? That is: should people get “moral credit” for indirect, good consequences of their actions (even when they are unlikely to have known about them)?

I say: yes.

I can see two reasons for taking this view.

First, even if you didn’t know about a negative, indirect consequence of one of your actions, you should incur moral blame for it (not necessarily much moral blame, but some—scaled by just how indirect the consequence was, etc.), because you could have known about the negative consequences—and we would not wish to let you off the hook in that case, merely due to plausible deniability (as t

...
No wonder people make seemingly absurdist statements like “there is no ethical consumption under capitalism.”

The statement might be absurdist but it's not itself an absurd claim (which is what I take you to be implying). It's a claim that there exists no consumption pattern under capitalism that doesn't involve participating in the infliction of harm on others. You can't be a private citizen minding your own business. This means that there's an affirmative duty to help make the system better, since supposed neutrality is actually just unremediated complicity.

This is correctly seen as a moral emergency which breaks down "normal" peacetime systems of ethics, because there is a war. But of course the focus on whether there is or isn’t ethical consumption (i.e. the binary of “blameworthy” and “blameless”) privileges the blame-oriented asymmetry that comes from the corruption of simulacra level 4 scapegoating games. Seems wrong to say people shouldn’t use the words they have to try to point to important things, even if the words are too corrupted to have adequate expressive power to just explicitly say the things.

I say seemingly absurd to point out that, to my and many other ears, the statement seems upon first encounter to be absurd. And of course, the idea that it can’t be ethical to consume anything at all in any way at all, when lack of at least some consumption is death, does seem like it’s allowed to be absurd. Of course, also: Some absurd things are true!

I also think it is very wrong, that even the default consumption pattern is ethical as I see things (although not some other reasonable ways of seeing things), and that an engineered-to-be-ethical one is ethical under the other reasonable ways as well, such that for any given system there exists such an engineered method.

This is because I don’t think it is reasonable to apply different order-of-magnitude calculations on second and higher order benefits and harms from actions in complex systems, and I have a much more benign view of those higher order effects than those making this statement. The main error is upstream of the statement.

That doesn’t mean one doesn’t have an affirmative duty to work to make things better, somewhere, in some way. But one must structure that as the ability for actions to be good, and the best score to not be zero (e.g. the perfect person isn’t the person who fails to interact with the system).

[This discussion in particular risks going outside LW-appropriate bounds and so should likely be continued on the blog, if it continues]

Just wanted to say I appreciate the efforts to keep things LW appropriate.

Also, my ideal is for ‘LW-appropriate’ to be... like... actually a good way of conducting intellectual discourse, and insofar as that is (unnecessarily) preventing important conversations from happening publicly, it's something I'd want to fine tune.

Earlier today I said at the LW office "I think the things Zvi and Ben have been saying lately are pretty important and if they're not currently in a state that we'd be happy having them on frontpage, we should probably put in some effort to help them become so."

I'll try to keep my reply here within bounds. I think the steelman I'm pointing to is often what people are trying to say, using corrupted language with inadequate expressive power (at their level of verbal skill and privilege / allotted airtime). I think this general pattern is important to be aware of. Related, comparatively unpoliticized example:

Note that written codes (including both law and moral theorizing) are, per Gödel, incomplete and/or contradictory. It's no surprise that common laws and armchair theories of "justice" focus on punishment for disruption rather than reward for cooperation, as they are _ALL_ based on an unstated theory that inaction is impossible or unrewarding, and the normal state is for people to do good things and be rewarded naturally for them. Interventional justice (codified and administered by humans) is mostly concerned with deviation from norm.

The first proto-law is "don't be weird", which includes both positive and negative weirdness. Only after some thought, scale, and evolution of systems does it become "don't do these things", a purely negative injunction.

In what we will call the Good Place system (…) If you take actions with good consequences, you only get those points if your motive was to do good. (…) You lose points for bad actions whether or not you intended to be bad.
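As a minimal sketch of the rule quoted above (not anything from the show or the post itself), the asymmetry can be written as a scoring function. The `Action` fields, the point values, and the symmetric contrast case are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Action:
    consequence: int     # hypothetical net value of the outcome; negative means harm
    intended_good: bool  # whether the actor's motive was to do good


def asymmetric_score(action: Action) -> int:
    """Quoted rule: good outcomes count only with a good motive; bad outcomes always count."""
    if action.consequence >= 0:
        return action.consequence if action.intended_good else 0
    return action.consequence


def symmetric_score(action: Action) -> int:
    """Contrast case (an assumption, not from the post): outcomes count regardless of motive."""
    return action.consequence


# Illustrative actions with invented point values.
actions = [
    Action(consequence=5, intended_good=False),  # incidental benefit
    Action(consequence=5, intended_good=True),   # deliberate benefit
    Action(consequence=-3, intended_good=True),  # well-meant harm
]

asymmetric_total = sum(asymmetric_score(a) for a in actions)
symmetric_total = sum(symmetric_score(a) for a in actions)
```

Under these made-up numbers the incidental benefit is the only entry the two systems treat differently: it earns nothing under the quoted rule, while the well-meant harm is docked in full either way.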

See also: Knobe effect. People also seem to asymmetrically judge whether your action was intentional in the first place.

In a study published in 2003, Knobe presented passers-by in a Manhattan park with the following scenario. The CEO of a company is sitting in his office when his Vice President of R&D comes in and says, ‘We are thinking of starting a new programme. It will help us increase profits, but it will also harm the environment.’ The CEO responds that he doesn’t care about harming the environment and just wants to make as much profit as possible. The programme is carried out, profits are made and the environment is harmed.

Did the CEO intentionally harm the environment? The vast majority of people Knobe quizzed – 82 per cent – said he did. But what if the scenario is changed such that the word ‘harm’ is replaced with ‘help’? In this case the CEO doesn’t care about helping the environment, and still just wants to make a profit – and his actions result in both outcomes. Now faced with the question ‘Did the CEO intentionally help the environment?’, just 23 per cent of Knobe’s participants said ‘yes’ (Knobe, 2003a).

Promoted to curated: I think there is something really important in the Copenhagen Interpretation of Ethics, and this post expands on that concept in a bunch of important ways. I've ended up referring back to it a bunch of times over the last month, and I've found that it has significantly changed my models of the global coordination landscape.

I don’t think it’s actually true that the Babylonians only had expensive housing. Architects lived with some risk of death due to their buildings falling down, just like the people who lived in houses or walked across bridges.

I am curious if that line ever actually got enforced.

I don’t think that, in practice, houses collapse all that often, or that preventing that is that expensive. So it’s more like (I’m completely guessing, I know nothing else about Babylonian architecture), there was more of an emphasis on things that don’t fall down over other properties. What you do is ban flimsy housing, but the main cost of housing lies elsewhere.

Too often we assign risk without reward.

Sometimes we assign too little risk though. Owen Cotton-Barratt made this point in Why daring scientists should have to get liability insurance. Maybe assigning too much risk is worse by frequency, but assigning too little risk is worse by expected impact. In other words a few cases of assigning too little risk, leading to increased x-risk, could easily overwhelm many cases of "assign risk without reward."

Also this post doesn't seem to go into the root causes of "Too often we assign risk without reward." which lea

...

I think of requiring scientists to get liability insurance as actually an example of the problem - a scientist that makes a breakthrough will probably capture almost none of the benefits (as a percentage of total generated surplus) even if it makes them famous and well-off. Even a full patent grant is going to be only the short-term monopoly profits.

Whereas a scientist who makes a series of trivial advances allowing publication of papers might often capture more than all of the net benefits, or there might not even be net benefits. Thus, one of several reasons for very few attempts at breakthroughs. If you allowed better capture of the upside then it would make sense to make them own more downside.

I do agree that we also have situations where the reverse happens.

The intention of the last line was, avoid using asymmetric mental point systems except where structurally necessary, and be-a-conclusion. But the intention was to inform people and give a word to a concept that I could build upon, primarily, rather than a call for action.

It is important that calls for clarity without calls for action not be seen as failures to carefully elaborate a call for action. And in fact LW explicitly favors informing over calls for action and I've had posts (correctly) not promoted to main because they were too much of a call-for-action.

2 Wei Dai 5y
I thought Owen made a good case in the podcast that we currently have more mechanisms in place to fix/workaround the "insufficient capture of the upside" problem than the "insufficient capture of the downside" problem, as far as scientific research is concerned. (See also the related paper.) I would be interested to see the two of you engage each other's arguments directly. Do you have an explanation of why we currently are often using asymmetric mental point systems when it's not structurally necessary? My general expectation is that when it comes to deficiencies in human group rationality, there are usually economic / game theoretic reasons for them to exist, so you can't fix it by saying "just don't do that".
First point: Is it worth the bandwidth to get into the weeds on this? To me, saying "we currently have mechanisms with which to solve X" matters little if X is not being solved in this way. I certainly don't see how 'put all the downside on the researcher' could possibly be matched, since you're certainly not going to give them most or all of the upside - again we don't even come close to doing that for drugs that can be sold at monopoly prices, and that's before giving everyone along the way their cuts. Second: I have at least some reasons, of varying degrees of being good reasons. The best reason I can think of for why it is good, would be that it opens the door for lots of larger manipulations, and might put even greater burdens on people to constantly point out the good things they're doing to collect all the points from them to offset where they get docked or otherwise score highly. Whereas now you only have to avoid bad things being pointed out. Or alternatively, that when people claim good things they have obvious bad incentives to do that, so you're inclined to not believe them. And that we don't have time to find all the context, and need to act on simple heuristics due to limited compute. And in some places, the willingness to *ever* do a sufficiently bad thing is very strong evidence of additional bad things, and we need to maintain a strong norm of always punishing an action to maintain a strong norm against that action. Also potentially important is that if you let things get fuzzy, those with power will use that fuzziness to enhance their own power. When needed, they'll find ways to give themselves points to offset any bad things they're caught doing. You need a way to stop this and bring them down. And so on. So in some places it becomes structurally necessary to have a no-excuses (or only local and well-specified excuses like self defense) approach. But there are entire cultural groups who use this as the generic evaluate-thing algorithm and th

So far, the adverse impact of scientific research has mostly been through enabling the construction of more powerful weapons and information-processing tools for states to use in war and similar enterprises. There's no neutral "we" to assess liability here, only the powerful actors responsible for causing the direct harms in the first place! Asking states to assign themselves yet more power by prospectively punishing scientists for thinking, without assigning some corresponding risk to the state actors or opinion-generators coming up with such proposals, doesn't seem like it could plausibly improve the relative assignment of risk and power.

What additional personal risk is Owen taking on by (implicitly) arguing for increased central control of idea-propagation, beyond that borne by innocent bystanders? This is a proposal that has already worked out very poorly for very many people in the past.

I'm not saying Owen should under current circumstances bear that risk, but I am saying that any such assignment of risk needs to be in the context of a systematic and symmetrical evaluation of risks rather than ad-hoc, if we want to have any reasonable hope that it's more helpful than harmful.

What's the model whereby a LessWrong post ought to have a "takeaway message" or "call to action"? If an argument/explanation elucidates the structure of reality in ways that are important for understanding a class of things of which the conclusion is a member, then we can't summarize the value with the conclusion! If it's not important for that sort of understanding, then it's just a soldier. It reads to me like you're complaining that Zvi's post is insufficiently mindkilled and therefore confusing. I'm perplexed by this; you've written a lot on LessWrong that's been helpful and insightful without a clear "takeaway" or single specific action implied, e.g. on decision theory. Zvi's post seems like it's in the analysis genre, where an existing commonly represented story about right action is critiqued. Pointing out common obvious mistakes, and trying to explain them and distinguish them from nearby unmistaken stories, is really important for deconfusion.

What’s the model whereby a LessWrong post ought to have a “takeaway message” or “call to action”?

I was trying to figure out what "Let us all be wise enough to aim higher." was intended to mean. It seemed like it was either a "takeaway message" (e.g., suggestion or call to action), or an applause light (i.e., something that doesn't mean anything but sounds good), hence my question.

Zvi’s post seems like it’s in the analysis genre, where an existing commonly represented story about right action is critiqued.

I guess the last sentence threw me, since it seems out of place in the analysis genre?

I also see, looking back upon it now, that this was kind of supposed to be a call for literally any action whatsoever, as opposed to striving to take as little action as possible. Or at least, I can read it like that quite easily - one needs to not strive to be the 'perfect' person in the form of someone who didn't do anything actively wrong.

Which would then be the most call-to-action of all the calls-to-action, since it is literally a Call To Action.

So, yeah. There's that. In terms of what I was thinking at the time, I'll quote my comment above: Your reaction points out a way this could be bad. By taking a call-for-clarity piece, and finishing it with a sentence that implies one might want to take action of some kind, one potentially makes a reader classify the whole thing as a call-to-action. Which is natural, since the default is to assume calls-for-clarity are failed calls-for-action, because who would bother calling for clarity? Doesn't seem worth one's time. Which means that such things might indeed be quite bad, and to be avoided. If people end up going 'oh, I'm being asked to do less X' and therefore forget about the model of X being presented, that's a big loss. The cost is twofold, then: 1. It becomes harder to form a good ending. You can't just delete that line without substituting another ending. 2. If we can't put an incidental/payoff call to implied action into an analysis piece, then the concrete steps this suggests won't get taken. People might think 'this is interesting' but not know what to do with it, and thus discard the presented model as unworthy of their brain space. Which means this gets pretty muddled and it's not obvious which way this should go.
6 Wei Dai 5y
My main complaint was that I just couldn't tell what you were trying to say with the current ending. If you're open to suggestions, I'd replace the last few lines with something like this instead: If this analysis is correct, it suggests that we should avoid using asymmetric mental point systems except where structurally necessary. For example, the next time you're in situation ..., consider doing ... instead of ... ETA: It's not clear to me whether you're saying A) people ought to keep this model in their brain even if there was no practical implication, but they won't in practice, so you had to give them one, or B) it's reasonable to demand a practical implication before making space for a model in one's brain which is why you included one in the post. If the latter, it seems like the practical implication / call to action shouldn't just be incidental, but significant space should be devoted to spelling it out clearly and backing it up by analysis/argument, so that if it was wrong it could be critiqued (which would allow people to discard the model after all).
Good point - that rhetorical flourish implies a call to action when there isn’t one.
Fwiw – I read this post, and thought "hmm, this post does two things – it puts forth some fairly concrete models (which are interesting independent of any call to action). It also puts forth... some kind of vague call to action, which includes a bit more rhetoric than I'm comfortable with, but not so much more that I think it shouldn't be frontpaged given that the models seem straightforwardly good." So... basically I didn't come away from this post with a call-to-action, I just came away with a useful handle for how to think about one aspect of justice, which I'll have in mind as I go around thinking about justice.

Fine. I'm convinced now. The line has been replaced by a summary-style line that is clearly not a call to action.

The pattern seems to be, if one spends 1600 words on analysis, then one sentence suggesting one might aim to avoid the mistakes pointed out in the analysis, then one is viewed as "doing two things" and/or being a call to action, and then is guilty if the call-to-action isn't sufficiently well specified and doesn't give concrete explicit paths to making progress that seem realistic and to fit people's incentives and so on?

Which itself seems like several really big problems, and an illustration of the central point of this piece!

Call to action, and the calling thereof, is an action, and thus makes one potentially blameworthy in various ways for being insufficient, whereas having no call to action would have been fine. You've interacted with the problem, and thus by CIE are responsible for not doing more. So one must not interact with the problem in any real way, and ensure that one isn't daring to suggest anything get done.

I'm viewing this thing more through the lens of Tale of Alice Almost, where there's a legitimate hard question of "what should be incentivized on LessWrong", which depends a lot on what the average skills and tendencies of the typical LessWrong user are, as well as what the skills/tendencies of particular users are. Longterm, there's a quite high bar we want LessWrong to be aspiring to. Because newcomers frequently arrive at LessWrong who won't yet have a bunch of skills, there need to be some fairly simple guidelines to get them started (allowing them to get positively rewarded for contributing). But I do want the tail end of users to also have incentive to continue to improve. Because I'm not 100% sure what the right collection of skills and norms for LessWrong to encourage are, there also needs to be an incentive for the collective culture (and the mod team in particular) to improve our understanding of "what things should be incentivized" so we don't get stuck in a weird lost purpose. (If the current mod team got hit by a truck and new people took over and tried to implement our "no calls to action on frontpage" rule without understanding it, I predict they wouldn't get the nuances right). Posts by Zvi are reliably much more interesting to me than the average post, tackling issues that are thorny with interesting insight that I respect quite a bit. If the collection of incentives we had resulted in Zvi posting less, that would be quite bad. But Zvi posts also tend to include a particular kind of rhetorical flourish that feels out of place for LessWrong – it feels like I'm listening to a political rally. So a) I don't want new users to internalize that style as something they should emulate (part of what the frontpage is for), and b) I genuinely want the frontpage to be a place where people can engage with ideas without feeling incentivized to think about those ideas through the lens of "how is this affecting the social landscape?" (this is not because


I did change the post on the blog as well, not only the LW version, to the new version. This wasn't a case of 'I shouldn't have to change this but Raemon is being dense' but rather 'I see two of the best people on this site focusing on this one sentence in massively distracting ways so I'm clearly doing something wrong here' and reaching the conclusion that this is how humans read articles so this line needs to go. And indeed, to draw a clear distinction between the posts where I am doing pure model building, from the posts with action calls.

I got frustrated because it feels like this is an expensive sacrifice that shouldn't be necessary. And because I was worried that this was an emergent pattern and dilemma against clarity, where if your call to clarity hints at a call to action people focus on the call to action, and if you don't call to action then people (especially outside of LW) say "That was nice and all but you didn't tell me what to do with that so what's the point?" and/or therefore forget what was said. And the whole issue of calls to action vs. clarity has been central to some recent private discussio...

2 Wei Dai 5y
When did this rule come into effect and where is it written down? The closest thing I can find in Frontpage Posting and Commenting Guidelines is: Which seems pretty far from “no calls to action on frontpage” and isn't even in the "Things to keep to a minimum" or "Off-limits things" section. (If I had been aware of this rule and surrounding discussions about it, maybe I would have been more sensitive about "accusing" someone of making a call to action, which to be clear wasn't my intention at all since I didn't even know such a rule existed.)
I think the phrase "call to action" might get used internally more than externally (although I have a blogpost brewing that delves into it a bit, as well as another phrase "call to conflict.") But a phrase used in both our Frontpage Commenting guidelines, and on the tooltip for when you mark a post as 'allow moderators to promote' is 'aim to explain, not persuade', where calls to action are a subset of persuading. (Note that both of those site-elements might not appear on GreaterWrong. I think GreaterWrong also doesn't really have the frontpage distinction anyhow, instead just showing all new posts in order of appearance)
I actually think the "aim to explain, not persuade" framing is generally clearer than the "no call to action" framing. Like, if you explain something to someone that strongly implies some action, then some people might call that a "call to action" but I would think that's totally fine.
6 Wei Dai 5y
Agreed. And I think I was implicitly focusing on whether the post gave a sufficient explanation for its (original) conclusion, and was rather confused why others were so focused on whether there was a call to action or not (which without knowing the context of your private discussions I just interpreted to mean any practical suggestion)
So, this post has netted for Zvi a few hundred karma, which SEEMS to be encouraging the right thing. Even with some confusion and controversy, it's clearly positive value. I apologize for my asymmetric commenting style, especially if my focus on points of disagreement makes it seem like I don't value the topic and everyone's thoughts on it. I want to ask about your dual preferences: you want high-quality as an absolute and you want people to improve from their current capabilities, as a relative. Are there different ways of encouraging these two goals, or are they integrated enough that you think of them as the same?
No need to apologize for focusing on points of disagreement. And I'm grateful for the commentary and confusion, because it pointed to important questions about how to have good discourse and caused me to notice something I do frequently that is likely a mistake. It's like finally having an editor, in the good way. I'm not on the moderation team, but my perspective is that the two goals overlap and are fully compatible but largely distinct and need to be optimized for in different ways (see Tale of Alice Almost). And this is the situation in which you get a conflict between them, because norms are messy and you can't avoid what happens in hard mode threads bleeding into other places.
4 Wei Dai 5y
Part of my complaint was that the models didn't seem to include enough gears for me to figure out what I could do to make things better. The author's own conclusion, which he later clarified in the comments, seems to be that we should individually do less of the thing that he suggests is bad. But my background assumption is that group rationality problems are usually coordination problems so it usually doesn't help much to tell people to individually "do the right thing". That would be analogous to telling players in PD to just play cooperate. At this point I still don't know whether or why the author's call to action would work better than telling players in PD to just play cooperate.
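For readers who want the PD analogy spelled out, here is a minimal sketch of a one-shot prisoner's dilemma. The payoff numbers are the conventional textbook values, assumed purely for illustration; nothing in the thread specifies them, and `best_response` is a hypothetical helper.

```python
# Row player's payoff for (my_move, their_move).
# "C" = cooperate, "D" = defect. Values are illustrative textbook numbers.
PAYOFF = {
    ("C", "C"): 3,  # mutual cooperation
    ("C", "D"): 0,  # I cooperate, they defect
    ("D", "C"): 5,  # I defect, they cooperate
    ("D", "D"): 1,  # mutual defection
}


def best_response(their_move: str) -> str:
    """Return the move that maximizes my own payoff against a fixed opponent move."""
    return max("CD", key=lambda my_move: PAYOFF[(my_move, their_move)])


# Defecting is individually better whatever the other player does...
defect_dominates = all(best_response(their_move) == "D" for their_move in "CD")

# ...even though both players would prefer mutual cooperation to mutual defection.
mutual_cooperation_better = PAYOFF[("C", "C")] > PAYOFF[("D", "D")]
```

With these assumed payoffs, a player who follows the advice to "just play cooperate" while the other does not ends up with the worst outcome, which is the sense in which individual exhortation does not resolve this kind of game.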

I am confused why it is unreasonable to suggest to people that, as a first step to correcting a mistake, that they themselves stop making it. I don't think that 'I individually would suffer so much from not making this mistake that I require group coordination to stop making it' applies here.

And in general, I worry that the line of reasoning that goes "group rationality problems are usually coordination problems so it usually doesn't help much to tell people to individually 'do the right thing'" leads (as it seems to be doing directly in this case) to the suggestion that now it is unreasonable to suggest someone might do the right thing on their own in addition to any efforts to make that a better plan or to assist with abilities to coordinate.

I'd also challenge the idea that only the group's conclusions on what is just matter, or that the goal of forming conclusions about what is just is to reach the same conclusion as the group, meaning that justice becomes 'that which the group chooses to coordinate on.' And where one's cognition is primarily about figuring out where the coordination is going to land, rather than what...

I am confused why it is unreasonable to suggest to people that, as a first step to correcting a mistake, that they themselves stop making it.

My reasoning is that 1) the problem could be a coordination problem. If it is, then telling people to individually stop making the mistake does nothing or just hurts the people who listen, without making the world better off as a whole. If it's not a coordination problem, then 2) there's still a high probability that it's a Chesterton's fence, and I think your post didn't do enough to rule that out either.

now it is unreasonable to suggest someone might do the right thing on their own in addition to any efforts to make that a better plan or to assist with abilities to coordinate

Maybe my position is more understandable in light of the Chesterton's fence concern? (Sorry that my critique is coming out in bits and pieces, but originally I just couldn't understand what the ending meant, then the discussion got a bit side-tracked onto whether there was a call to action or not, etc.)

I’d also challenge the idea that only the group’s conclusions on what is just matter, or that the goal of forming conclusions about what is just is to reach the s

... (read more)

As I noted in my other reply, on reflection I was definitely overly frustrated when replying here and it showed. I need to be better about that. And yes, this helps understand where you're coming from.

Responding to the concerns:

1) It is in part a coordination problem - everyone gets benefits if there is agreement on an answer, versus disagreement among two equally useful/correct potential responses. But it's certainly not a pure coordination problem. It isn't obvious to me if, given everyone else has coordinated on an incorrect answer, it is beneficial or harmful to you to find the correct answer (let's ignore here the question of what answer is right or wrong). You get to get your local incentives better, improve your map and understanding, set an example that can help people realize they're coordinating in the wrong place, people you want to be associating with are more inclined to associate with you (because they see you taking a stand for the right things, and would be willing to coordinate with you on the new answer, and on improving maps and incentives in general, and play fewer games that are primarily about coordination and political group dynamics...)... (read more)

(There is a closing quote missing in the second paragraph of this comment, which caused me to be quite confused reading that paragraph)
Part of my complaint was that the models didn't seem to include enough gears for me to figure out what I could do to make things better.

I do think it's fine to discuss models that represent reality accurately, while not knowing what action-relevant implications they might have eventually. A lot of AI-Alignment related thinking is not really suggesting many concrete actions to take, besides "this seems like a problem, no idea what to do about it".

I do not think we have no idea what to do about it. Creating common knowledge of a mistake, and ceasing to make that mistake yourself, are both doing something about it. If the problem is a coordination game then coordination to create common knowledge of the mistake seems like the obvious first move.

4Wei Dai5y
I think this is fine if made clear, but the post seemed to be implying (which the author later confirmed) that it did offer action-relevant implications.
FWIW, in slightly different words than my last comment, I agree with this criticism of this post.

NB: the link to the original blog on the Copenhagen Interpretation of Ethics is now broken and redirects to a shopping page.

Pretty much the best thing ever.

This post seems helpful in that it expands on the basic idea of the Copenhagen Interpretation of Ethics, and when I first read it, it was modestly impactful to me, though it was mostly a way to reorganize what I already knew from the examples that Zvi uses.

It seems to be very accurate and testable, through simple tests of moral intuitions? 

I would like to see more expansion on the conditions that get normal people out of this frame of mind, on surprising places where it pops up, and on realistic incentive design that can be used personally to keep this from happening in your brain.

Robin Hanson's Taboo Gradations (which was written after this post) seems related in that it's also about a non-linearity in our mental accounting system for social credit/blame. Might be a good idea to try to build a model that can explain both phenomena at the same time.

Robin seems to have run smack into the reasonably obvious "slavery is bad, so anything that could be seen as justifying slavery, or excusing slavery, is also bad to say even if true" thing. It's not that he isn't sincere, it's that it seems like he should have figured this one out by now. I am confused by his confusion, and wish he'd spend his points more efficiently.

The Asymmetric Justice model whereby you are as bad as the worst thing you've done would seem to cover this reasonably well at first glance - "Owned a slave" is very bad, and "Owned a slave but didn't force them into it" doesn't score a different number of points because "Owned a slave" is the salient biggest bad in addition to or rather than "Forced someone into slavery."
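
As a rough illustration of the "as bad as the worst thing you've done" rule, here is a minimal sketch contrasting it with a symmetric tally. The point values and function names are hypothetical and chosen only to make the arithmetic visible; nothing here is a scoring system proposed in the post.

```python
# Hypothetical point values: negative numbers are bad acts, positive are good.

def symmetric_score(deeds):
    """Symmetric tally: every act counts, good and bad alike."""
    return sum(deeds)


def asymmetric_score(deeds):
    """Asymmetric rule: good acts are ignored and the verdict is set by
    the single worst bad act (0 if there are no bad acts)."""
    return min((d for d in deeds if d < 0), default=0)


if __name__ == "__main__":
    worst_act_only = [-100]              # the salient biggest bad
    worst_act_plus_lesser = [-100, -80]  # same, plus a lesser bad act

    print(symmetric_score(worst_act_only), symmetric_score(worst_act_plus_lesser))
    print(asymmetric_score(worst_act_only), asymmetric_score(worst_act_plus_lesser))
```

Under the asymmetric rule both cases get the same verdict, because the lesser act never changes the minimum. Under the symmetric tally they differ, so only there does pointing out a mitigating fact move the score.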

There's also the enrichment that, past a certain point, things just get marked as 'evil' or 'bad' and in many contexts, past that point, it doesn't matter, because you score points by condemning them and are guilty alongside them if you defend them, and pointing out truth counts as defending, and lies or bad arguments against them count as condemning. But that all seems... elementary? Is any of this non-obvious? Actually asking.

9Ben Pace5y
Pretty sure the only interesting thing here is Twitter and how it puts different cultures with different ideas of what counts as a norm violation into a big room with each other and how this doesn’t lead to tolerance but instead leads to interminable anger and slap-downs, due to enough people thinking their own norms are ‘obvious’ and not ‘optimised for a particular environment’. Friend groups and scientists and journalists and businesspeople applying their areas’ norms to each other 100% of the time? Ugh.

I think I _DO_ subscribe to a version of the Copenhagen Interpretation of Ethics. You are (and each agent is) responsible* for everything you/they perceive. Whatever situation you find yourself in, and whatever actions (including inaction) you take, you will feel some reflection of the pain you perceive in others, and that is the primary consequence (for you) of your choices (or rather, situation + choices - they're not easily separated).

I do use "responsible" in a much more limited way than many advocates of the concept of "justice"... (read more)

I think what you are pointing at is more heroic responsibility, unless you think that being unaware of something by choice actually lets you off the hook. I'm guessing you think it doesn't? If you think it does then say more. The Good Place's ability to assign (at least in my book) shockingly accurate point totals to actions is the best case for the existence of objective morality I've ever seen, but yes we're all fully aware it is fiction. I'm using it as a way to illustrate a mode of thinking, and to recommend a great show, nothing more.
If one is even slightly curious about the world, it's very hard to be unaware of suffering by choice. I don't have much of a theory of morality for the un-curious. And I do include "reasonably inferred" as suffering that you will share in your perception, so deniability doesn't let you off the hook (and my version isn't about others' judgement of your reasons anyway, it's about your actual experiences and choices).
Note that this does imply that I bite a pretty large bullet: I am probably a deka-hitler, possibly more. I'm also some fraction of a Salk. These are different dimensions, so don't cancel out - I have to live with the knowledge of all the suffering I haven't alleviated, even while feeling some relief from the good I've done.

This pointed out a fallacy in my own (subconscious) thinking, and inspired me to correct it.

I would suggest that this is ameliorated by the following:

  1. Nobody actually believes that you are to blame for every bad consequence of things you do, no matter how indirect. A conscientious person is expected to research and know some of the indirect consequences of his actions, but this expectation doesn't go out to infinity.

  2. While you don't get credit for unintended good consequences in general, you do get such credit in some situations. Specifically, if the good consequence is associated with a bad consequence, you are allowed to get credit for th

... (read more)

>The symmetric system is in favor of action.

This post made me think how much I value the actions of others, rather than just their omissions. And I have to conclude that the actions I value most in others are the ones that *thwart* actions of yet other people. When police and military take action to establish security against entities who would enslave or torture me, I value it. But on net, the activities of other humans are mostly bad for me. If I could snap my fingers and all other humans dropped dead (became inactive), I would instrumentally be bette... (read more)

This post identifies an interesting facet of how most people's conception of justice works.

If saving nine people from drowning did give one enough credits to murder a tenth, society would look a lot more functional than it currently is. What sort of people would use this mechanism?

1) You are a competent good person, who would have gotten the points anyway. You push a fat man off a bridge to stop a runaway trolley. The law doesn't see that as an excuse, but lets you off based on your previous good work.

2) You are selfish, you see some action that wouldn't cause too much harm to others, and would enrich yourself greatly (It's harmful enough... (read more)

Responding to old post:

  1. You are a competent good person who would have gotten the points anyway. But since you are not immune to human error despite being a generally competent person, you do something which you perceive as necessary for the general good, but which actually, on the balance of things, causes harm. The law lets you off for this based on your good work. It's too easy to be a "good person" in general but prone to bias in a small area.

  2. You are selfish in some way that doesn't pattern-match to "selfish about every single thing", so you would do good regardless of the law, but the law means you can also do some evil. I can imagine a doctor who would heal people for no reward other than his salary, but who might get stressed or frustrated and hurt people if he could do so without consequences. Or a white supremacist who would help fellow white people regardless of whether it benefitted him personally, but who might also beat up a couple of minorities on the side if the law permitted it.
2Donald Hobson2y
Fair point. 

After some thought, I think my main objection (or at least concern - it's not really objectionable) to this line of thought is that it's analyzing a very small part of one's utility function. I don't know if it's more important to most than to me, but I care only a little bit about point systems and current outrage culture. My friends and coworkers don't seem to follow the pattern you describe either - they seem to like me regardless of whether I'm touching and not solving hard problems, or just playing games with them.... (read more)

If the carpenter’s son is executed when the house they built falls down and kills someone’s son, as in the Code of Hammurabi, well, that’s one way to ban inexpensive housing.

I thought the bridge example captured the problem of price very well, but this one seems different to me because it seems like it effectively advocates for houses falling down on people. The Code of Hammurabi is famously and literally symmetric, a strong example of lex talionis. If killing someone's son does not cause the carpenter to lose his, what does symmetric justice suggest?

I'm actually going to remove the example as unneeded, as it's caused two distinct comments one of which pointed out it's not working right and one of which challenged its assumptions. It's a distraction that isn't worth it, and a waste of space. So thank you for pointing that out.

To respond directly, one who takes on a share of tail risk needs to enjoy a share of the generic upside, so the carpenter would get a small equity stake in the house if this was a non-trivial risk. Alternatively, we could simply accept a small distortion in the construction of houses in favor of being 'too safe' and favoring carpenters who don't have children. Or we could think this punishment is simply way too large compared to what is needed to do the job.

This helps a lot - I think that more explicit emphasis on risk and reward needing to be symmetric in both type and shape in addition to magnitude would help a lot. Edit: would help a lot for the symmetric justice argument, I should have said. Although a casual introspective review of my conversations about risk says it would be a good idea for all such discussions. I will develop a habit of being explicit about the type and shape (which is to say distribution) of risks moving forward.

But what if A works with B and sees that B didn't go all the way they could to solve a problem? It happens all the time. CIE doesn't force A to peck B's brains out for acting badly; A is under no obligation to hand out punishment - at least if they do work together.

I'm not quite sure how I want to react here. Clearly there are some important aspects and a good intellectual inquiry and analysis will offer insights. On the other side I have this whisper in the back of my mind saying "Isn't a lot of this too much like the 'how many angels can dance on a pin head' discussion?" (Note, this is from reading the post and some comments -- not the recommended source link... but that is inaction so I should be safe, right ;-)

On a more serious note (but feeding into the pin head aspect I think) I don't see h... (read more)