Consequentialism traditionally doesn't distinguish between acts of commission and acts of omission. Not flipping the lever to the left is equivalent to flipping it to the right.

But there seems to be one clear case where the distinction is important. Consider a moral learning agent. It must act in accordance with human morality and desires, about which it is currently uncertain.

For example, it may consider whether to forcibly wirehead everyone. If it does so, then everyone will agree, for the rest of their existence, that the wireheading was the right thing to do. Therefore, across the whole future span of human preferences, humans agree that wireheading was correct, apart from a very brief period of objection in the immediate future. Given that human preferences are known to be inconsistent, this seems to imply that forcible wireheading is the right thing to do (if you happen to personally approve of forcible wireheading, replace that example with some other forcible rewriting of human preferences).

What went wrong there? Well, this doesn't respect "conservation of moral evidence": the AI got the moral values it wanted, but only through the actions it took. This is very close to the omission/commission distinction. We'd want the AI to not take actions (commission) that determine the (expectation of the) moral evidence it gets. Instead, we'd want the moral evidence to accrue "naturally", without interference and manipulation from the AI (omission).
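
As a rough illustration of the point above, here is a minimal Python sketch. It assumes the learner updates by Bayes' rule on a single binary observation ("humans later approve"), and every name and number in it is hypothetical, chosen only for the example. When the evidence accrues on its own, the expected posterior equals the prior. When the agent's action forces approval, a learner that keeps using its old likelihoods expects its credence to rise, which is exactly the failure described.

```python
def posterior(prior, like_h, like_not_h):
    """Bayes' rule: P(H | observation) from the prior and both likelihoods."""
    return prior * like_h / (prior * like_h + (1 - prior) * like_not_h)


def expected_posterior(prior, p_approve_h, p_approve_not_h, p_approve_actual=None):
    """Expected credence in H after observing approval or disapproval.

    The learner always updates with its own model (p_approve_h, p_approve_not_h).
    If p_approve_actual is None, approval occurs with the probability that
    model predicts (evidence accrues "naturally"). Otherwise the agent's action
    fixes the real approval probability (hypothetical manipulation).
    """
    p_model = prior * p_approve_h + (1 - prior) * p_approve_not_h
    p_real = p_model if p_approve_actual is None else p_approve_actual
    post_if_approve = posterior(prior, p_approve_h, p_approve_not_h)
    post_if_disapprove = posterior(prior, 1 - p_approve_h, 1 - p_approve_not_h)
    return p_real * post_if_approve + (1 - p_real) * post_if_disapprove


# Hypothetical numbers: H = "forcible wireheading is right".
PRIOR = 0.1
natural = expected_posterior(PRIOR, 0.9, 0.2)
# Wireheading guarantees later approval whether or not H is true.
manipulated = expected_posterior(PRIOR, 0.9, 0.2, p_approve_actual=1.0)

print(f"prior={PRIOR}, natural={natural:.3f}, manipulated={manipulated:.3f}")
```

In this toy model the natural case leaves the expected credence at the prior, while the manipulated case raises it without any real evidence about H. A learner that respected conservation of moral evidence would treat forced approval as uninformative.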

Hm. I don't think this is a (c)omission issue and I don't really see what it has to do with morality. What you are describing is what might be called "judge hacking" -- modifying the entity responsible for determining success so that it declares success regardless of what happens (or, in a slightly more general case, changing the success criteria).

I'm not sure the commission/omission distinction is really the key here. This becomes clearer by inverting the situation a bit:

Some third party is about to forcibly wirehead all of humanity. How should your moral agent reason about whether to intervene and prevent this?

That's interesting - basically here we're trying to educate an AI into human values, but human values are going to swiftly be changed to something different (and bad from our perspective).

I think there's no magical solution - either we build a FAI properly (which is very, very hard), and it would stop the third party, or we have an AI that we value-load, and we try to prevent our values from changing while that's happening.

The omission/commission thing applies to value loading AIs, not to traditional FAI. But I admit it's not the best analogy.

That doesn't work, because not wireheading humanity is not the same as doing it and has different implications, and whether it was right or wrong, as you say, won't matter once it is done. Whereas if you decide to stop someone from doing that, NOT stopping them is morally worse (ostensibly) than agreeing with it. Inaction is itself an action. If there is a wrong occurring, deciding not to stop it is just as bad as doing it. However, choosing not to do something bad is not the same as doing it.

I'm arguing against some poorly thought-out motivations, e.g. "don't do anything that most people would disagree with most of the time". This falls apart if you can act to change what "people would disagree with" through some means other than preference satisfaction.

That is all fine and well, but that is beyond the scope of moral statements. That's just thinking about why people make decisions and whether those decisions are good or bad. If you were Buddhist you might not do that because you don't want your karma mixed up with the event going on, and so on.

In what contexts, other than AI, might this approach apply?

Teaching children? It strikes me that we sometimes want our kids to make a decision, or announce a decision, so that we can see what their decision would be.

Government prosecutions? It strikes me that the trend of 99%+ of cases being decided by way of a plea deal obscures both the evidence of wrongdoing and the method for judging it wrong.

Max L.