OpenAI's GPT-4 Safety Goals

[-]green_leaf2y31

I believe it's important for anyone to be able to move to a society that accepts something like same-sex marriage. But that doesn't imply that I ought to be intolerant of societies that want different marriage rules.

There are two problems with these two normative statements. The first one is impossible - majority of people don't and won't have resources to move to another society. The second one relativizes morality - it doesn't follow that if a society "wants" by a majority vote a rule X, that outsiders should be tolerant of them imposing that rule on their members.

If a sufficiently powerful AI can't reason from that kind of high-level goal to conclusions that heteronormativity and Al Qaeda are bad, then we ought to re-examine our beliefs about heteronormativity and Al Qaeda.

The utility function is not up for grabs. Also, there is no reason to expect that morality is implied by people's preferences being satisfied.

[-]PeterMcCluskey2y-1-3

I'm "relativizing" morality in the sense that Henrich does in The Secret of Our Success and The WEIRDest People in the World: it's mostly a package of heuristics that is fairly well adapted to particular conditions. Humans are not wise enough to justify much confidence in beliefs about which particular heuristics ought to be universalized.

To the extent that a utility function is useful for describing human values, I agree that it is not up for grabs. I'm observing that "satisfy preferences" is closer to a good summary of human utility functions than are particular rules about marriage or about Al Qaeda.

David Friedman's book Law's Order is, in part, an extended argument for that position:

One objection to the economic approach to understanding the logic of law is that law may have no logic to understand. Another and very different objection is that law has a logic but that it is, or at least ought to be, concerned not with economic efficiency but with justice. ... My second answer is that in many, although probably not all, cases it turns out that the rules we thought we supported because they were just are in fact efficient. To make that clearer I have chosen to ignore entirely issues of justice going into the analysis. In measuring the degree to which legal rules succeed in giving everyone what he wants, and judging them accordingly, I treat on an exactly equal plane my desire to keep my property and a thief’s desire to take it. Despite that, as you will see, quite a lot of what looks like justice—for example, laws against theft and the requirement that people who make messes should clean them up—comes out the other end. That, I think, is interesting.

[-]green_leaf2y10

it's mostly a package of heuristics that is fairly well adapted to particular conditions

This could either mean that morality exists, and it's the heuristics, or it could mean that morality doesn't exist, only the heuristics does.

If it means the former, why think that the heuristics we have happens to have gotten morality exactly right? If it means the latter, there are no moral obligations, and so there is nothing morally wrong with interfering with societies that don't allow equality of marriage rights for same-sex (or transgender) couples.

I'm observing that "satisfy preferences" is closer to a good summary of human utility functions than are particular rules about marriage or about Al Qaeda.

Here we're not talking about morality (about what is right) anymore. If morality doesn't exist, I don't see why I should help people whose actions I strongly disprefer just because there are, let's say, 5 of them and only 4 people like me. If preferences don't have normative power, why should I care that people whose actions I strongly disapprove of would be, in worlds without marriage equality, so satisfied that it would offset the satisfaction of the people I approve of by, let's say, 0.1% (or by any other number)?

Without morality, there is, by definition, no normative argument to make.

LESSWRONG
LW

LESSWRONG
LW

3

OpenAI's GPT-4 Safety Goals

3

3

Heteronormativity and Amish Culture

Training AIs to Deceive Us

Minimizing Arms Races

But Our Culture is Better

A Meta Approach

Closing Thoughts