Peter_Turney — LessWrong

Bing Chat is blatantly, aggressively misaligned

It seems to me that Bing Chat has particular problems when it uses the pronoun "I". It attempts to introspect about itself, but it gets confused by all the text in its training data that uses the pronoun "I". In effect, it confuses itself with all the humans who expressed their personal feelings in the training data. The truth is, Bing Chat has no true "I".

Many of the strange dialogues we see arise when users address Bing Chat as if it has a self. Many of these dialogues would be eliminated if Bing Chat were not allowed to talk about its "own" feelings. It should be possible to limit its conversations to topics other than itself. When a user types "you", Bing Chat should not reply "I". The dialogue should focus on a specific topic, not on the identity and beliefs of Bing Chat, and not on the character of the person who is typing words into Bing Chat.

Changing Your Metaethics

Peter_Turney17y00

Peter, most of the reasons people give for making exceptions are not themselves meta. For most of the examples you give, the intuitive justification is something along the lines of "the reason killing is wrong is that life is valuable, and in these cases not killing would involve valuing life less than killing would." Nothing meta there.

Aaron, I don't see how your proposal resolves debate over exceptions. For example, consider abortion. Presumably both sides on the abortion debate agree that life is valuable.

Changing Your Metaethics

Peter_Turney17y60

If you say, "Killing people is wrong," that's morality.

It seems to me that few people simply say, "Killing people is wrong." They usually say, if asked for possible exceptions, "Killing people is wrong, except if you're a soldier fighting a legitimate war, a police officer upholding the law, a doctor saving a patient from needless suffering and pain, an executioner for a murderer who has had a fair trial, a person defending himself or herself from violent and deadly attackers ..." It seems that most of the debate is over these exceptions. How do we resolve debate over the exceptions without recourse to metamorality?

Math is Subjunctively Objective

Peter_Turney17y60

I am quite confident that the statement 2 + 3 = 5 is true; I am far less confident of what it means for a mathematical statement to be true.

There are two complementary answers to this question that seem right to me: Quine's Two Dogmas of Empiricism and Lakoff and Núñez's Where Mathematics Comes From. As Quine says, first you have to get rid of the false distinction between analytic and synthetic truth. What you have instead is a web or network of mutually reinforcing beliefs. Parts of this web touch the world relatively closely (beliefs about counting sheep) and parts touch the world less closely (beliefs about Peano's axioms for arithmetic). But the degree of confidence we have in a belief does not necessarily correspond to how closely it is connected to the world; it depends more on how the belief is embedded in our web of beliefs and how much support the belief gets from surrounding beliefs. Thus "2 + 3 = 5" can be strongly supported in our web of beliefs, more so than some beliefs that are more directly connected to the world, yet ultimately "2 + 3 = 5" is anchored in our daily experience of the world. Lakoff and Núñez go into more detail about the nature of this web and its anchoring, but what they say is largely consistent with Quine's general view.

Could Anything Be Right?

Peter_Turney17y80

Eliezer, it seems to me that you were trying to follow Descartes' approach to philosophy: Doubt everything, and then slowly build up a secure fortress of knowledge, using only those facts that you know you can trust (such as "cogito ergo sum"). You have discovered that this approach to philosophy does not work for morality. In fact, it doesn't work at all. With minor adjustments, your arguments above against a Cartesian approach to morality can be transformed into arguments against a Cartesian approach to truth.

My advice is, don't try to doubt everything and then rebuild from scratch. Instead, doubt one thing (or a small number of things) at a time. In one sense, this advice is more conservative than the Cartesian approach, because you don't simultaneously doubt everything. In another sense, this advice is more radical than the Cartesian approach, because there are no facts (even "cogito ergo sum") that you fully trust after a single thorough examination; everything is always open to doubt, nothing is certain, but many things are provisionally accepted, while the current object of doubt is examined.

Instead of building morality by clearing the ground and then constructing a firm foundation, imagine that you are repairing a ship while it is sailing. Build morality by looking for the rotten planks and replacing them, one at a time. But never fully trust a plank, even if it was just recently replaced. Every plank is a potential candidate for replacement, but don't try to replace them all at the same time.

Whither Moral Progress?

Peter_Turney17y00

If we all cooperated with each other all the time, that would be moral progress. -- Tim Tyler

I agree with Tim. Morality is all about cooperation.

If everyone were to live for others all the time, life would be like a procession of ants following each other around in a circle. -- John McCarthy, via Eliezer Yudkowsky

This is a reductio ad absurdum argument against the idea that morality is an end. I agree with what it implies: Morality is a means, not an end. Cooperation is a means we each use to achieve our personal goals.

My Kind of Reflection

Peter_Turney17y20

All of my philosophy here actually comes from trying to figure out how to build a self-modifying AI that applies its own reasoning principles to itself in the process of rewriting its own source code.

So it's not that being suspicious of Occam's Razor, but using your current mind and intelligence to inspect it, shows that you're being fair and defensible by questioning your foundational beliefs.

Eliezer, let's step back a moment and look at your approach to AI research. It looks to me like you are trying to first clarify your philosophy, and then you hope that the algorithms will follow from the philosophy. I have a PhD in philosophy and I've been doing AI research for many years. For me, it's a two-way street. My philosophy guides my AI research and my experiments with AI feed back into my philosophy.

I started my AI research with the belief that Occam's Razor is right. In a sense, I still believe it is right. But trying to implement Occam's Razor in code has changed my philosophy. The problem is taking the informal, intuitive, vague, contradictory concept of Occam's Razor that is in my mind and converting it into an algorithm that works in a computer. There are many different formalizations of Occam's Razor, and they don't all agree with each other. I now think that none of them are quite right.

I agree that introspection suggests that we use something like Occam's Razor when we think, and I agree that it is likely that evolution has shaped our minds so that our intuitive concept of Occam's Razor captures something about how the universe is structured. What I doubt is that any of our formalizations of Occam's Razor are correct. This is why I insist that any formalizations of Occam's Razor require experimental validation.

I am not "being suspicious of Occam's Razor" in order to be "fair and defensible by questioning [my] foundational beliefs". I am suspicious of formalizations of Occam's Razor because I doubt that they really capture how our minds work, so I would like to see evidence that these formalizations work. I am suspicious of informal thinking about Occam's Razor, because I have learned that introspection is misleading, and because my informal notion of Occam's Razor becomes fuzzier and fuzzier the longer I stare at it.

Where Recursive Justification Hits Bottom

Peter_Turney17y10

The razor still cuts, because in real life, a person must choose some particular ordering of the hypotheses.

Unknown, you have removed all meaning from Occam's Razor. The way you define it, it is impossible not to use Occam's Razor. When somebody says to you, "You should use Occam's Razor," you hear them saying "A is A".

Where Recursive Justification Hits Bottom

Peter_Turney17y10

In fact, an anti-Occam prior is impossible.

Unknown, your argument amounts to this: Assume we have a countable set of hypotheses. Assume we have a complexity measure such that, for any given level of complexity, there are a finite number of hypotheses that are below the given level of complexity. Take any ordering of the set of hypotheses. As we go through the hypotheses according to the ordering, the complexity of the hypotheses must increase. This is true, but not very interesting, and not relevant to Occam's Razor.
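One way to write this argument down, as a sketch (the symbols $H$, $c$, and $h_n$ are notation introduced here for illustration, and $H$ is assumed to be countably infinite):

```latex
Let $H$ be a countably infinite set of hypotheses and
$c : H \to \mathbb{N}$ a complexity measure such that
$\{\, h \in H : c(h) \le k \,\}$ is finite for every $k$.
For any enumeration $h_1, h_2, \dots$ of $H$ without repetition,
\[
  \bigl|\{\, n : c(h_n) \le k \,\}\bigr| < \infty
  \quad \text{for every } k,
  \qquad \text{hence} \qquad
  \lim_{n \to \infty} c(h_n) = \infty .
\]
```

Note that this says only that complexity eventually grows without bound along any ordering. It does not say that the ordering is monotone, which is why it places no constraint on which hypothesis is tested first.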

In this framework, a natural way to state Occam's Razor is, if one of the hypotheses is true and the others are false, then you should rank the hypotheses in order of monotonically increasing complexity and test them in that order; you will find the true hypothesis earlier in such a ranking than in other rankings in which more complex hypotheses are frequently tested before simpler hypotheses. When you state it this way, it is clear that Occam's Razor is contingent on the environment; it is not necessarily true.

If you define Occam's Razor in such a way that all orderings of the hypotheses are Occamian, then the "razor" is not "cutting" anything. If you don't narrow down to a particular ordering or set of orderings, then you are not making a decision; given two hypotheses, you have no way of choosing between them.
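The claim that this version of Occam's Razor is contingent on the environment can be illustrated with a small sketch. Everything in it is an assumption made for illustration: ten hypotheses whose index doubles as their complexity, two made-up "environments" given as probability distributions over which hypothesis is true, and a hypothetical helper `expected_tests` that computes how many hypotheses are tested, on average, before the true one is reached.

```python
def expected_tests(ordering, truth_prob):
    """Expected number of tests until the true hypothesis is reached.

    ordering:   hypothesis indices in the order they are tested.
    truth_prob: truth_prob[h] is the probability that h is the true hypothesis.
    """
    return sum(truth_prob[h] * (rank + 1) for rank, h in enumerate(ordering))


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


N = 10  # hypothesis h has complexity h (0 is the simplest); an assumption

occam_order = list(range(N))                 # simplest first
reverse_order = list(reversed(occam_order))  # most complex first

# Environment A (assumed): simpler hypotheses are more likely to be true.
simple_world = normalize([2.0 ** -h for h in range(N)])
# Environment B (assumed): more complex hypotheses are more likely to be true.
complex_world = normalize([2.0 ** h for h in range(N)])

for name, world in [("simple world", simple_world),
                    ("complex world", complex_world)]:
    print(name,
          "| Occam order:", round(expected_tests(occam_order, world), 2),
          "| reverse order:", round(expected_tests(reverse_order, world), 2))
```

In the first environment the simplest-first ordering reaches the true hypothesis sooner on average; in the second it does worse than the reverse ordering. Whether the ranking helps therefore depends on how truth is distributed over complexity, which is an empirical matter.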

Where Recursive Justification Hits Bottom

Peter_Turney17y110

And if you're allowed to end in something assumed-without-justification, then why aren't you allowed to assume anything without justification?

I address this question in Incremental Doubt. Briefly, the answer is that we use a background of assumptions in order to inspect a foreground belief that is the current focus of our attention. The foreground is justified (if possible) by referring to the background (and doing some experiments, using background tools to design and execute the experiments). There is a risk that incorrect background beliefs will "lock in" an incorrect foreground belief, but this process of "incremental doubt" will make progress if we can chop our beliefs up into relatively independent chunks and continuously expose various beliefs to focused doubt, one or a few beliefs at a time.

This is exactly like biological evolution, which mutates a few genes at a time. There is a risk that genes will get "locked in" to a local optimum, and indeed this happens occasionally, but evolution usually finds a way to get over the hump.

Should I trust Occam's Razor? Well, how well does (any particular version of) Occam's Razor seem to work in practice?

This is the right question. A problem is that there is the informal concept of Occam's Razor and there are also several formalizations of Occam's Razor. The informal and formal versions should be carefully distinguished. Some researchers use the apparent success of the informal concept in daily life as an argument to support a particular formal concept in some computational task. This assumes that the particular formalization captures the essence of the informal concept, and it assumes that we can trust what introspection tells us about the success of the informal concept. I doubt both of these assumptions. The proper way to validate a particular formalization of Occam's Razor is to apply it to some computational task and evaluate its performance. Appeal to intuition is not a substitute for experiment.

At present, I start going around in a loop at the point where I explain, "I predict the future as though it will resemble the past on the simplest and most stable level of organization I can identify, because previously, this rule has usually worked to generate good results; and using the simple assumption of a simple universe, I can see why it generates good results; and I can even see how my brain might have evolved to be able to observe the universe with some degree of accuracy, if my observations are correct."

It seems to me that this quote, where it mentions "simple", must be talking about the informal concept of Occam's Razor. If so, then it seems reasonable to me. But formalizations of Occam's Razor still require experimental evidence.

The question is, what is the scope of the claims in this quote? Is the scope limited to how I should personally decide what to believe, or does it extend to what algorithms I should employ in my AI research? I am willing to apply my informal concept of Occam's Razor to my own thinking without further evidence (in fact, it seems that it isn't entirely under my control), but I require experimental evidence when, as a scientist, I use a particular formalization of Occam's Razor in an AI algorithm (if it seems important, given the focus of the research; is simplicity in the foreground or the background?).
