Abstracted Idealized Dynamics

[-]Caledonian217y20

No individual particle can be mistaken as to its own behavior. No collection of particles can be mistaken as to its own behavior.

Whether those behaviors properly represent the properties we want them to is another matter. That's why we can say the calculator struck by the cosmic ray is malfunctioning, although all of its parts work perfectly according to physics. 'Physics' is not the set of standards we're referring to, when we speak of the device not working properly.

[-]Will_Newsome14y20

Of course, 'how we want the calculator to work' is just a stand-in that represents not a subgoal of our utility function, but a referent to something outside us we perceive as more objective than what we can see of ourselves. A broken calculator is not wrong because the number it spits out isn't the number we were hoping it would spit out when we started calculating our finances, nor because being misled about such a number would endanger our finances further. (That doesn't mean we should turn the universe into a big calculator, of course; unless it's a calculator that knows how to find and calculate the most objective and elegant calculations, and not just arbitrary addition. Then maybe.)

[-]Richard417y20

Eliezer - that's all well and good, but what in the world do you think determines which computation or 'abstract idealized dynamic' a mortal human is actually referring to? Won't this be radically underdetermined?

You suggest that "Bob and Sally could be talking about different things when they talk about Enamuh". What's the difference between a world where they're talking about different things vs. a world where they are talking about the same thing but one of them is 'miscalculating'? What facts (about their dispositions and such) would determine which of the two explanations holds, on your view?

[-]TGGP417y00

Calculators disagreeing seems much less common than people disagreeing. To me that is to be expected because people design calculators to answer mathematical questions for them. Humans themselves are only "designed" by natural selection to make more copies of genes.

The earliest calculators go pretty far back, before what we would now call a "computer". A long time ago I scoffed at the notion of a morality calculator. Would it be possible to build something like it without having achieved General AI?

[+]WTF217y-60

[-]Hopefully_Anonymous17y-10

"Someone sees a slave being whipped, and it doesn't occur to them right away that slavery is wrong. But they go home and think about it, and imagine themselves in the slave's place, and finally think, "No.""

I think lines like this epitomize how messy your approach to understanding human morality as a natural phenomenon is. Richard (the pro), what resources do you recommend I look into to find people taking a more rigorous approach to understanding the phenomenon of human morality (as opposed to promoting a certain type uncritically)?

[-]Psy-Kosh17y10

Richard: Which computation? Well... the computation your brain is, under the hood, performing when you're trying to figure out things about "what should I do?"

The full details of the computation may not be explicitly available to you, but if you're saying "the thing that your brain is processing when you're considering right&wrong/should&shouldn't isn't what you mean by should&shouldn't", then how could you even be said to mean anything by those words?

[-]Infotropism217y10

"To physically play out this sequence would require many more pebbles than exist in the universe. Does it make sense to ask if the Goodstein sequence which starts with the length of this line of pebbles, "would halt"? Does it make sense to talk about the answer, in a case like this?

I'd say yes, personally."

On the other hand you're an infinite set atheist. How do you distinguish between those two cases? In neither can it be said the process can exist in the physical universe, which is all there is.

Does it make more sense just because "infinite" really seems too, too big, while the Goodstein sequence merely seems "big"? Neither can exist in the physical universe; that is their shared property. Is that property, of physical implementation and physical observation, not all that matters in the end?

Same with the concept of a spaceship that'd disappear through the cosmological horizon of an expanding universe, can't have any causal effect anymore, but still exists.

Can you explain why and how it is that you feel confident that those processes still make sense in one case, yet not in the other? In a technical way.
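For readers who have not met the Goodstein sequence mentioned in the quote, here is a minimal Python sketch of how its terms are generated. The function names `bump_base` and `goodstein` are illustrative choices, not anything from the original post. Each step writes the current value in hereditary base-b notation, replaces every b with b + 1, and subtracts one. The sketch only shows that the rule is a finite, well-defined procedure even when running it to completion is physically out of reach; it does not settle the question asked above.

```python
def bump_base(n, base):
    """Write n in hereditary base-`base` notation, then replace every `base` with `base + 1`."""
    result = 0
    exponent = 0
    while n > 0:
        n, digit = divmod(n, base)
        if digit:
            # Exponents are rewritten hereditarily as well, hence the recursion.
            result += digit * (base + 1) ** bump_base(exponent, base)
        exponent += 1
    return result


def goodstein(start, max_steps):
    """Return up to `max_steps` terms of the Goodstein sequence beginning at `start`."""
    terms = [start]
    base = 2
    value = start
    while value > 0 and len(terms) < max_steps:
        value = bump_base(value, base) - 1
        base += 1
        terms.append(value)
    return terms


print(goodstein(3, 10))  # reaches 0 after a handful of steps
print(goodstein(4, 6))   # first six terms only; this one runs for an astronomically long time
```

Starting from 3 the sequence halts almost immediately, while starting from 4 it already grows far beyond anything that could be played out with physical pebbles, which is the contrast the quoted passage relies on.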

[-]GBM17y10

Eliezer, this explanation finally puts it all together for me in terms of the "computation". I get it now, I think.

On the other hand, I have a question. Maybe this indicates that I don't truly get it; maybe it indicates that there's something you're not considering. In any case, I would appreciate your explanation, since I feel so close to understanding what you've been saying.

When I multiply 19 and 103, whether in my head, or using a pocket calculator, I get a certain result that I can check: In theory, I can gather a whole bunch of pebbles, lay them out in 103 rows of 19, and then count them individually. I don't have to rely on my calculator - be it internal or electronic.

When I compute morality, though, the only thing I have to examine is my calculator and a bunch of other ones. I would easily recognize that most calculators I come across will give the same answer to a moral question, at least to a limited number of decimal points. But I have no way of knowing whether those calculators are accurate representations of the world - that is, perhaps all of those calculators were created in a way that didn't reflect reality, and added ten to any result calculated.

If 90% of my calculators say 19 times 103 is equal to 1967, how do I determine that they are incorrect, without having the actual pebbles to count?
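The arithmetic half of this question can be sketched in Python. Everything here is a hypothetical illustration: `biased_calculator` stands in for the calculators that "added ten to any result", and the nine-to-one split stands in for the 90% figure. The sketch shows only why pebble counting works as an independent check in the arithmetic case; it says nothing about whether an analogous check exists for morality, which is the open question in the comment.

```python
from collections import Counter


def count_pebbles(rows, per_row):
    """Lay out `rows` rows of `per_row` pebbles and count them one at a time."""
    count = 0
    for _ in range(rows):
        for _ in range(per_row):
            count += 1
    return count


def honest_calculator(a, b):
    return a * b


def biased_calculator(a, b):
    # Hypothetical defect from the comment: adds ten to every result.
    return a * b + 10


# Nine of ten calculators share the same defect.
calculators = [biased_calculator] * 9 + [honest_calculator]
answers = Counter(calc(19, 103) for calc in calculators)
majority_answer, votes = answers.most_common(1)[0]

# The pebble count does not consult any calculator.
pebble_answer = count_pebbles(103, 19)

print(majority_answer, votes, pebble_answer)
```

Polling the calculators yields 1967 by a wide majority, while the pebble count yields 1957. Agreement among calculators is evidence only to the extent that their errors are independent.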

[-]Richard417y00

HA - "what resources do you recommend I look into to find people taking a more rigorous approach to understanding the phenomenon of human morality"

If you're interested in the empirical phenomenon, I'm the wrong person to ask. (Maybe start with the SEP on moral psychology?) But on a philosophical level I'd recommend Peter Railton for a sophisticated naturalistic metaethic (that I respect a lot while not entirely agreeing with). He has a recent bloggingheads diavlog, but you can't go past his classic article 'Moral Realism' [here if you have jstor access].

[-]Richard417y00

Psy-Kosh - "Well... the computation your brain is, under the hood, performing when you're trying to figure out things about "what should I do?""

That just pushes my question back a step. Don't the physical facts underdetermine what computation ('abstracted idealized dynamic') my brain might be interpreted as performing? It all depends how you abstract and idealize it, after all. Unless, that is, we think there's some brute (irreducible) facts about which are the right idealizations...

[-]jsalvatier17y30

Richard, I think the difference is that in a world where one of them is miscalculating, that person can be shown that they are miscalculating and will then calculate correctly. However, in a world where their idealized calculations are actually significantly different, they would simply become enemies.

[-]Hopefully_Anonymous17y00

Richard, Thanks, the SEP article on moral psychology was an enlightening read.

[-]ShardPhoenix17y00

It seems to me that moral reasoning is only a computation in the sense that all human thought processes are computations. In other words, I'm not sure how helpful this is for AI purposes, other than a reminder that such a thing is possible.

I'm not sure it's possible to extricate the complete underlying rules of human morality from all the other elements of human thought. I don't think it's necessarily impossible either, it just seems like we aren't much closer to the solution.

[-]conchis17y00

"Don't the physical facts underdetermine what computation ('abstracted idealized dynamic') my brain might be interpreted as performing?"

I would think they do in the same sense that the physical facts always underdetermine the computations that the universe is actually performing. That's obviously a problem with trying to implement anything - though how much of a problem depends on how robust your implementation is to having the wrong model: bridges still stand, even though we don't have a perfect model of the universe. But it doesn't strike me as a problem with the theory per se.

(I'm not sure whether you were suggesting it was.)

[-]Toby_Ord217y00

Eliezer,

I agree with most of the distinctions and analogies that you have been pointing out, but I still doubt that I agree with your overall position. No-one here can know whether they agree with your position because it is very much underdetermined by your posts. I can have a go at formulating what I see as the strongest objections to your position if you clearly enunciate it in one place. Oddly enough, the philosophy articles that I read tend to be much more technically precise than your posts. I don't mean that you couldn't write more technically precise posts on metaethics, just that I would like you to.

In the same way as scientific theories need to be clear enough to allow concrete prediction and potential falsification, so philosophical theories need to be clear enough that others can use them without any knowledge of their author to make new claims about their subject matter. Many people here may feel that you have made many telling points (which you have), but I doubt that they understand your theory in the sense that they could apply it in a wide range of situations where it is applicable. I would love a short post consisting of at most a paragraph of introduction, then a bi-conditional linking a person's judgement about what another person should do in a given situation to some naturalistic facts and then a paragraph or two helping resolve any ambiguities. Then others can actually argue against it and absence of argument could start to provide some evidence in its favour (though of course, surviving the criticisms of a few grad-student philosophers would still not be all that much evidence).

[-]Tyrrell_McAllister217y00

Eliezer, would the following be an accurate synopsis of what you call morality?

Each of us has an action-evaluating program. This should be thought of as a Turing machine encoded in the hardware of our brains. It is a determinate computational dynamic in our minds that evaluates the actions of agents in scenarios. By a scenario, I mean a mental model of a hypothetical or real situation. Now, a scenario that models agents can also model their action-evaluating programs. An evaluation of an action in a scenario is a moral evaluation if, and only if, the same action is given the same value in every scenario that differs from the first one only in that the agent performing the action has a different action-evaluating program.

In other words, moral evaluations are characterized by being invariant under certain kinds of modifications: Namely, modifications that consist only of assigning a different action-evaluating program to the agent performing the action.
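The invariance criterion in this synopsis can be made concrete with a small Python sketch. All names here (`Scenario`, `is_moral_evaluation`, the two toy evaluators, and the string labels for agent programs) are hypothetical illustrations rather than anything proposed in the thread, and a real action-evaluating program would be nothing like a two-line function. The sketch only shows the shape of the test: hold the scenario fixed, vary the agent's own action-evaluating program, and check whether the evaluation changes.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable


@dataclass(frozen=True)
class Scenario:
    action: str
    victim_consents: bool
    agent_program: str  # label for the agent's own action-evaluating program


Evaluator = Callable[[Scenario], int]


def is_moral_evaluation(evaluate: Evaluator, scenario: Scenario,
                        alternative_programs: Iterable[str]) -> bool:
    """True if the value is unchanged when only the agent's program is swapped."""
    baseline = evaluate(scenario)
    return all(
        evaluate(replace(scenario, agent_program=program)) == baseline
        for program in alternative_programs
    )


def condemns_theft(s: Scenario) -> int:
    # Ignores what the agent's own program says about the action.
    return -1 if s.action == "steal" and not s.victim_consents else 0


def defers_to_agent(s: Scenario) -> int:
    # Value depends on whether the agent's own program approves.
    return 1 if s.agent_program == "approves" else -1


scenario = Scenario(action="steal", victim_consents=False, agent_program="approves")
programs = ["approves", "disapproves"]

print(is_moral_evaluation(condemns_theft, scenario, programs))
print(is_moral_evaluation(defers_to_agent, scenario, programs))
```

Under this toy reading, `condemns_theft` counts as a moral evaluation of the scenario because its output ignores the agent's program, while `defers_to_agent` does not.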

Does that capture the distinctive quality of moral evaluations that you've been trying to convey?

A few thoughts:

(1) It seems strange to me to consider moral evaluations, so defined, to be distinct from personal preferences. With this definition, I would say that moral evaluations are a special case of personal preferences. Specifically, they are the preferences that are invariant under a certain kind of modification to the scenario being considered.

I grant that it is valuable to distinguish this particular kind of personal preference. First, I can imagine that you're right when you say that it's valuable if one wants to build an AI. Second, it's logically interesting because this criterion for moral evaluation is a self-referential one, in that it stipulates how the action-evaluating program (doesn't) react to hypothetical changes to itself. Third, by noting this distinctive kind of action-evaluation, you've probably helped to explain why people are so prone to thinking that certain evaluations are universally valid.

Nonetheless, the point remains that your definition amounts to considering moral evaluation to be nothing more than a particular kind of personal preference. I therefore don't think that it does anything to ease the concerns of moral universalists. Some of your posts included very cogent explanations of why moral universalism is incoherent, but I think you would grant that the points that you raised there weren't particularly original. Moral-relativists have been making those points for a long time. I agree that they make moral universalism untenable, but moral universalists have heard them all before.

Your criterion for moral evaluation, on the other hand, is original (to the best of my meager knowledge). But, so far as the debate between moral relativists and universalists is concerned, it begs the question. It takes the reduction of morality to personal preference as given, and proceeds to define which preferences are the moral ones. I therefore don't expect it to change any minds in that debate.

(2) Viewing moral evaluations as just a special kind of personal preference, what reason is there to think that moral evaluations have their own computational machinery underlying them? I'm sure that this is something that you've thought a lot about, so I'm curious to hear your thoughts on this. My first reaction is to think that, sure, we can distinguish moral evaluations from the other outputs of our preference-establishing machinery, but that doesn't mean that special processes were running to produce the moral evaluations.

For example, consider a program that produces the natural numbers by starting with 1, and then producing each successive number by adding 1 to the previously-produced number. After this machine has produced some output, we can look over the tape and observe that some of the numbers produced have the special property of being prime. We might want to distinguish these numbers from the rest of the output for all sorts of good reasons. There is indeed a special, interesting feature of those numbers. But we should not infer that any special computational machinery produced those special numbers. The prime numbers might be special, but, in this case, the dynamics that produced them are the same as those that produced the non-special composite numbers.
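The counting example can be written out as a short Python sketch. The names `naturals`, `is_prime`, and `tape` are illustrative assumptions. The point to notice is that the generator contains no notion of primality at all; the primes are picked out only afterwards, by an observer inspecting the output.

```python
from itertools import islice


def naturals():
    """Produce 1, 2, 3, ... by adding 1 to the previously produced number."""
    n = 1
    while True:
        yield n
        n += 1


def is_prime(n):
    """Observer-side check; the generator above never calls this."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


# "Look over the tape" after the machine has produced some output.
tape = list(islice(naturals(), 20))
primes_on_tape = [n for n in tape if is_prime(n)]

print(tape)
print(primes_on_tape)
```

The same `yield n; n += 1` step produces both the primes and the composites, which is the sense in which a special property of the output need not correspond to special machinery.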

Similarly, moral evaluations, as you define them, are distinguishable from other action-evaluations. But what reason is there to think that any special machinery underlies moral evaluations as opposed to other personal preferences?

(3) Since humans manage to differ in so many of their personal preferences, there seems little reason to think that they are nearly universally unanimous with regards to their moral evaluations. That is, I don't see how the distinguishing feature of moral evaluations (a particular kind of invariance) would make them less likely to differ from person-to-person or moment-to-moment within the same person. So, I don't quite understand your strong reluctance to attribute different moral evaluations to different people.

[-]Eliezer Yudkowsky17y40

Toby, I'm not sure that I understand what you want me to do.

Especially as the main reason I don't dabble in mainstream philosophy is that I consider it too vague for AI purposes. For example, in classical causal decision theory, there's abstruse math done with a function p(x||y) (if I recall the notation correctly) that one is never told how to compute - it's taken as a primitive. Judea Pearl could have told them, but nobody seems to have felt the need to develop the theory further, since they already had what looked to them like math: lots of neat symbols. This kind of "precision" does not impress me.

In general, I am skeptical of dressing up ideas in math that don't deserve the status of math; I consider it academic status-seeking, and I try not to lay claim to such status when I don't feel I've earned it. But if you can say specifically where you're looking for precision, I can try to respond.

[-]TGGP417y10

Your definition of morality as computation seems to have very little to do with morality as actually practiced, as noticed by folks like Haidt (who I mentioned before here recently). Even professors of philosophy gussy up conclusions they arrived at via intuition and still admit they arrived at those beliefs because of intuitions rather than arguments. Eliezer's imagined computation seems to have more to do with justification, which is done after we've already made up our minds, than how people actually conclude things. I am very suspicious about a computer being able to emulate a process people don't actually engage in.

And did anybody find it suspicious that pretty much everybody was explaining what made morality_Bob defective (with plenty of different reasons for this hypothetical person) but nobody was providing any "computation"?

[-]Caledonian217y50

TGGP, don't confuse performing computation with being able to make the computation explicit. Everything 'we' do is computed by our brains, but we can't even begin to describe the mathematics we perform constantly.

Saying that morality is a subset of computation is vacuously true. Everything minds do is a subset of computation.

[-]Richard417y00

jsalvati - "I think the difference is that in a world where one of them is miscalculating, that person can be shown that they are miscalculating and will then calculate correctly."

This still won't do, due to path-dependence and such. Suppose Bob could be corrected in any number of ways, and each will cause him to adopt a different conclusion -- and one that he will then persist in holding no matter what other arguments you give him. Which conclusion is the true value for our original morality_Bob? There can presumably be no fact of the matter, on Eliezer's account. And if this sort of underdetermination is very common (which I imagine it is), then there's probably no facts at all about what any of our "moralities" are. There may always be some schedule of information that would bring us to make radically different moral judgments.

Also worrying is the implication that it's impossible to be stubbornly wrong. Once you become impervious to argument in your adoption of inconsistent moral beliefs, well, those contradictions are now apparently part of your true morality, which you're computing just fine.(?)

[-]Toby_Ord217y00

Eliezer,

I didn't mean that most philosophy papers I read have lots of mathematical symbols (they typically don't), and I agree with you that over-formalization can occur sometimes (though it is probably less common in philosophy than under-formalization). What I meant is the practice of clear and concise statements of the main points and attendant qualifications in the kind of structured English that good philosophers use. For example, I gave the following as a guess at what you might be meaning:

When X judges that Y should Z, X is judging that were she fully informed, she would want Y to Z

This allows X to be incorrect in her judgments (if she wouldn't want Y to Z when given full information). It allows for others to try to persuade X that her judgment is incorrect (it preserves a role for moral argument). It reduces 'should' to mere want (which is arguably simpler). It is, however, a conception of should that is judger-dependent: it could be the case that X correctly judges that Y should Z, while W correctly judges that Y should not Z.

The first line was a fairly clear and concise statement of a meta-ethical position (which you said you don't share, and nor do I for that matter). The next few sentences describe some of its nice features as well as a downside. There is very little technical language -- just 'judge', 'fully informed' and 'want'. In the previous comment I gave a sentence or two saying what was meant by 'fully informed' and if challenged I could have described the other terms. Given that you think it is incorrect, could you perhaps fix it, providing a similar short piece of text that describes your view with a couple of terms that can bear the brunt of further questioning and elaboration.

[-]steven17y160

I'm not Eliezer nor am I a pro, but I think I agree with Eliezer's account, and as a first attempt I think it's something like this...

When X judges that Y should Z, X is judging that Z is the solution to the problem W, where W is a rigid designator for the problem structure implicitly defined by the machinery shared by X and Y which they both use to make desirability judgments. (Or at least X is asserting that it's shared.) Due to the nature of W, becoming informed will cause X and Y to get closer to the solution of W, but wanting-it-when-informed is not what makes that solution moral.

[-]Eliezer Yudkowsky17y30

What Steven said.

[-]Toby_Ord217y50

Great! Now I can see several points where I disagree or would like more information.

1) Is X really asserting that Y shares his ultimate moral framework (i.e. that they would converge given time and arguments etc)?

If Y is a psychopath murderer who will simply never accept that he shouldn't kill, can I still judge that Y should refrain from killing? On the current form, to do so would involve asserting that we share a framework, but even people who know this to be false can judge that he shouldn't kill, can't they?

2) I don't know what it means to be the solution to a problem. You say:

'I should Z' means that Z answers the question, "What will save my people? How can we all have more fun? How can we get more control over our own lives? What's the funniest jokes we can tell? ..."

Suppose Z is the act of saying "no". How does this answer the question (or 'solve the problem')? Suppose it leads you to have a bit less fun and others to have a bit more fun and generally has positive effects on some parts of the question and negative on others. How are these integrated? As you phrased it, it is clearly not a unified question and I don't know what makes one act rather than another an answer to a list of questions (when presumably it doesn't satisfy each one in the list). Is there some complex and not consciously known weighting of the terms? I thought you denied that earlier in the series. This part seems very non-algorithmic at the moment.

3) The interpretation says 'implicitly defined by the machinery ... which they both use to make desirability judgments'?

What if there is no such machinery that they both use? I thought only X's machinery counted here as X is the judger.

4) You will have to say more about 'implicitly defined by the machinery ... use[d] to make desirability judgments'. This is really vague. I know you have said more on this, but never in very precise terms, just by analogy.

5) Is the problem W meant to be the endpoint of thought (i.e. the problem that would be arrived at), or is it meant to be the current idea which involves requests for self modification (e.g. 'Save a lot of lives, promote happiness, and factor in whatever things I have not thought of but could be convinced of.')? It is not clear from the current statement (or indeed your previous posts), but would be made clear by a solution to (4).
