Continuation of:  Changing Your Metaethics, Setting Up Metaethics
Followup to: Does Your Morality Care What You Think?, The Moral Void, Probability is Subjectively Objective, Could Anything Be Right?, The Gift We Give To Tomorrow, Rebelling Within Nature, Where Recursive Justification Hits Bottom, ...

(The culmination of a long series of Overcoming Bias posts; if you start here, I accept no responsibility for any resulting confusion, misunderstanding, or unnecessary angst.)

What is morality?  What does the word "should" mean?  The many pieces are in place:  This question I shall now dissolve.

The key—as it has always been, in my experience so far—is to understand how a certain cognitive algorithm feels from inside.  Standard procedure for righting a wrong question:  If you don't know what right-ness is, then take a step beneath and ask how your brain labels things "right".

It is not the same question—it has no moral aspects to it, being strictly a matter of fact and cognitive science.  But it is an illuminating question.  Once we know how our brain labels things "right", perhaps we shall find it easier, afterward, to ask what is really and truly right.

But with that said—the easiest way to begin investigating that question, will be to jump back up to the level of morality and ask what seems right.  And if that seems like too much recursion, get used to it—the other 90% of the work lies in handling recursion properly.

(Should you find your grasp on meaningfulness wavering, at any time following, check Changing Your Metaethics for the appropriate prophylactic.)

So!  In order to investigate how the brain labels things "right", we are going to start out by talking about what is right.  That is, we'll start out wearing our morality-goggles, in which we consider morality-as-morality and talk about moral questions directly.  As opposed to wearing our reduction-goggles, in which we talk about cognitive algorithms and mere physics.  Rigorously distinguishing between these two views is the first step toward mating them together.

As a first step, I offer this observation, on the level of morality-as-morality:  Rightness is contagious backward in time.

Suppose there is a switch, currently set to OFF, and it is morally desirable for this switch to be flipped to ON.  Perhaps the switch controls the emergency halt on a train bearing down on a child strapped to the railroad tracks, this being my canonical example.  If this is the case, then, ceteris paribus and presuming the absence of exceptional conditions or further consequences that were not explicitly specified, we may consider it right that this switch should be flipped.

If it is right to flip the switch, then it is right to pull a string that flips the switch.  If it is good to pull a string that flips the switch, it is right and proper to press a button that pulls the string:  Pushing the button seems to have more should-ness than not pushing it.

It seems that—all else being equal, and assuming no other consequences or exceptional conditions which were not specified—value flows backward along arrows of causality.

Even in deontological moralities, if you're obligated to save the child on the tracks, then you're obligated to press the button.  Only very primitive AI systems have motor outputs controlled by strictly local rules that don't model the future at all.  Duty-based or virtue-based ethics are only slightly less consequentialist than consequentialism.  It's hard to say whether moving your arm left or right is more virtuous without talking about what happens next.

Among my readers, there may be some who presently assert—though I hope to persuade them otherwise—that the life of a child is of no value to them.  If so, they may substitute anything else that they prefer, at the end of the switch, and ask if they should press the button.

But I also suspect that, among my readers, there are some who wonder if the true morality might be something quite different from what is presently believed among the human kind.  They may find it imaginable—plausible?—that human life is of no value, or negative value.  They may wonder if the goodness of human happiness, is as much a self-serving delusion as the justice of slavery.

I myself was once numbered among these skeptics, because I was always very suspicious of anything that looked self-serving.

Now here's a little question I never thought to ask, during those years when I thought I knew nothing about morality:

Could it make sense to have a morality in which, if we should save the child from the train tracks, then we should not flip the switch, should pull the string, and should not push the button, so that, finally, we do not push the button?

Or perhaps someone says that it is better to save the child, than to not save them; but doesn't see why anyone would think this implies it is better to press the button than not press it.  (Note the resemblance to the Tortoise who denies modus ponens.)

It seems imaginable, to at least some people, that entirely different things could be should.  It didn't seem nearly so imaginable, at least to me, that should-ness could fail to flow backward in time.  When I was trying to question everything else, that thought simply did not occur to me.

Can you question it?  Should you?

Every now and then, in the course of human existence, we question what should be done and what is right to do, what is better or worse; others come to us with assertions along these lines, and we question them, asking "Why is it right?"  Even when we believe a thing is right (because someone told us that it is, or because we wordlessly feel that it is) we may still question why it is right.

Should-ness, it seems, flows backward in time.  This gives us one way to question why or whether a particular event has the should-ness property.  We can look for some consequence that has the should-ness property.  If we find one, the should-ness of the original event seems to have been plausibly proven or explained.

Ah, but what about the consequence—why is it should?  Someone comes to you and says, "You should give me your wallet, because then I'll have your money, and I should have your money."  If, at this point, you stop asking questions about should-ness, you're vulnerable to a moral mugging.

So we keep asking the next question.  Why should we press the button?  To pull the string.  Why should we pull the string?  To flip the switch.  Why should we flip the switch?  To pull the child from the railroad tracks.  Why pull the child from the railroad tracks?  So that they live.  Why should the child live?

Now there are people who, caught up in the enthusiasm, go ahead and answer that question in the same style: for example, "Because the child might eventually grow up and become a trade partner with you," or "Because you will gain honor in the eyes of others," or "Because the child may become a great scientist and help achieve the Singularity," or some such.  But even if we were to answer in this style, it would only beg the next question.

Even if you try to have a chain of should stretching into the infinite future—a trick I've yet to see anyone try to pull, by the way, though I may be only ignorant of the breadths of human folly—then you would simply ask "Why that chain rather than some other?"

Another way that something can be should, is if there's a general rule that makes it should.  If your belief pool starts out with the general rule "All children X:  It is better for X to live than to die", then it is quite a short step to "It is better for Stephanie to live than to die".  Ah, but why save all children?  Because they may all become trade partners or scientists?  But then where did that general rule come from?

If should-ness only comes from should-ness—from a should-consequence, or from a should-universal—then how does anything end up should in the first place?

Now human beings have argued these issues for thousands of years and maybe much longer.  We do not hesitate to continue arguing when we reach a terminal value (something that has a charge of should-ness independently of its consequences).  We just go on arguing about the universals.

I usually take, as my archetypal example, the undoing of slavery:  Somehow, slaves' lives went from having no value to having value.  Nor do I think that, back at the dawn of time, anyone was even trying to argue that slaves were better off being slaves (as would later be argued).  They'd probably have looked at you like you were crazy if you even tried.  Somehow, we got from there, to here...

And some of us would even hold this up as a case of moral progress, and look at our ancestors as having made a moral error.  Which seems easy enough to describe in terms of should-ness:  Our ancestors thought that they should enslave defeated enemies, but they were mistaken.

But all our philosophical arguments ultimately seem to ground in statements that no one has bothered to justify—except perhaps to plead that they are self-evident, or that any reasonable mind must surely agree, or that they are a priori truths, or some such.  Perhaps, then, all our moral beliefs are as erroneous as that old bit about slavery?  Perhaps we have entirely misperceived the flowing streams of should?

This I once believed plausible; and one of the arguments I wish I could go back and say to myself, is, "If you know nothing at all about should-ness, then how do you know that the procedure, 'Do whatever Emperor Ming says' is not the entirety of should-ness?  Or even worse, perhaps, the procedure, 'Do whatever maximizes inclusive genetic fitness' or 'Do whatever makes you personally happy'."  The point here would have been to make my past self see that in rejecting these rules, he was asserting a kind of knowledge—that in saying, "This is not morality," he must reveal that, despite himself, he knows something about morality or meta-morality.  Otherwise, the procedure "Do whatever Emperor Ming says" would seem just as plausible, as a guiding principle, as his current path of "Rejecting things that seem unjustified."  Unjustified—according to what criterion of justification?  Why trust the principle that says that moral statements need to be justified, if you know nothing at all about morality?

What indeed would distinguish, at all, the question "What is right?" from "What is wrong?"

What is "right", if you can't say "good" or "desirable" or "better" or "preferable" or "moral" or "should"?  What happens if you try to carry out the operation of replacing the symbol with what it stands for?

If you're guessing that I'm trying to inveigle you into letting me say:  "Well, there are just some things that are baked into the question, when you start asking questions about morality, rather than wakalixes or toaster ovens", then you would be right.  I'll be making use of that later, and, yes, will address "But why should we ask that question?"

Okay, now: morality-goggles off, reduction-goggles on.

Those who remember Possibility and Could-ness, or those familiar with simple search techniques in AI, will realize that the "should" label is behaving like the inverse of the "could" label, which we previously analyzed in terms of "reachability".  Reachability spreads forward in time: if I could reach the state with the button pressed, I could reach the state with the string pulled; if I could reach the state with the string pulled, I could reach the state with the switch flipped.

Where the "could" label and the "should" label collide, the algorithm produces a plan.

Now, as I say this, I suspect that at least some readers may find themselves fearing that I am about to reduce should-ness to a mere artifact of a way that a planning system feels from inside.  Once again I urge you to check Changing Your Metaethics, if this starts to happen.  Remember above all the Moral Void:  Even if there were no morality, you could still choose to help people rather than hurt them.  This, above all, holds in place what you hold precious, while your beliefs about the nature of morality change.

I do not intend, with this post, to take away anything of value; it will all be given back before the end.

Now this algorithm is not very sophisticated, as AI algorithms go, but to apply it in full generality—to learned information, not just ancestrally encountered, genetically programmed situations—is a rare thing among animals.  Put a food reward in a transparent box.  Put the matching key, which looks unique and uniquely corresponds to that box, in another transparent box.  Put the unique key to that box in another box.  Do this with five boxes.  Mix in another sequence of five boxes that doesn't lead to a food reward.  Then offer a choice of two keys, one of which starts the sequence of five boxes leading to food, one of which starts the sequence leading nowhere.

Chimpanzees can learn to do this, but so far as I know, no non-primate species can pull that trick.

And as smart as chimpanzees are, they are not quite as good as humans at inventing plans—plans such as, for example, planting in the spring to harvest in the fall.

So what else are humans doing, in the way of planning?

It is a general observation that natural selection seems to reuse existing complexity, rather than creating things from scratch, whenever it possibly can—though not always in the same way that a human engineer would.  It is a function of the enormous time required for evolution to create machines with many interdependent parts, and the vastly shorter time required to create a mutated copy of something already evolved.

What else are humans doing?  Quite a bit, and some of it I don't understand—there are plans humans make, that no modern-day AI can.

But one of the things we are doing, is reasoning about "right-ness" the same way we would reason about any other observable property.

Are animals with bright colors often poisonous?  Does the delicious nid-nut grow only in the spring?  Is it usually a good idea to take along a waterskin on long hunts?

It seems that Martha and Fred have an obligation to take care of their child, and Jane and Bob are obligated to take care of their child, and Susan and Wilson have a duty to care for their child.  Could it be that parents in general must take care of their children?

By representing right-ness as an attribute of objects, you can recruit a whole previously evolved system that reasons about the attributes of objects.  You can save quite a lot of planning time, if you decide (based on experience) that in general it is a good idea to take a waterskin on hunts, from which it follows that it must be a good idea to take a waterskin on hunt #342.

Is this damnable as a Mind Projection Fallacy—treating properties of the mind as if they were out there in the world?

Depends on how you look at it.

This business of, "It's been a good idea to take waterskins on the last three hunts, maybe it's a good idea in general, if so it's a good idea to take a waterskin on this hunt", does seem to work.

Let's say that your mind, faced with any countable set of objects, automatically and perceptually tagged them with their remainder modulo 5.  If you saw a group of 17 objects, for example, they would look remainder-2-ish.  Though, if you didn't have any notion of what your neurons were doing, and perhaps no notion of modulo arithmetic, you would only see that the group of 17 objects had the same remainder-ness as a group of 2 objects.  You might not even know how to count—your brain doing the whole thing automatically, subconsciously and neurally—in which case you would just have five different words for the remainder-ness attributes that we would call 0, 1, 2, 3, and 4.

If you look out upon the world you see, and guess that remainder-ness is a separate and additional attribute of things—like the attribute of having an electric charge—or like a tiny little XML tag hanging off of things—then you will be wrong.  But this does not mean it is nonsense to talk about remainder-ness, or that you must automatically commit the Mind Projection Fallacy in doing so.  So long as you've got a well-defined way to compute a property, it can have a well-defined output and hence an empirical truth condition.

If you're looking at 17 objects, then their remainder-ness is, indeed and truly, 2, and not 0, 3, 4, or 1.  If I tell you, "Those red things you told me to look at are remainder-2-ish", you have indeed been told a falsifiable and empirical property of those red things.  It is just not a separate, additional, physically existent attribute.
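To make "a well-defined way to compute a property" fully concrete, here is a minimal sketch in Python. This is an editorial illustration with invented names, not part of the original argument; it only shows that a derived property has a definite output and hence an empirical truth condition.

```python
# A minimal sketch: "remainder-ness" as a derived, computable property.
# It is not a physically fundamental attribute of the objects; it is the
# well-defined output of a fixed computation over them.

def remainderness(objects):
    """Tag a collection with its count modulo 5."""
    return len(objects) % 5

red_things = ["thing %d" % i for i in range(17)]   # any 17 objects

print(remainderness(red_things))    # 2 -- the group is "remainder-2-ish"

# "Those red things are remainder-2-ish" is an empirical, falsifiable claim:
assert remainderness(red_things) == 2
assert remainderness(red_things) not in (0, 1, 3, 4)
```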

And as for reasoning about derived properties, and which other inherent or derived properties they correlate to—I don't see anything inherently fallacious about that.

One may notice, for example, that things which are 7 modulo 10 are often also 2 modulo 5.  Empirical observations of this sort play a large role in mathematics, suggesting theorems to prove.  (See Polya's How To Solve It.)

Indeed, virtually all the experience we have, is derived by complicated neural computations from the raw physical events impinging on our sense organs.  By the time you see anything, it has been extensively processed by the retina, lateral geniculate nucleus, visual cortex, parietal cortex, and temporal cortex, into a very complex sort of derived computational property.

If you thought of a property like redness as residing strictly in an apple, you would be committing the Mind Projection Fallacy.  The apple's surface has a reflectance which sends out a mixture of wavelengths that impinge on your retina and are processed with respect to ambient light to extract a summary color of red...  But if you tell me that the apple is red, rather than green, and make no claims as to whether this is an ontologically fundamental physical attribute of the apple, then I am quite happy to agree with you.

So as long as there is a stable computation involved, or a stable process—even if you can't consciously verbalize the specification—it often makes a great deal of sense to talk about properties that are not fundamental.  And reason about them, and remember where they have been found in the past, and guess where they will be found next.

(In retrospect, that should have been a separate post in the Reductionism sequence.  "Derived Properties", or "Computational Properties" maybe.  Oh, well; I promised you morality this day, and this day morality you shall have.)

Now let's say we want to make a little machine, one that will save the lives of children.  (This enables us to save more children than we could do without a machine, just like you can move more dirt with a shovel than by hand.)  The machine will be a planning machine, and it will reason about events that may or may not have the property, leads-to-child-living. 

A simple planning machine would just have a pre-made model of the environmental process.  It would search forward from its actions, applying a label that we might call "reachable-from-action-ness", but which might as well say "Xybliz" internally for all that it matters to the program.  And it would search backward from scenarios, situations, in which the child lived, labeling these "leads-to-child-living".  If situation X leads to situation Y, and Y has the label "leads-to-child-living"—which might just be a little flag bit, for all the difference it would make—then X will inherit the flag from Y.  When the two labels meet in the middle, the leads-to-child-living flag will quickly trace down the stored path of reachability, until finally some particular sequence of actions ends up labeled "leads-to-child-living".  Then the machine automatically executes those actions—that's just what the machine does.
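Here is a rough sketch of such a machine in Python. This is an editorial illustration: the toy world model, state names, and function names are all invented, and only the label-propagation scheme follows the description above. The "leads-to-child-living" flag is propagated backward along causal arrows, the planner then searches forward from the start state (the "could" side), and a plan falls out where the two labels meet.

```python
# A rough sketch of the save-life planning machine (toy world model, invented names).
from collections import deque

# Environment model: situation -> {action: next situation}
MODEL = {
    "start":          {"press button": "button pressed", "jump in air": "airborne"},
    "button pressed": {"wait": "string pulled"},
    "string pulled":  {"wait": "switch flipped"},
    "switch flipped": {"wait": "train halted, child lives"},
    "airborne":       {"land": "start"},
}
GOAL = {"train halted, child lives"}   # situations labeled "leads-to-child-living"

def backward_labels(model, goal):
    """Propagate the leads-to-child-living flag backward along causal arrows."""
    labeled = set(goal)
    changed = True
    while changed:
        changed = False
        for state, actions in model.items():
            if state not in labeled and any(nxt in labeled for nxt in actions.values()):
                labeled.add(state)
                changed = True
    return labeled

def plan(model, start, goal):
    """Search forward from the start state, keeping only to labeled situations."""
    labeled = backward_labels(model, goal)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, actions_so_far = frontier.popleft()
        if state in goal:
            return actions_so_far          # the two labels have met in the middle
        for action, nxt in model.get(state, {}).items():
            if nxt in labeled and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, actions_so_far + [action]))
    return None

print(plan(MODEL, "start", GOAL))
# ['press button', 'wait', 'wait', 'wait']
```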

Now this machine is not complicated enough to feel existential angst.  It is not complicated enough to commit the Mind Projection Fallacy.  It is not, in fact, complicated enough to reason abstractly about the property "leads-to-child-living-ness".  The machine—as specified so far—does not notice if the action "jump in the air" turns out to always have this property, or never have this property.  If "jump in the air" always led to situations in which the child lived, this could greatly simplify future planning—but only if the machine were sophisticated enough to notice this fact and use it.

If it is a fact that "jump in the air" has "leads-to-child-living-ness", this fact is composed of empirical truth and logical truth.  It is an empirical truth that the world is such that, if you perform the (ideal abstract) algorithm "trace back from situations where the child lives", then it will be a logical truth about the output of this (ideal abstract) algorithm that it labels the "jump in the air" action.

(You cannot always define this fact in entirely empirical terms, by looking for the physical real-world coincidence of jumping and child survival.  It might be that "stomp left" also always saves the child, and the machine in fact stomps left.  In which case the fact that jumping in the air would have saved the child, is a counterfactual extrapolation.)

Okay, now we're ready to bridge the levels.

As you must surely have guessed by now, this should-ness stuff is how the human decision algorithm feels from inside.  It is not an extra, physical, ontologically fundamental attribute hanging off of events like a tiny little XML tag.

But it is a moral question what we should do about that—how we should react to it.

To adopt an attitude of complete nihilism, because we wanted those tiny little XML tags, and they're not physically there, strikes me as the wrong move.  It is like supposing that the absence of an XML tag, equates to the XML tag being there, saying in its tiny brackets what value we should attach, and having value zero.  And then this value zero, in turn, equating to a moral imperative to wear black, feel awful, write gloomy poetry, betray friends, and commit suicide.

No.

So what would I say instead?

The force behind my answer is contained in The Moral Void and The Gift We Give To Tomorrow.  I would try to save lives "even if there were no morality", as it were.

And it seems like an awful shame to—after so many millions and hundreds of millions of years of evolution—after the moral miracle of so much cutthroat genetic competition producing intelligent minds that love, and hope, and appreciate beauty, and create beauty—after coming so far, to throw away the Gift of morality, just because our brain happened to represent morality in such fashion as to potentially mislead us when we reflect on the nature of morality.

This little accident of the Gift doesn't seem like a good reason to throw away the Gift; it certainly isn't an inescapable logical justification for wearing black.

Why not keep the Gift, but adjust the way we reflect on it?

So here's my metaethics:

I earlier asked,

What is "right", if you can't say "good" or "desirable" or "better" or "preferable" or "moral" or "should"?  What happens if you try to carry out the operation of replacing the symbol with what it stands for?

I answer that if you try to replace the symbol "should" with what it stands for, you end up with quite a large sentence.

For the much simpler save-life machine, the "should" label stands for leads-to-child-living-ness.

For a human this is a much huger blob of a computation that looks like, "Did everyone survive?  How many people are happy?  Are people in control of their own lives? ..."  Humans have complex emotions, have many values—the thousand shards of desire, the godshatter of natural selection.  I would say, by the way, that the huge blob of a computation is not just my present terminal values (which I don't really have—I am not a consistent expected utility maximizer); the huge blob of a computation includes the specification of those moral arguments, those justifications, that would sway me if I heard them.  So that I can regard my present values, as an approximation to the ideal morality that I would have if I heard all the arguments, to whatever extent such an extrapolation is coherent.

No one can write down their big computation; it is not just too large, it is also unknown to its user.  No more could you print out a listing of the neurons in your brain.  You never mention your big computation—you only use it, every hour of every day.

Now why might one identify this enormous abstract computation, with what-is-right?

If you identify rightness with this huge computational property, then moral judgments are subjunctively objective (like math), subjectively objective (like probability), and capable of being true (like counterfactuals).

You will find yourself saying, "If I wanted to kill someone—even if I thought it was right to kill someone—that wouldn't make it right."  Why?  Because what is right is a huge computational property—an abstract computation—not tied to the state of anyone's brain, including your own brain.

This distinction was introduced earlier in 2-Place and 1-Place Words.  We can treat the word "sexy" as a 2-place function that goes out and hoovers up someone's sense of sexiness, and then eats an object of admiration.  Or we can treat the word "sexy" as meaning a 1-place function, a particular sense of sexiness, like Sexiness_20934, that only accepts one argument, an object of admiration.

Here we are treating morality as a 1-place function.  It does not accept a person as an argument, spit out whatever cognitive algorithm they use to choose between actions, and then apply that algorithm to the situation at hand.  When I say right, I mean a certain particular 1-place function that just asks, "Did the child live?  Did anyone else get killed?  Are people happy?  Are they in control of their own lives?  Has justice been served?" ... and so on through many, many other elements of rightness.  (And perhaps those arguments that might persuade me otherwise, which I have not heard.)

Hence the notion, "Replace the symbol with what it stands for."

Since what's right is a 1-place function, if I subjunctively imagine a world in which someone has slipped me a pill that makes me want to kill people, then, in this subjunctive world, it is not right to kill people.  That's not merely because I'm judging with my current brain.  It's because when I say right, I am referring to a 1-place function.  Rightness doesn't go out and hoover up the current state of my brain, in this subjunctive world, before producing the judgment "Oh, wait, it's now okay to kill people."  When I say right, I don't mean "that which my future self wants", I mean the function that looks at a situation and asks, "Did anyone get killed?  Are people happy?  Are they in control of their own lives?  ..."
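The distinction can be put in code.  The following is an editorial toy sketch, with invented predicate names and an invented pill scenario; it is only meant to show the difference between a rightness that consults the judge's current brain and a rightness that is one fixed computation.

```python
# A toy sketch of the 2-place vs. 1-place distinction (invented names throughout).

def right_2place(judge, situation):
    """2-place version: goes out and hoovers up the judge's current criteria."""
    return judge["criteria"](situation)

def rightness_20934(situation):
    """1-place version: one fixed computation, no matter who is asking."""
    return situation["child_lives"] and not situation["anyone_killed"]

situation = {"child_lives": True, "anyone_killed": True}   # someone got killed

# Subjunctive world: a pill rewires the judge to approve of killing.
pilled_judge = {"criteria": lambda s: s["anyone_killed"]}

print(right_2place(pilled_judge, situation))   # True  -- tracks the altered brain
print(rightness_20934(situation))              # False -- the fixed function still says no
```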

And once you've defined a particular abstract computation that says what is right—or even if you haven't defined it, and it's computed in some part of your brain you can't perfectly print out, but the computation is stable—more or less—then as with any other derived property, it makes sense to speak of a moral judgment being true. If I say that today was a good day, you've learned something empirical and falsifiable about my day—if it turns out that actually my grandmother died, you will suspect that I was originally lying.

The apparent objectivity of morality has just been explained—and not explained away.  For indeed, if someone slipped me a pill that made me want to kill people, nonetheless, it would not be right to kill people.  Perhaps I would actually kill people, in that situation—but that is because something other than morality would be controlling my actions.

Morality is not just subjunctively objective, but subjectively objective.  I experience it as something I cannot change.  Even after I know that it's myself who computes this 1-place function, and not a rock somewhere—even after I know that I will not find any star or mountain that computes this function, that only upon me is it written—even so, I find that I wish to save lives, and that even if I could change this by an act of will, I would not choose to do so.  I do not wish to reject joy, or beauty, or freedom.  What else would I do instead?  I do not wish to reject the Gift that natural selection accidentally barfed into me.  This is the principle of The Moral Void and The Gift We Give To Tomorrow.

Our origins may seem unattractive, our brains untrustworthy.

But love has to enter the universe somehow, starting from non-love, or love cannot enter time.

And if our brains are untrustworthy, it is only our own brains that say so.  Do you sometimes think that human beings are not very nice?  Then it is you, a human being, who says so.  It is you, a human being, who judges that human beings could do better.  You will not find such written upon the stars or the mountains: they are not minds, they cannot think.

In this, of course, we find a justificational strange loop through the meta-level.  Which is unavoidable so far as I can see—you can't argue morality, or any kind of goal optimization, into a rock.  But note the exact structure of this strange loop: there is no general moral principle which says that you should do what evolution programmed you to do.  There is, indeed, no general principle to trust your moral intuitions!  You can find a moral intuition within yourself, describe it—quote it—consider it deliberately and in the full light of your entire morality, and reject it, on grounds of other arguments.  What counts as an argument is also built into the rightness-function.

Just as, in the strange loop of rationality, there is no general principle in rationality to trust your brain, or to believe what evolution programmed you to believe—but indeed, when you ask which parts of your brain you need to rebel against, you do so using your current brain.  When you ask whether the universe is simple, you can consider the simple hypothesis that the universe's apparent simplicity is explained by its actual simplicity.

Rather than trying to unwind ourselves into rocks, I proposed that we should use the full strength of our current rationality, in reflecting upon ourselves—that no part of ourselves be immune from examination, and that we use all of ourselves that we currently believe in to examine it.

You would do the same thing with morality; if you suspect that a part of yourself might be harmful, then use your best current guess at what is right, your full moral strength, to do the considering.  Why should we want to unwind ourselves to a rock?  Why should we do less than our best, when reflecting?  You can't unwind past Occam's Razor, modus ponens, or morality, and it's not clear why you should try.

For any part of rightness, you can always imagine another part that overrides it—it would not be right to drag the child from the train tracks, if this resulted in everyone on Earth becoming unable to love—or so I would judge.  For every part of rightness you examine, you will find that it cannot be the sole and perfect and only criterion of rightness.  This may lead to the incorrect inference that there is something beyond, some perfect and only criterion from which all the others are derived—but that does not follow.  The whole is the sum of the parts.  We ran into an analogous situation with free will, where no part of ourselves seems perfectly decisive.

The classic dilemma for those who would trust their moral intuitions, I believe, is the one who says:  "Interracial marriage is repugnant—it disgusts me—and that is my moral intuition!"  I reply, "There is no general rule to obey your intuitions.  You just mentioned intuitions, rather than using them.  Very few people have legitimate cause to mention intuitions—Friendly AI programmers, for example, delving into the cognitive science of things, have a legitimate reason to mention them.  Everyone else just has ordinary moral arguments, in which they use their intuitions, for example, by saying, 'An interracial marriage doesn't hurt anyone, if both parties consent'.  I do not say, 'And I have an intuition that anything consenting adults do is right, and all intuitions must be obeyed, therefore I win.'  I just offer up that argument, and any others I can think of, to weigh in the balance."

Indeed, evolution that made us cannot be trusted—so there is no general principle to trust it!  Rightness is not defined in terms of automatic correspondence to any possible decision we actually make—so there's no general principle that says you're infallible!  Just do what is, ahem, right—to the best of your ability to weigh the arguments you have heard, and ponder the arguments you may not have heard.

If you were hoping to have a perfectly trustworthy system, or to have been created in correspondence with a perfectly trustworthy morality—well, I can't give that back to you; but even most religions don't try that one.  Even most religions have the human psychology containing elements of sin, and even most religions don't actually give you an effectively executable and perfect procedure, though they may tell you "Consult the Bible!  It always works!"

If you hoped to find a source of morality outside humanity—well, I can't give that back, but I can ask once again:  Why would you even want that?  And what good would it do?  Even if there were some great light in the sky—something that could tell us, "Sorry, happiness is bad for you, pain is better, now get out there and kill some babies!"—it would still be your own decision to follow it.  You cannot evade responsibility.

There isn't enough mystery left to justify reasonable doubt as to whether the causal origin of morality is something outside humanity.  We have evolutionary psychology.  We know where morality came from.  We pretty much know how it works, in broad outline at least.  We know there are no little XML value tags on electrons (and indeed, even if you found them, why should you pay attention to what is written there?)

If you hoped that morality would be universalizable—sorry, that one I really can't give back.  Well, unless we're just talking about humans.  Between neurologically intact humans, there is indeed much cause to hope for overlap and coherence; and a great and reasonable doubt as to whether any present disagreement is really unresolvable, even if it seems to be about "values".  The obvious reason for hope is the psychological unity of humankind, and the intuitions of symmetry, universalizability, and simplicity that we execute in the course of our moral arguments.  (In retrospect, I should have done a post on Interpersonal Morality before this...)

If I tell you that three people have found a pie and are arguing about how to divide it up, the thought "Give one-third of the pie to each" is bound to occur to you—and if the three people are humans, it's bound to occur to them, too.  If one of them is a psychopath and insists on getting the whole pie, though, there may be nothing for it but to say:  "Sorry, fairness is not 'what everyone thinks is fair', fairness is everyone getting a third of the pie".  You might be able to resolve the remaining disagreement by politics and game theory, short of violence—but that is not the same as coming to agreement on values.  (Maybe you could persuade the psychopath that taking a pill to be more human, if one were available, would make them happier?  Would you be justified in forcing them to swallow the pill?  These get us into stranger waters that deserve a separate post.)

If I define rightness to include the space of arguments that move me, then when you and I argue about what is right, we are arguing our approximations to what we would come to believe if we knew all empirical facts and had a million years to think about it—and that might be a lot closer than the present and heated argument.  Or it might not.  This gets into the notion of 'construing an extrapolated volition' which would be, again, a separate post.

But if you were stepping outside the human and hoping for moral arguments that would persuade any possible mind, even a mind that just wanted to maximize the number of paperclips in the universe, then sorry—the space of possible mind designs is too large to permit universally compelling arguments.  You are better off treating your intuition that your moral arguments ought to persuade others, as applying only to other humans who are more or less neurologically intact.  Trying it on human psychopaths would be dangerous, yet perhaps possible.  But a paperclip maximizer is just not the sort of mind that would be moved by a moral argument.  (This will definitely be a separate post.)

Once, in my wild and reckless youth, I tried dutifully—I thought it was my duty—to be ready and willing to follow the dictates of a great light in the sky, an external objective morality, when I discovered it.  I questioned everything, even altruism toward human lives, even the value of happiness.  Finally I realized that there was no foundation but humanity—no evidence pointing to even a reasonable doubt that there was anything else—and indeed I shouldn't even want to hope for anything else—and indeed would have no moral cause to follow the dictates of a light in the sky, even if I found one.

I didn't get back immediately all the pieces of myself that I had tried to deprecate—it took time for the realization "There is nothing else" to sink in.  The notion that humanity could just... you know... live and have fun... seemed much too good to be true, so I mistrusted it.  But eventually, it sank in that there really was nothing else to take the place of beauty.  And then I got it back.

So you see, it all really does add up to moral normality, very exactly in fact.  You go on with the same morals as before, and the same moral arguments as before.  There is no sudden Grand Overlord Procedure to which you can appeal to get a perfectly trustworthy answer.  You don't know, cannot print out, the great rightness-function; and even if you could, you would not have enough computational power to search the entire specified space of arguments that might move you.  You will just have to argue it out.

I suspect that a fair number of those who propound metaethics do so in order to have it add up to some new and unusual moral—else why would they bother?  In my case, I bother because I am a Friendly AI programmer and I have to make a physical system outside myself do what's right; for which purpose metaethics becomes very important indeed.  But for the most part, the effect of my proffered metaethic is threefold:

  • Anyone worried that reductionism drains the meaning from existence can stop worrying;
  • Anyone who was rejecting parts of their human existence based on strange metaethics—i.e., "Why should I care about others, if that doesn't help me maximize my inclusive genetic fitness?"—can welcome back all the parts of themselves that they once exiled;
  • You can stop arguing about metaethics, and go back to whatever ordinary moral argument you were having before then.  This knowledge will help you avoid metaethical mistakes that mess up moral arguments, but you can't actually use it to settle debates unless you can build a Friendly AI.

And, oh yes—why is it right to save a child's life?

Well... you could ask "Is this event that just happened, right?" and find that the child had survived, in which case you would have discovered the nonobvious empirical fact about the world, that it had come out right.

Or you could start out already knowing a complicated state of the world, but still have to apply the rightness-function to it in a nontrivial way—one involving a complicated moral argument, or extrapolating consequences into the future—in which case you would learn the nonobvious logical / computational fact that rightness, applied to this situation, yielded thumbs-up.

In both these cases, there are nonobvious facts to learn, which seem to explain why what just happened is right.

But if you ask "Why is it good to be happy?" and then replace the symbol 'good' with what it stands for, you'll end up with a question like "Why does happiness match {happiness + survival + justice + individuality + ...}?"  This gets computed so fast, that it scarcely seems like there's anything there to be explained.  It's like asking "Why does 4 = 4?" instead of "Why does 2 + 2 = 4?"

Now, I bet that feels quite a bit like what happens when I ask you:  "Why is happiness good?"

Right?

And that's also my answer to Moore's Open Question.  Why is this big function I'm talking about, right?  Because when I say "that big function", and you say "right", we are dereferencing two different pointers to the same unverbalizable abstract computation.  I mean, that big function I'm talking about, happens to be the same thing that labels things right in your own brain.  You might reflect on the pieces of the quotation of the big function, but you would start out by using your sense of right-ness to do it.  If you had the perfect empirical knowledge to taboo both "that big function" and "right", substitute what the pointers stood for, and write out the full enormity of the resulting sentence, it would come out as... sorry, I can't resist this one... A=A.

 

Part of The Metaethics Sequence

Next post: "Interpersonal Morality"

Previous post: "Setting Up Metaethics"

156 comments

There is a good tradition of expecting intellectuals to summarize their positions. Even if they write long books elaborating their positions, intellectuals are still expected to write sentences that summarize their core claims. Those sentences may refer to new concepts they have elaborated elsewhere, but still the summary is important. I think you'd do well to try to write such summaries of your key positions, including this one.

You say morality is "the huge blob of a computation ... not just our present terminal values ... [but] includes the specification of those moral arguments, those justifications, that would sway us if we heard them." If you mean what would sway us to matching acts, then you mean morality is what we would want if we had thought everything through. But if you instead mean what would sway us only to assent that an act is "moral", even if we are not swayed to act that way, then there remains the question of what exactly it is that we are assenting.

MarsColony_in10years
I'm not sure how coherent the second is. I would tend to think that our beliefs and our actions would converge, if you took the limit as wisdom approached infinity. Perhaps there's no guarantee, but it seems like we would have to suffer quite a lot of cognitive dissonance in order to fully accept all parts of an infinitely wise argument that something should be done, while still doing nothing. Even just thinking and accepting such arguments is doing something. Why think in the first place? Perhaps I'm missing something, but if I condition on the fact that such an infinitely compelling argument exists, it seems overwhelmingly likely that anyone with the values being appealed to would be strongly compelled to act. Well, at least once they had time to process the arguments and let all the details sink in. Perhaps there would be people with such radically twisted worldviews that they would have a nervous breakdown first, and some might go into total denial and never accept such an argument. (For example, if they are stuck in a happy death spiral that is a local minimum of cognitive dissonance, and also requires disbelief in evidence and arguments, thus making even a global minimum in cognitive dissonance unappealing.) But the desire for internal consistency is a strong value in humans, so I would think that the need to drive down cognitive dissonance would eventually win out in all practical cases, given sufficient time.
hairyfigment
I should mention that even humans can make a moral judgment without being compelled to follow it. This seems to some extent like a case of the brain not working properly, but it establishes the trick is possible even for a somewhat human-like mind.

I agree that it needs a summary. But I think it wiser to write first and summarize afterward - otherwise I am never quite sure what there is to summarize.

There needs to be a separate word for that subset of our values that is interpersonal, prosocial, to some extent expected to be agreed-upon, which subset does not always win out in the weighing; this subset is often also called "morality" but that would be confusing.

ahbwramc
I'm not entirely sure why, but this comment was inordinately helpful in doing away with the last vestiges of confusion about your metaethics. I don't know what I thought before reading it - of course morality would be a subset of our values, what else could it be? But somehow it made everything jump into place. I think I can now say (two years after first reading the sequence, and only through a long and gradual process) that I agree with your metaethical theory.

There needs to be a separate word for that subset of our values that is interpersonal, prosocial, to some extent expected to be agreed-upon, which subset does not always win out in the weighing; this subset is often also called "morality" but that would be confusing.

Are you maybe referring to manners/etiquette/propriety?

Eliezer: This actually kinda sounds (almost) like something I'd been thinking for a while, except that your version added one (well, many actually, but the one is one that's useful in getting it to all add back up to normality) "dang, I should have thought" insight.

But I'm not sure if these are equivalent. Is this more or less what you were saying: "When we're talking about 'shouldness', we mean something, or at least we think we mean something. It's not something we can fully explicitly articulate, but if we could somehow fully utterly comp... (read more)

The most important part seems to be missing. You say that shouldness is about actions and consequences. I'm with you there. You say that it is a one-place function. I take that to mean that it encompasses a specific set of values, independent of who is asking. The part that still seems to be missing is: how are we to determine what this set of values is? What if we disagree? In your conclusion, you seem to be saying that the values we are aiming for are self-evident. Are they really?

It so happens that I agree with you about things like happiness, human lif... (read more)

Will_Lugar
I agree with the concerns of AndyWood and others who have made similar comments, and I'll be paying attention to see whether the later installments of the metaethics sequence have answered them. Before I read them, here is my own summarized set of concerns. (I apologize if responding to a given part of a sequence before reading the later parts is bad form; please let me know if this is the case.) Eliezer seems to assume that any two neurologically normal humans would agree on the right function if they were fully rational and sufficiently informed, citing the psychological unity of humankind as support. But even with the present degree of psychological unity, it seems to me fully possible that people's values could truly diverge in quite a few not-fully-reconcilable ways--although perhaps the divergence would be surprisingly small; I just don't know. This is, I think we mostly agree, an open question for further research to explore. Eliezer's way of viewing morality seems like it would run into trouble if it turns out that two different people really do use two different right functions (such that even their CEVs would diverge from one another). Suppose Bob's right function basically boils down to "does it maximize preference fulfillment?" (or some other utilitarian function) and Sally's right function basically boils down to "does it follow a maxim which can be universally willed by a rational agent?" (or some other deontological function). Suppose Bob and Sally are committed to these functions even though each person is fully rational and sufficiently informed--which does not seem implausible. In this case, the fact that each of them is using a one-place function is of no help, because they are using different one-place functions. Eliezer would then have no immediately obvious legitimate way to claim that his right function is the truer or better one. To use a more extreme example: What if the Nazis were completely right, according to their own right function?

Eliezer, it's a pleasure to see you arrive at this point. With an effective understanding of the subjective/objective aspects supporting a realistic metaethics, I look forward to your continued progress and contributions in terms of the dynamics of increasingly effective evolutionary (in the broadest sense) development for meaningful growth, promoting a model of (subjective) fine-grained, hierarchical values with increasing coherence over increasing context of meaning-making, implements principles of (objective) instrumental action increasingly effective ove... (read more)

I second Robin's request that you summarize your positions. It helps other folks organize and think about your ideas.

I'm quite convinced about how you analyze the problem of what morality is and how we should think about it, up until the point about how universally it applies. I'm just not sure that 'humans different shards of god shatter' add up to the same thing across people, a point that I think would become apparent as soon as you started to specify what the huge computation actually WAS.

I would think of the output as not being a yes/no answer, but something akin to 'What percentage of human beings would agree that this was a good outcome, or be able to be thus conv... (read more)

No10

2+2=4 no matter who's measuring. Right, for myself and my family, and right, for you and yours, may not always be the same.

If the child on the tracks were a bully who had been torturing my own child (which actions I had previously been powerless to prevent by any acceptable means afforded by my society, and assuming I had exhausted all reasonable alternatives), it might very well feel right to let the bully be annihilated by the locomotive.

Right is reducible as an aggregation of sympathetic conditioning; affection for a person, attachment to conceptualization or expected or desired course of events, and so on.

Wow, there's a lot of ground to cover. For everyone who hasn't read Eliezer's previous writings, he talks about something very similar in Creating Friendly Artificial Intelligence, all the way back in 2001 (link = http://www.singinst.org/upload/CFAI/design/structure/external.html). With reference to Andy Wood's comment:

"What claim could any person or group have to landing closer to the one-place function?"

Next obvious question: For purposes of Friendly AI, and for correcting mistaken intuitions, how do we approximate the rightness function? How d... (read more)

Bravo. But:

Because when I say "that big function", and you say "right", we are dereferencing two different pointers to the same unverbalizable abstract computation.

No, the other person is dereferencing a pointer to their big function, which may or may not be the same as yours. This is the one place it doesn't add up to normality: not everyone need have the same function. Eliezer-rightness is objective, a one-place function, but it seems to me the ordinary usage of "right" goes further: it's assumed that everybody means the same thing by, not just "Eliezer-right", but "right". I don't see how this metamorality allows for that, or how any sensible one could. (Not that it bothers me.)

VAuroch
I believe Eliezer is asserting here that "right"=CEV("Eliezer-right"); it is an extrapolation of "Eliezer-right" to "Eliezer_{perfectly-rational}-right". And he asserts, with some justification, that CEV("[Person]-right")="right" for *nearly all* values of [Person]. EDIT: Obviously [Sociopath] does not fit this. s/all/nearly all/g
ialdabaoth
Naive question: by these definitions, [sadistic sociopath] ⊆ [Person]?
VAuroch
No, it isn't. Corrected my above comment to reflect that. I'd consider sociopathy a degenerate case where morality as we understand it does not have any meaningful role in the decision-making process and never has. Any morality which asks to universalize to all humans, including sociopaths, is likely to be pure preference.
ialdabaoth
nod Follow-up question: What is the likelihood that different modal forms of morality are fundamental? I.e., suppose the dichotomy presented by George Lakoff's Moral Politics turns out to describe fundamental local maxima in morality-space, which human minds can imperfectly embody. To math this up a little, suppose the CEV of Morality Attractor 1 computes to "maximize the absolute median QALY", while the CEV of Morality Attractor 2 computes to "maximize the amount by which the median QALY of my in-group exceeds the median QALY of all sapiences as a whole", and neither of those attractors has any particularly universal mathematical reason to favor them. Then, however the FAI searches the Morality domain for a CEV, it is equally likely to settle on some starry-eyed global Uplift as it is to produce nigh-infinite destitute subjects suffering indescribable anguish and despair so that a few Uplifted utility monsters have enough necks to rest their boots on. And before anyone objects that Morality Attractor 2 is too appalling for anyone to seriously advocate, note that it has been the default behavior of most civilized societies for the majority of human history, so it must have SOMETHING going for it. Maybe Morality Attractor 1 just seems more accessible because it's the one advocated by the mother culture that raised us, not because it's actually what most humans tend towards as IQ/g/whatever approaches infinity.
VAuroch
First, I think you are significantly off-base in your contention that Morality Attractor 2 has been implemented in most civilized societies. What you see as Attractor 2 I think is better explained by Attractor 2a: "Maximize the absolute median QALY of the in-group.", and that Clever Arguers throughout history have appealed to this desire by pointing out how much better off the in-group was, compared to a specific acceptable-target outgroup. It is, after all, much easier to provide a relative QALY surplus than an absolute QALY surplus, and our corrupted hardware is not very good at distinguishing the two. As anecdotal evidence, I consider my own morality significantly closer to 2a than 1, but definitely not similar to 2. I would further say that it seems unlikely that the basic moral impulse is actually restricted to an arbitrary ingroup. One of those 'perfect information' aspects inherent in defining the output of the CEV would be knowing the life story of every person on the planet. Which is, if my knowledge of psychology is correct, basically an express ticket into the moral ingroup. This is why the single-child quarter-donation signs work, when appeals to the huge number of children suffering from don't. So overall, I don't find that suggestion plausible. Someone with human-typical psychology who knew every person in existence as well as we know our friends, which is basically the postulated mind whose utility function is the CEV, would inherently value all their QALY.
A1987dM
The mentions of "neurological damage 1" and "neurological damage 2" in this comment seem to suggest that EY would consider sadistic sociopaths No True Scotsmen for these purposes.
GBM40

I'm going to need some help with this one.

It seems to me that the argument goes like this, at first:

  • There is a huge blob of computation; it is a 1-place function; it is identical to right.
  • This computation balances various values.
  • Our minds approximate that computation.

Even this little bit creates a lot of questions. I've been following Eliezer's writings for the past little while, although I may well have missed some key point.

Why is this computation a 1-place function? Eliezer says at first "Here we are treating morality as a 1-place functio... (read more)

TAG
Maybe what's really really right is an idealised form of the Big Blob of Computation. That would be moral realism (or at least species-level relativism). Maybe it isn't and everybody's personal BBoC is where the moral buck stops. That would be subjectivism. Those are two standard positions in metaethics. Nothing has been solved, because we don't know which one is right, and nothing has been dissolved. The traditional problem has just been restated in more sciencey terms.

"You will find yourself saying, "If I wanted to kill someone - even if I thought it was right to kill someone - that wouldn't make it right." Why? Because what is right is a huge computational property- an abstract computation - not tied to the state of anyone's brain, including your own brain."

Coherent Extrapolated Volition (or any roughly similar system) protects against this failure for any specific human, but not in general. E.g., suppose that you use various lawmaking processes to approximate Right(x), and then one person tries to... (read more)

1MarsColony_in10years
This looks correct to me. CEV(my_morality) = CEV(your_morality) = CEV(yudkowsky_morality), because psychologically normal humans start from the same basic moral foundations, so their differing surface moralities extrapolate to the same place. We've all been handed the same moral foundation by evolution, unless we are mentally damaged in certain very specific ways. However, CEV(human_morality) ≠ CEV(klingon_morality) ≠ CEV(idiran_morality). There's no reason for morality to be generalizable beyond psychologically normal humans, since any other species would have been handed at least moderately different moral foundations, even if there happened to be some convergent evolution or something.

This little accident of the Gift doesn't seem like a good reason to throw away the Gift

We've been "gifted" with impulses to replicate our genes, but many of us elect not to. I'm not as old as Steven Pinker was when he seemingly bragged of it, but I've made no progress toward reproducing and don't have any plans for it in the immediate future, though I could easily donate to a sperm bank. I could engage in all sorts of fitness-lowering activities like attending grad school, becoming a Jain monk, engaging in amputative body-modification or commi... (read more)

I found this post a lot more enlightening than the posts that it's a followup to.

TGGP, as far as I understand, Arrow's theorem is an artifact of forcing people to send only ordinal information in voting (and enforcing IIA which throws away that information on the strength of preferences between two alternative which is available from rankings relative to third alternatives). People voting strategically isn't an issue either when you're extrapolating them and reading off their opinions.

"alternative" -> "alternatives"

I think lots of people are misunderstanding the "1-place function" bit. It even took me a bit to understand, and I'm familiar with the functional programming roots of the analogy. The idea is that the "1-place morality" is a closure over (i.e., a reference to) the 2-place function with arguments "person, situation", with the "person" argument implicitly filled in. The 1-place function that you use references yourself. So the "1-place function" is one's subjective morality, and not some objective version. I think that could have been a lot clearer in the post. Not everyone has studied Lisp, Scheme, or Haskell.
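For concreteness, here is a minimal sketch of that closure reading in Python rather than Lisp or Scheme; the names rightness_2place and my_rightness, and the toy value function, are illustrative only, not anything from the post:

    from functools import partial

    # Hypothetical 2-place function: scores a situation against a given person's values.
    def rightness_2place(person, situation):
        return person["values"](situation)   # stand-in for that person's "big computation"

    me = {"values": lambda situation: 1.0}   # toy value function, purely illustrative

    # Closing over the "person" argument yields the 1-place function described above:
    # it silently refers to one particular person's values.
    my_rightness = partial(rightness_2place, me)

    my_rightness("flip the switch")   # only the situation is left to supply

The point of the sketch is only that the person-argument is fixed inside the closure rather than passed in each time.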

Overall I'm a bit disappointed. I thought I was going to learn something. Although you did resolve some confusion I had about the metacircular parts of the reasoning, my conclusions are all the same. Perhaps if I were programming an FAI the explicitness of the argument would be impressive.

As other commenters have brought up, your argument doesn't address how your moral function interacts with others' functions, or how we can go about creating a social, shared morality. Granted, it's a topic for another post (or several) but you could at least acknowledge the issue.

Too much rhetoric (wear black, miracle, etc.), you wandered off the point three too many times, you used incoherent examples, you never actually defended realism, you never defend the assertion of "the big computation", and for that much text there was so little actually said. A poor offering.

This argument sounds too good to be true - when you apply it to your own idea of "right". It also works for, say, a psychopath unable to feel empathy who gets a tremendous kick out of killing. How is there not a problem with that?

0MarsColony_in10years
Well, it isn't the same as a morality that is written into the fabric of the universe or handed down on a stone tablet or something, but it is the "best" we have or could hope to have (whatever "best" even means, in this case). It evaluates the same (or at least the Coherent Extrapolated Volition converges) for all psychologically healthy humans. But if someone has a damaged mind, or their species simply evolved a different set of values, then they would have their own morality, and you could no more argue human morals into them than you could into a rock.

Well, it certainly is a little dissatisfying. It's much better than the nihilistic alternative, though. However, those are coping problems, not problems with the logic itself. If the sky is green then I desire to believe that the sky is green.

that subset of our values that is interpersonal, prosocial, to some extent expected to be agreed-upon, which subset does not always win out in the weighing

Can we just say that evolution gave most of us such an identifiable subset, and declare a name for that? Even so, a key question remains whether we are mistaken in expecting agreement - are we built to actually agree given enough analysis and discussion, or only to mistakenly expect to agree?

[-]Roko20

I agree with Andy Wood and Nick Tarleton. To put what they have said another way, you have taken the 2-place function

Rightness(person,act)

And replaced it with a certain unspecified unary rightness function which I will call "Eliezer's_big_computation( -- )". You have told us informally that we can approximate

Eliezer's_big_computation( X ) = happiness( X ) + survival( X ) + justice( X ) + individuality( X ) + ...

But others may define other "big computations". For example

God's_big_computation( X ) = submission( X ) + Oppression_of_women(... (read more)
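To make that structure concrete, here is a rough Python sketch of a "big computation" as a sum of term functions; the term functions and numbers below are illustrative only, not anything Eliezer or Roko specified:

    # Illustrative term functions over an outcome X (placeholders only).
    def happiness(x):     return x.get("happiness", 0.0)
    def survival(x):      return x.get("survival", 0.0)
    def justice(x):       return x.get("justice", 0.0)
    def individuality(x): return x.get("individuality", 0.0)

    def make_big_computation(terms):
        """Bundle a fixed list of term functions into a 1-place function of X."""
        return lambda x: sum(term(x) for term in terms)

    eliezers_big_computation = make_big_computation(
        [happiness, survival, justice, individuality])

    # A different bundle of terms is simply a different 1-place function.
    other_big_computation = make_big_computation([survival])

    outcome = {"happiness": 1.0, "survival": 0.5, "justice": 0.2, "individuality": 0.3}
    eliezers_big_computation(outcome)   # 2.0
    other_big_computation(outcome)      # 0.5

Nothing in the formalism itself distinguishes one bundle from another; that is the force of the objection here.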

[-]IL00

Let me see if I get this straight:

Our morality is composed of a big computation that includes a list of the things that we value (love, friendship, happiness, ...) and a list of valid moral arguments (contagion backward in time, symmetry, ...). If so, then how do we discover those lists? I guess that the only way is to reflect on our own minds, but if we do that, then how do we know whether a particular value comes from our big computation or is just part of our regular biases? And if our biases are inextricably tangled with The Big Computation, then what hope ... (read more)

I second Behemouth and Nick - what do we do in the mindspace in which individuals' feelings of right and wrong disagree? What if some people think retarded children absolutely should NOT be pulled off the track? Also, what about the pastrami-sandwich dilemma? (What of those who would kill 1 million unknown people with no consequence to themselves for a delicious sandwich?)

But generally, I loved the post. You should write another post on 'Adding Up to Normality.'

Just because I can't resist, a poem about human failing, the judgment of others we deem weaker than ourselves, and the desire to 'do better.' Can we?

"No Second Troy" WB Yeats, 1916 WHY should I blame her that she filled my days With misery, or that she would of late Have taught to ignorant men most violent ways, Or hurled the little streets upon the great, Had they but courage equal to desire? 5 What could have made her peaceful with a mind That nobleness made simple as a fire, With beauty like a tightened bow, a kind That is not natural in an age like this, Being high and solitary and most stern? 10 Why, what could she have done being what she is? Was there another Troy for her to burn?

[-]IL00

P.S.: My great "Aha!" moment from reading this post is the realisation that morality is not just a utility function that maps states of the world to real numbers, but also a set of intuitions for changing that utility function.
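A rough sketch of that picture in Python, with everything (the world-state encoding and the example argument) made up purely for illustration:

    from typing import Callable

    UtilityFn = Callable[[dict], float]               # maps a world-state to a real number
    MoralArgument = Callable[[UtilityFn], UtilityFn]  # rewrites one utility function into another

    class Morality:
        """Toy model: a current utility function plus the arguments accepted for changing it."""
        def __init__(self, utility: UtilityFn, accepted_arguments: list):
            self.utility = utility
            self.accepted_arguments = accepted_arguments

        def update(self, argument: MoralArgument) -> None:
            # Only arguments the system itself endorses get to rewrite the utility function.
            if argument in self.accepted_arguments:
                self.utility = argument(self.utility)

    def add_fairness_term(old: UtilityFn) -> UtilityFn:
        # Illustrative "valid moral argument": accept fairness as a value and fold it in.
        return lambda world: old(world) + world.get("fairness", 0.0)

    base = Morality(utility=lambda world: world.get("happiness", 0.0),
                    accepted_arguments=[add_fairness_term])
    base.update(add_fairness_term)
    base.utility({"happiness": 1.0, "fairness": 0.5})   # 1.5

On this toy picture, the "big computation" is not just the utility function but the whole object, accepted arguments and all.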

Added a section on non-universalizability:

If you hoped that morality would be universalizable - sorry, that one I really can't give back. Well, unless we're just talking about humans. Between neurologically intact humans, there is indeed much cause to hope for overlap and coherence; and a great and reasonable doubt as to whether any present disagreement is really unresolvable, even if it seems to be about "values". The obvious reason for hope is the psychological unity of humankind, and the intuitions of symmetry, universalizability, and simplic
... (read more)

Dear Eliezer, First of all, great post, thank you, I truly love you Eli!! It was really the kind of beautiful endpoint in your dance I was waiting for, and it is very much along the lines of my own reasoning, just a lot more detailed. I also think this could be labeled metametamorality, therefore some of the justified complaints do not yet apply. But the people complaining about different moral preferences are doing so with their own morality - what else could they be using? - and in doing so they are acting according to the arguments of this post. Metametamor... (read more)

P.S. So my addition is really: choose a stable value structure that feels right, try to maximize it, try to make it better, and change it when that feels right. I have my own high-level suggestion of Beauty, Truth and the Good, and I later discovered that Plato and a lot of others seem to argue for the same three...

There are some good thoughts here, but I don't think the story is a correct and complete account of metamorality (or as the rest of the world calls it: metaethics). I imagine that there will be more posts on Eliezer's theory later and more opportunities to voice concerns, but for now I just want to take issue with the account of 'shouldness' flowing back through the causal links.

'Shouldness' doesn't always flow backwards in the way Eliezer mentioned. E.g., suppose that in order to push the button, you need to shoot someone who will fall down on it. This wou... (read more)

On reflection, there should be a separate name for the space of arguments that change our terminal values. Using "metaethics" to indicate beliefs about the nature of (ontology of) morality would free up "metamorals" to indicate those arguments that change our terminal values. So I bow to Zubon and standard usage - even though it still sounds wrong to me.

Toby, the case of needing to shoot someone who will fall down on the button is of course very easy for a consequentialist to handle; wrongness flows backward from the shooting, as ri... (read more)

As I've stated before, we are all morally obliged to prevent Eliezer from programming an AI. For according to this system, he is morally obliged to make his AI instantiate his personal morality. But it is quite impossible that the complicated calculation in Eliezer's brain should be exactly the same as the one in any of us: and so by our standards, Eliezer's morality is immoral. And this opinion is subjectively objective, i.e. his morality is immoral and would be even if all of us disagreed. So we are all morally obliged to prevent him from inflicting his immoral AI on us.

5[anonymous]
This is a really, really hasty non-sequitur. Eliezer's morality is probably extremely similar to mine; thus, the world would be a much, much better place, even according to my specification, with an AI running Eliezer's morality as opposed to no AI running at all (or, worse, a paperclip maximizer). Eliezer's morality is absolutely not immoral; it's my morality ± 1% error, as opposed to some other nonhuman goal structure which would be unimaginably bad on my scale.

Suggested summary: "There is nothing else." That is the key sentence. After much discussion of morals and metas, it comes down to: "You go on with the same morals as before, and the same moral arguments as before." The insight offered is that there is no deeper insight to offer. The recursion will bottom out, so bite the bullet and move on.

Yet another agreement on the 1-Place and 2-Place problem, and I read it after the addition. CEV goes around most of that for neurologically intact humans, but the principle of "no universall... (read more)

wrongness flows backward from the shooting, as rightness flows backward from the button, and the wrongness outweighs the rightness.

I suppose you could say this, but if I understand you correctly, then it goes against common usage. Usually those who study ethics would say that rightness is not the type of thing that can add with wrongness to get net wrongness (or net rightness for that matter). That is, if they were talking about that kind of thing, they wouldn't use the word 'rightness'. The same goes for 'should' or 'ought'. Terms used for this kind of st... (read more)

[-]Roko00

Eliezer: "if you were stepping outside the human and hoping for moral arguments that would persuade any possible mind, even a mind that just wanted to maximize the number of paperclips in the universe, then sorry - the space of possible mind designs is too large to permit universally compelling arguments."

After thinking more about it, I might be wrong: actually the calculation might end up giving the same result for every human being.

Caledonian: what kind of motivations do you have?

Okay, for the future I'll just delete the content-free parts of Caledonian's posts, like those above. There do seem to be many readers who would prefer that he not be banned outright. But given the otherwise high quality of the comments on Overcoming Bias, I really don't think it's a good idea to let him go on throwing up on the blog.

Watching the ensuing commentary, I'm drawn to wishfully imagine a highly advanced Musashi, wielding his high-dimensional blade of rationality such that in one stroke he delineates and separates the surrounding confusion from the nascent clarity. Of course no such vorpal katana could exist, for if it did, it would serve only to better clear the way for its successors.

I see a preponderance of viewpoints representing, in effect, the belief that "this is all well and good, but how will this guide me to the one true prior, from which Archimedean point one ... (read more)

Caledonian: uh... he didn't say you couldn't make arguments about all possible minds; he was saying you couldn't construct an argument so persuasive, so convincing, that every possible mind, no matter how unusual its nature, would automatically be convinced by it.

It's not a matter of talking about minds, it's a matter of talking to minds.

Mathematicians figure out things about sets. But they're not trying to convince the sets themselves about those things. :)


Roko: You think you can convince a paperclip maximizer to value human life? Or do you think paperclip maximizers are impossible?

Eliezer: It's because when I say right, I am referring to a 1-place function

Like many others, I fall over at this point. I understand that Morality_8472 has a definite meaning, and therefore it's a matter of objective fact whether any act is right or wrong according to that morality. The problem is why we should choose it over Morality_11283.

Of course you can say, "according to Morality_8472, Morality_8472 is correct" but that's hardly helpful.

Ultimately, I think you've given us another type of anti-realist relativism.

Eliezer: But if you were ste... (read more)

Caledonian: He isn't using "too-big" in the way you are interpreting it.

The point is not: Mindspace has a size X, X > Y, and any set of minds of size > Y cannot admit universal arguments.

The point is: For any putative universal argument you can cook up, I can cook up a mind design that isn't convinced by it.

The reason we say it is too big is that there are subsets of Mindspace that do admit universally compelling arguments, such as (we hope) neurologically intact humans.