Is this a fair summary?
The answer to the clever meta-moral question, “But why should we care about morality?” is just “Because when we say morality, we refer to that-which-we-care-about - and, not to belabor the point, but we care about what we care about. Whatever you think you care about, which isn’t morality, I’m calling that morality also. Precisely which things are moral and which are not is a difficult question - but there is no non-trivial meta-question.”
There is a non-trivial point in this summary, which is the meaning of "we." I could imagine a possible world in which the moral intuitions of humans diverge widely enough that there isn't anything that could reasonably be called a coherent extrapolated volition of humanity (and I worry that I already live there).
If we reprogrammed you to count paperclips instead, it wouldn't feel like different things having the same kind of motivation behind it. It wouldn't feel like doing-what's-right for a different guess about what's right. It would feel like doing-what-leads-to-paperclips.
Um, how do you know?
I think you and Alicorn may be talking past each other somewhat.
Throughout my life, it seems that what I morally value has varied more than what rightness feels like - just as it seems that what I consider status-raising has changed more than what rising in status feels like, and what I find physically pleasurable has changed more than what physical pleasures feel like. It's possible that the things my whole person is optimizing for have not changed at all, that my subjective feelings are a direct reflection of this, and that my evaluation of a change of content is merely a change in my causal model of the production of the desiderata (I thought voting for Smith would lower unemployment, but now I think voting for Jones would, etc.) But it seems more plausible to me that
1) the whole me is optimizing for various things, and these things change over time,
2) and that the conscious me is getting information inputs which it can group together by family resemblance, and which can reinforce or disincentivize its behavior.
Imagine a ship which is governed by an anarchic assembly belowdecks and captained by an employee of theirs whom they motivate through in-kind bonuses. So the assembly...
This comment expands on how you'd go about reprogramming someone in this way with another layer of granularity, which is certainly interesting on its own merits, but it doesn't strongly support your assertion about what it would feel like to be that someone. What makes you think this is how qualia work? Have you been performing sinister experiments in your basement? Do you have magic counterfactual-luminosity-powers?
I think Eliezer is simply suggesting that qualia don't in fact exist in a vacuum. Green feels the way it does partly because it's the color of chlorophyll. In a universe where plants had picked a different color for chlorophyll (melanophyll, say), with everything else (per impossibile) held constant, we would associate an at least slightly different quale with green and with black, because part of how colors feel is that they subtly remind us of the things that are most often colored that way. Similarly, part of how 'goodness' feels is that it imperceptibly reminds us of the extension of good; if that extension were dramatically different, then the feeling would (barring any radical redesigns of how associative thought works) be different too. In a universe where the smallest birds were ten feet tall, thinking about 'birdiness' would involve a different quale for the same reason.
Consider Bob. Bob, like most unreflective people, settles many moral questions by "am I disgusted by it?" Bob is disgusted by, among other things, feces, rotten fruit, corpses, maggots, and men kissing men. Internally, it feels to Bob like the disgust he feels at one of those stimuli is the same as the disgust he feels at the other stimuli, and brain scans show that they all activate the insula in basically the same way.
Bob goes through aversion therapy (or some other method) and eventually his insula no longer activates when he sees men kissing men.
When Bob remembers his previous reaction to those stimuli, I imagine he would remember being disgusted, but not be disgusted when he remembers the stimuli. His positions on, say, same-sex marriage or the acceptability of gay relationships have changed, and he is aware that they have changed.
Do you think this example agrees with your account? If/where it disagrees, why do you prefer your account?
I think this is really a sorites problem. If you change what's delicious only slightly, then deliciousness itself seems to be unaltered. But if you change it radically — say, if circuits similar to your old gustatory ones now trigger when and only when you see a bright light — then it seems plausible that the experience itself will be at least somewhat changed, because 'how things feel' is affected by our whole web of perceptual and conceptual associations. There isn't necessarily any sharp line where a change in deliciousness itself suddenly becomes perceptible; but it's nevertheless the case that the overall extension of 'delicious' (like 'disgusting' and 'moral') has some effect on how we experience deliciousness. E.g., deliciousness feels more foodish than lightish.
Speaking from personal experience, I can say that he's right.
So, you introspect the way that he introspects. Do all humans? Would all humans need to introspect that way for it to do the work that he wants it to do?
The other side of this is that I would expect my brain to NOTICE its actual goals. If my goal is to make paperclips, I will think "I should do this because it makes paperclips", instead of "I should do this because it makes people happy". My brain doesn't have a generic "I should do this" emotion, as near as I can tell - it just has ways of signalling that an activity will accomplish my goals.
Iron deficiency feels like wanting ice, for clever, verbal reasons. Not being iron deficient doesn't feel like anything. My brain did not notice that it was trying to get iron - it didn't even notice it was trying to get ice; it made up reasons according to which ice was an instrumental value for some terminal goal or other.
The standard religious reply to the baby-slaughter dilemma goes something like this:
Sure, if G-d commanded us to slaughter babies, then killing babies would be good. And if "2+2=3" was a theorem of PA, then "2+2=3" would be true. But G-d logically cannot command us to do a bad thing, any more than PA can prove something that doesn't follow from its axioms. (We use "omnipotent" to mean "really really powerful", not "actually omnipotent", which isn't even a coherent concept. G-d can't make a stone so heavy he can't lift it, draw a square circle, or be evil.) Religion has destroyed my humanity exactly as much as studying arithmetic has destroyed your numeracy. (Please pay no attention to the parts of the Bible where G-d commands exactly that.)
It does choose a horn, but it's the other one, "things are moral because G-d commands them". It just denies the connotation that there exists a possible Counterfactual!G-d which could decide that Real!evil things are Counterfactual!good; in all possible worlds, G-d either wants the same thing or is something different mistakenly called "G-d". (Yeah, there's a possible world where we're ruled by an entity who pretends to be G-d and so we believe that we should kill babies. And there's a possible world where you're hallucinating this conversation.)
Or you could say it claims equivalence. Is this road sign a triangle because it has three sides, or does it have three sides because it is a triangle? If you pick the latter, does that mean that if triangles had four sides, the sign would change shape to have four sides? If you pick the former, does that mean that I can have three sides without being a triangle? (I don't think this one is quite fair, because we can imagine a powerful creator who wants immoral things.)
Three possible responses to the atheist response:
Sure. Not believing has bad consequences - you're wrong as a matter of fact, you don't get special believ
Obvious further atheist reply to the denial of counterfactuals: If God's desires don't vary across possible worlds there exists a logical abstraction which only describes the structure of the desires and doesn't make mention of God, just like if multiplication-of-apples doesn't vary across possible worlds, we can strip out the apples and talk about the multiplication.
I don't think it's incompatible. You're supposed to really trust the guy because he's literally made of morality, so if he tells you something that sounds immoral (and you're not, like, psychotic) of course you assume that it's moral and the error is on your side. Most of the time you don't get direct exceptional divine commands, so you don't want to kill any kids. Wouldn't you kill the kid if an AI you knew to be Friendly, smart, and well-informed told you "I can't tell you why right now, but it's really important that you kill that kid"?
If your objection is that Mr. Orders-multiple-genocides hasn't shown that kind of evidence he's morally good, well, I got nuthin'.
You're supposed to really trust the guy because he's literally made of morality, so if he tells you something that sounds immoral (and you're not, like, psychotic) of course you assume that it's moral and the error is on your side.
What we have is an inconsistent set of four assertions:
At least one of these has to be rejected. Abraham (provisionally) rejects 1; once God announces 'J/K,' he updates in favor of rejecting 2, on the grounds that God didn't really want him to kill his son, though the Voice really was God.
The problem with this is that rejecting 1 assumes that my confidence in my foundational moral principles (e.g., 'thou shalt not murder, self!') is weaker than my confidence in the conjunction of:
But it's hard to believe that I'm more confident in the divinity of a certain class of Voices than in my moral axioms, especially if my confidenc...
Well, if we're shifting from our idealized post-Protestant-Reformation Abraham to the original Abraham-of-Genesis folk hero, then we should probably bracket all this Medieval talk about God's omnibenevolence and omnipotence. The Yahweh of Genesis is described as being unable to do certain things, as lacking certain items of knowledge, and as making mistakes. Shall not the judge of all the Earth do right?
As Genesis presents the story, the relevant question doesn't seem to be 'Does my moral obligation to obey God outweigh my moral obligation to protect my son?' Nor is it 'Does my confidence in my moral intuitions outweigh my confidence in God's moral intuitions plus my understanding of God's commands?' Rather, the question is: 'Do I care more about obeying God than about my most beloved possession?' Notice there's nothing moral at stake here at all; it's purely a question of weighing loyalties and desires, of weighing the amount I trust God's promises and respect God's authority against the amount of utility (love, happiness) I assign to my son.
The moral rights of the son, and the duties of the father, are not on the table; what's at issue is whether Abraham's such a good soldier-servant that he's willing to give up his most cherished possessions (which just happen to be sentient persons). Replace 'God' with 'Satan' and you get the same fealty calculation on Abraham's part, since God's authority, power, and honesty, not his beneficence, are what Abraham has faith in.
To my knowledge, this is a common theory, although I don't know whether it's standard. There are a number of references in the Tanakh to human sacrifice, and even if the early Jews didn't practice (and had no cultural memory of having once practiced) human sacrifice, its presence as a known phenomenon in the Levant could have motivated the story. I can imagine several reasons:
(a) The writer was worried about human sacrifice, and wanted a narrative basis for forbidding it.
(b) The writer wasn't worried about actual human sacrifice, but wanted to clearly distinguish his community from Those People who do child sacrifice.
(c) The writer didn't just want to show a difference between Jews and human-sacrifice groups, but wanted to show that Jews were at least as badass. Being willing to sacrifice humans is an especially striking and impressive sign of devotion to a deity, so a binding-of-Isaac-style story serves to indicate that the Founding Figure (and, by implicit metonymy, the group as a whole, or its exemplars) is willing to give proof of that level of devotion, but is explicitly not required to do so by the god. This is an obvious win-win -- we don't have to actually kill anybod
Followup to: Mixed Reference: The Great Reductionist Project
Suppose three people find a pie - that is, three people exactly simultaneously spot a pie which has been exogenously generated in unclaimed territory. Zaire wants the entire pie; Yancy thinks that 1/3 each is fair; and Xannon thinks that fair would be taking into equal account everyone's ideas about what is "fair".
I myself would say unhesitatingly that a third of the pie each is fair. "Fairness", as an ethical concept, can get a lot more complicated in more elaborate contexts. But in this simple context, a lot of other things that "fairness" could depend on, like work inputs, have been eliminated or made constant. Assuming no relevant conditions other than those already stated, "fairness" simplifies to the mathematical procedure of splitting the pie into equal parts; and when this logical function is run over physical reality, it outputs "1/3 for Zaire, 1/3 for Yancy, 1/3 for Xannon".
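In this stripped-down scenario, the "logical function" being run over physical reality can be put in toy code. This is purely an illustration of the idea, not anything from the post; the function name and types are invented, and exact fractions stand in for the mathematical procedure of equal division:

```python
from fractions import Fraction

def fair_split(pie: Fraction, claimants: list[str]) -> dict[str, Fraction]:
    """Toy 'fairness' function: with work inputs and all other context
    held constant, fairness simplifies to an equal division of the pie
    among the claimants."""
    share = pie / len(claimants)
    return {name: share for name in claimants}

# Running the logical function over the physical starting scenario:
shares = fair_split(Fraction(1), ["Zaire", "Yancy", "Xannon"])
# outputs 1/3 of the pie for each of Zaire, Yancy, and Xannon
```

The function is defined independently of what any of the three claimants believe about it, which is the point: Zaire wanting the whole pie doesn't change what this function outputs, any more than wanting 2+2 to equal 5 changes the sum.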
Or to put it another way - just like we get "If Oswald hadn't shot Kennedy, nobody else would've" by running a logical function over a true causal model - similarly, we can get the hypothetical 'fair' situation, whether or not it actually happens, by running the physical starting scenario through a logical function that describes what a 'fair' outcome would look like:
So am I (as Zaire would claim) just assuming-by-authority that I get to have everything my way, since I'm not defining 'fairness' the way Zaire wants to define it?
No more than mathematicians are flatly ordering everyone to assume-without-proof that two different numbers can't have the same successor. For fairness to be what everyone thinks is "fair" would be entirely circular, structurally isomorphic to "Fzeem is what everyone thinks is fzeem"... or like trying to define the counting numbers as "whatever anyone thinks is a number". It only even looks coherent because everyone secretly already has a mental picture of "numbers" - because their brain already navigated to the referent. But something akin to axioms is needed to talk about "numbers, as opposed to something else" in the first place. Even an inchoate mental image of "0, 1, 2, ..." implies the axioms no less than a formal statement - we can extract the axioms back out by asking questions about this rough mental image.
Similarly, the intuition that fairness has something to do with dividing up the pie equally, plays a role akin to secretly already having "0, 1, 2, ..." in mind as the subject of mathematical conversation. You need axioms, not as assumptions that aren't justified, but as pointers to what the heck the conversation is supposed to be about.
Multiple philosophers have suggested that this stance seems similar to "rigid designation", i.e., when I say 'fair' it intrinsically, rigidly refers to something-to-do-with-equal-division. I confess I don't see it that way myself - if somebody thinks of Euclidean geometry when you utter the sound "num-berz" they're not doing anything false, they're associating the sound to a different logical thingy. It's not about words with intrinsically rigid referential power, it's that the words are window dressing on the underlying entities. I want to talk about a particular logical entity, as it might be defined by either axioms or inchoate images, regardless of which word-sounds may be associated to it. If you want to call that "rigid designation", that seems to me like adding a level of indirection; I don't care about the word 'fair' in the first place, I care about the logical entity of fairness. (Or to put it even more sharply: since my ontology does not have room for physics, logic, plus designation, I'm not very interested in discussing this 'rigid designation' business unless it's being reduced to something else.)
Once issues of justice become more complicated and all the contextual variables get added back in, we might not be sure if a disagreement about 'fairness' reflects:
There's a lot of people who feel that this picture leaves out something fundamental, especially once we make the jump from "fair" to the broader concept of "moral", "good", or "right". And it's this worry about leaving-out-something-fundamental that I hope to address next...
...but please note, if we confess that 'right' lives in a world of physics and logic - because everything lives in a world of physics and logic - then we have to translate 'right' into those terms somehow.
And that is the answer Susan should have given - if she could talk about sufficiently advanced epistemology, sufficiently fast - to Death's entire statement:
You think so? Then take the universe and grind it down to the finest powder and sieve it through the finest sieve and then show me one atom of justice, one molecule of mercy. And yet — Death waved a hand. And yet you act as if there is some ideal order in the world, as if there is some ... rightness in the universe by which it may be judged.
"But!" Susan should've said. "When we judge the universe we're comparing it to a logical referent, a sort of thing that isn't in the universe! Why, it's just like looking at a heap of 2 apples and a heap of 3 apples on a table, and comparing their invisible product to the number 6 - there isn't any 6 if you grind up the whole table, even if you grind up the whole universe, but the product is still 6, physico-logically speaking."
If you require that Rightness be written on some particular great Stone Tablet somewhere - to be "a light that shines from the sky", outside people, as a different Terry Pratchett book put it - then indeed, there's no such Stone Tablet anywhere in our universe.
But there shouldn't be such a Stone Tablet, given standard intuitions about morality. This follows from the Euthyphro Dilemma out of ancient Greece.
The original Euthyphro dilemma goes, "Is it pious because it is loved by the gods, or loved by the gods because it is pious?" The religious version goes, "Is it good because it is commanded by God, or does God command it because it is good?"
The standard atheist reply is: "Would you say that it's an intrinsically good thing - even if the event has no further causal consequences which are good - to slaughter babies or torture people, if that's what God says to do?"
If we can't make it good to slaughter babies by tweaking the state of God, then morality doesn't come from God; so goes the standard atheist argument.
But if you can't make it good to slaughter babies by tweaking the physical state of anything - if we can't imagine a world where some great Stone Tablet of Morality has been physically rewritten, and what is right has changed - then this is telling us that...
(drumroll)
...what's "right" is a logical thingy rather than a physical thingy, that's all. The mark of a logical validity is that we can't concretely visualize a coherent possible world where the proposition is false.
And I mention this in hopes that I can show that it is not moral anti-realism to say that moral statements take their truth-value from logical entities. Even in Ancient Greece, philosophers implicitly knew that 'morality' ought to be such an entity - that it couldn't be something you found when you ground the Universe to powder, because then you could resprinkle the powder and make it wonderful to kill babies - though they didn't know how to say what they knew.
There's a lot of people who still feel that Death would be right, if the universe were all physical; that the kind of dry logical entity I'm describing here, isn't sufficient to carry the bright alive feeling of goodness.
And there are others who accept that physics and logic is everything, but who - I think mistakenly - go ahead and also accept Death's stance that this makes morality a lie, or, in lesser form, that the bright alive feeling can't make it. (Sort of like people who accept an incompatibilist theory of free will, also accept physics, and conclude with sorrow that they are indeed being controlled by physics.)
In case anyone is bored that I'm still trying to fight this battle, well, here's a quote from a recent Facebook conversation with a famous early transhumanist:
It would actually be quite surprisingly helpful for increasing the percentage of people who will participate meaningfully in saving the planet, if there were some reliably-working standard explanation for why physics and logic together have enough room to contain morality. People who think that reductionism means we have to lie to our children, as Pratchett's Death advocates, won't be much enthused about the Center for Applied Rationality. And there are a fair number of people out there who still advocate proceeding in the confidence of ineffable morality to construct sloppily designed AIs.
So far I don't know of any exposition that works reliably - of the thesis that morality, including our intuitions about whether things really are justified and so on, is preserved in the analysis to physics plus logic; that morality has been explained rather than explained away. Nonetheless I shall now take another stab at it, starting with a simpler bright feeling:
When I see an unusually neat mathematical proof, unexpectedly short or surprisingly general, my brain gets a joyous sense of elegance.
There's presumably some functional slice through my brain that implements this emotion - some configuration subspace of spiking neural circuitry which corresponds to my feeling of elegance. Perhaps I should say that elegance is merely about my brain switching on its elegance-signal? But there are concepts like Kolmogorov complexity that give more formal meanings of "simple" than "Simple is whatever makes my brain feel the emotion of simplicity." Anything you do to fool my brain wouldn't make the proof really elegant, not in that sense. The emotion is not free of semantic content; we could build a correspondence theory for it and navigate to its logical+physical referent, and say: "Sarah feels like this proof is elegant, and her feeling is true." You could even say that certain proofs are elegant even if no conscious agent sees them.
My description of 'elegance' admittedly did invoke agent-dependent concepts like 'unexpectedly' short or 'surprisingly' general. It's almost certainly true that with a different mathematical background, I would have different standards of elegance and experience that feeling on somewhat different occasions. Even so, that still seems like moving around in a field of similar referents for the emotion - much more similar to each other than to, say, the distant cluster of 'anger'.
Rewiring my brain so that the 'elegance' sensation gets activated when I see mathematical proofs where the words have lots of vowels - that wouldn't change what is elegant. Rather, it would make the feeling be about something else entirely; different semantics with a different truth-condition.
Indeed, it's not clear that this thought experiment is, or should be, really conceivable. If all the associated computation is about vowels instead of elegance, then from the inside you would expect that to feel vowelly, not feel elegant...
...which is to say that even feelings can be associated with logical entities. Though unfortunately not in any way that will feel like qualia if you can't read your own source code. I could write out an exact description of your visual cortex's spiking code for 'blue' on paper, and it wouldn't actually look blue to you. Still, on the higher level of description, it should seem intuitively plausible that if you tried rewriting the relevant part of your brain to count vowels, the resulting sensation would no longer have the content or even the feeling of elegance. It would compute vowelliness, and feel vowelly.
My feeling of mathematical elegance is motivating; it makes me more likely to search for similar such proofs later and go on doing math. You could construct an agent that tried to add more vowels instead, and if the agent asked itself why it was doing that, the resulting justification-thought wouldn't feel like because-it's-elegant, it would feel like because-it's-vowelly.
In the same sense, when you try to do what's right, you're motivated by things like (to yet again quote Frankena's list of terminal values):
If we reprogrammed you to count paperclips instead, it wouldn't feel like different things having the same kind of motivation behind it. It wouldn't feel like doing-what's-right for a different guess about what's right. It would feel like doing-what-leads-to-paperclips.
And I quoted the above list because the feeling of rightness isn't about implementing a particular logical function; it contains no mention of logical functions at all; in the environment of evolutionary ancestry nobody had heard of axiomatization; these feelings are about life, consciousness, etcetera. If I could write out the whole truth-condition of the feeling in a way you could compute, you would still feel Moore's Open Question: "I can see that this event is high-rated by logical function X, but is X really right?" - since you can't read your own source code and the description wouldn't be commensurate with your brain's native format.
"But!" you cry. "But, is it really better to do what's right, than to maximize paperclips?" Yes! As soon as you start trying to cash out the logical function that gives betterness its truth-value, it will output "life, consciousness, etc. >_B paperclips". And if your brain were computing a different logical function instead, like makes-more-paperclips, it wouldn't feel better, it would feel moreclippy.
But is it really justified to keep our own sense of betterness? Sure, and that's a logical fact - it's the objective output of the logical function corresponding to your experiential sense of what it means for something to be 'justified' in the first place. This doesn't mean that Clippy the Paperclip Maximizer will self-modify to do only things that are justified; Clippy doesn't judge between self-modifications by computing justifications, but rather, computing clippyflurphs.
But isn't it arbitrary for Clippy to maximize paperclips? Indeed; once you implicitly or explicitly pinpoint the logical function that gives judgments of arbitrariness their truth-value - presumably, revolving around the presence or absence of justifications - then this logical function will objectively yield that there's no justification whatsoever for maximizing paperclips (which is why I'm not going to do it) and hence that Clippy's decision is arbitrary. Conversely, Clippy finds that there's no clippyflurph for preserving life, and hence that it is unclipperiffic. But unclipperifficness isn't arbitrariness any more than the number 17 is a right triangle; they're different logical entities pinned down by different axioms, and the corresponding judgments will have different semantic content and feel different. If Clippy is architected to experience that-which-you-call-qualia, Clippy's feeling of clippyflurph will be structurally different from the way justification feels, not just red versus blue, but vision versus sound.
But surely one shouldn't praise the clippyflurphers rather than the just? I quite agree; and as soon as you navigate referentially to the coherent logical entity that is the truth-condition of should - a function on potential actions and future states - it will agree with you that it's better to avoid the arbitrary than the unclipperiffic. Unfortunately, this logical fact does not correspond to the truth-condition of any meaningful proposition computed by Clippy in the course of how it efficiently transforms the universe into paperclips, in much the same way that rightness plays no role in that-which-is-maximized by the blind processes of natural selection.
Where moral judgment is concerned, it's logic all the way down. ALL the way down. Any frame of reference where you're worried that it's really no better to do what's right than to maximize paperclips... well, that really part has a truth-condition (or what does the "really" mean?) and as soon as you write out the truth-condition you're going to end up with yet another ordering over actions or algorithms or meta-algorithms or something. And since grinding up the universe won't and shouldn't yield any miniature '>' tokens, it must be a logical ordering. And so whatever logical ordering it is you're worried about, it probably does produce 'life > paperclips' - but Clippy isn't computing that logical fact any more than your pocket calculator is computing it.
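The structure of this point - two orderings, each an objective output of its own logical function, with each agent only ever computing its own - can be sketched in toy code. Everything here is invented for illustration (the function names, the outcome dictionaries, the numbers); it is a cartoon of the argument, not a model of any real agent:

```python
# Two logical functions, each inducing an ordering over outcomes.
# Each ordering is objective once the function is pinned down; the
# question is only which function a given agent actually computes.

def betterness(outcome: dict) -> tuple:
    # Stand-in for the 'right' ordering: ranks by life and
    # consciousness, never mentioning paperclips.
    return (outcome["lives"], outcome["conscious_experiences"])

def clippyness(outcome: dict) -> tuple:
    # Clippy's ordering mentions only paperclips.
    return (outcome["paperclips"],)

flourishing = {"lives": 10**9, "conscious_experiences": 10**12, "paperclips": 0}
clip_world = {"lives": 0, "conscious_experiences": 0, "paperclips": 10**30}

# Both of the following are objective logical facts - but neither agent
# computes the other's function, so neither fact moves the other agent:
assert betterness(flourishing) > betterness(clip_world)  # life > paperclips
assert clippyness(clip_world) > clippyness(flourishing)  # clip_world is moreclippy
```

Note that nothing in `clippyness` takes `betterness` as an input or vice versa: the fact that `betterness` outputs 'life > paperclips' is true in every world, but it only affects a world where something is computing it.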
Logical facts have no power to directly affect the universe except when some part of the universe is computing them, and morality is (and should be) logic, not physics.
Which is to say:
Part of the sequence Highly Advanced Epistemology 101 for Beginners
Next post: "Standard and Nonstandard Numbers"
Previous post: "Mixed Reference: The Great Reductionist Project"