Meta-Honesty: Firming Up Honesty Around Its Edge-Cases

(Cross-posted from Facebook.)

0: Tl;dr.

  • A problem with the obvious-seeming "wizard's code of honesty" aka "never say things that are false" is that it draws on high verbal intelligence and unusually permissive social embeddings. I.e., you can't always say "Fine" to "How are you?" This has always made me feel very uncomfortable about the privilege implicit in recommending that anyone else be more honest.
  • Genuinely consistent Glomarization (i.e., consistently saying "I cannot confirm or deny" whether or not there's anything to conceal) does not work in principle because there are too many counterfactual selves who might want to conceal something.
  • Glomarization also doesn't work in practice if the Nazis show up at your door asking if you have fugitive Jews in your attic.
  • If you would lie to Nazis about fugitive Jews, then absolute truthsaying can't be the whole story, which makes "never say things that are false" feel to me like a shaky foundation in that it is literally false, and something less shaky would be nice.
  • Robin Hanson's "automatic norms" problem suggests different people might have very different ideas about what constitutes a good person's normal honesty, without realizing that they have very different ideas. Perceived violations of an honesty norm can blow up and cause interpersonal conflict. This is not a process that reliably goes well when left to itself.

A rule which seems to me more "normal" than the wizard's literal-truth rule, more like a version of standard human honesty reinforced around the edges, would be as follows:

"Don't lie when a normal highly honest person wouldn't, and furthermore, be honest when somebody asks you which hypothetical circumstances would cause you to lie or mislead—absolutely honest, if they ask under this code. However, questions about meta-honesty should be careful not to probe object-level information."

I've been tentatively calling this "meta-honesty", but better terminology is solicited.


1: Glomarization can't practically cover many cases.

Suppose that last night I helped hide a fugitive marijuana seller from the Feds. You ask me what I was doing last night, and I, preferring not to emit false statements, reply, "I can't confirm or deny what I was doing last night."

We now have two major problems here:

  • Even on an ordinary day, if you casually ask me what I was doing last night, I theoretically ought to answer "I can't confirm or deny what I was doing last night" because some of my counterfactual selves were hiding fugitive marijuana sellers from the Feds. If I don't do this consistently, and I actually was hiding fugitives last night, I can't Glomarize without revealing information. But then the number of counterfactuals I have to worry about is too large for me to ever answer anything.
  • If the Feds actually ask you this question, they will not be familiar with your previous practice of Glomarization and will probably not be very impressed with your answer.

This doesn't mean that Glomarization is never helpful. If you ask me whether my submarine is carrying nuclear weapons, or whether I'm secretly the author of "The Waves Arisen", I think most listeners would understand if I replied, "I have a consistent policy of not saying which submarines are carrying nuclear weapons, nor whether I wrote or helped write a document that doesn't have my name on it." An ordinary honest person does not need to lie on these occasions because Glomarization is both possible in principle and workable in practice, so one should adopt a consistent Glomarization rather than lie.

But that doesn't work for hiding fugitives. Or any other occasion where an ordinary high-honesty person would consider it obligatory to lie, in answer to a question where the asker is not expecting evasion or Glomarization.

(I'm sure some people reading this think it's all very cute for me to be worried about the fact that I wouldn't tell the truth all the time. Feel free to state this in the comments so that we aren't confused about who's using which norms. Smirking about it, or laughing, especially conveys important info about you.)


2: The law of no literal falsehood.

One formulation of my automatic norm for honesty, the one that feels like the obvious default from which any departure requires a crushingly heavy justification, was given by Ursula K. Le Guin in A Wizard of Earthsea:

He told his tale, and one man said, "But who saw this wonder of dragons slain and dragons baffled? What if he—"

"Be still!" the Head Isle-Man said roughly, for he knew, as did most of them, that a wizard may have subtle ways of telling the truth, and may keep the truth to himself, but that if he says a thing the thing is as he says. For that is his mastery.

Or in simpler summary, this policy says:

Don't say things that are literally false.

Or with some of the unspoken finicky details added back in: "Don't say things that you believe to be literally false in a context where people will (with reasonably high probability) persistently believe that you believe them to be true." Jokes are still allowed, even jokes that only get revealed as jokes ten seconds later. Or quotations, etcetera ad obviousum.

The no-literal-falsehood code of honesty has three huge advantages:

  • To the extent people observe you to consistently practice it, it is easier for you to communicate believably when you want to say a thing. They may still not be able to trust you perfectly, but the hypothetical is "Did this person break their big-deal code of honesty?" rather than "Did this person tell an ordinary lie?" One would hope this would be good for coordination and other interpersonal issues, though this might only be a fond wish on my part.
  • Most people, even most unusually honest people, wander about their lives in a fog of internal distortions of reality. Repeatedly asking yourself of every sentence you say aloud to another person, "Is this statement actually and literally true?", helps you build a skill for navigating out of your internal smog of not-quite-truths. For that is our mastery.
  • It's good for your soul. At least, it's good for my soul for reasons I'd expect to generalize if I'm not just committing the typical-mind fallacy.

From Frank Herbert's Dune Messiah, writing about Truthsayers, people who had trained to extreme heights the ability to tell when others were lying and who also never lied themselves:

"It requires that you have an inner agreement with truth which allows ready recognition."

This is probably not true in normal human practice for detecting other people's lies. I'd expect a lot of con artists are better than a lot of honest people at that.

But the phrase "It requires you have an inner agreement with truth which allows ready recognition" is something that resonates strongly with me. It feels like it points to the part that's good for your soul. Saying only true things is a kind of respect for the truth, a pact that you forge with it.


3: The privilege of truthtelling.

I've never suggested to anyone else that they adopt the wizard's code of honesty.

The code of literal truth only lets people navigate anything like ordinary social reality to the extent that they are very fast on their verbal feet, and can respond to the question "How are you?" by saying "Getting along" instead of "Horribly" or with an awkward silence while they try to think of something technically true. (Because often "I'm fine" is false, you see. If this has never bothered you then you are perhaps not in the target audience for this essay.)

So I haven't advocated any particular code of honesty before now. I was aware of the fact that I had an unusually high verbal SAT score, and also, that I spend little time interfacing with mundanes and am not dependent on them for my daily bread. I thought it wasn't my place to suggest to anyone else that they try their hand at saying only true things all the time, or for me to act like this conveys moral virtue. I'm only even describing the wizard's code publicly now that I can think of at least one alternative.

I once heard somebody claim that rationalists ought to practice lying, so that they could separate their internal honesty from any fears of needing to say what they believed. That is, if they became good at lying, they'd feel freer to consider geocentrism without worrying what the Church would think about it. I do not in fact think this would be good for the soul, or for a cooperative spirit between people. This is the sort of proposed solution of which I say, "That is a terrible solution and there has to be a better way."

But I do see the problem that person was trying to solve. One can also be privileged in stubbornness when it comes to overriding the fear of other people finding out what you believe. I can see how telling fewer routine lies than usual would make that fear even worse, exacerbating the pressure it can place on what you believe you believe; especially if you didn't have a lot of confidence in your verbal agility. It's one more reason not to pressure people (even a little) into adopting the wizard's code, but then it would be nice to have some other code instead.


4: Literal-truth as my automatic norm, maybe not shared.

This set of thoughts started, as so many things do, with a post by Robin Hanson.

In particular Robin tweeted the paper: "The surprising costs of silence: Asymmetric preferences for prosocial lies of commission and omission."

Abstract: Across 7 experiments (N = 3883), we demonstrate that communicators and targets make egocentric moral judgments of deception. Specifically, communicators focus more on the costs of deception to them—for example, the guilt they feel when they break a moral rule—whereas targets focus more on whether deception helps or harms them. As a result, communicators and targets make asymmetric judgments of prosocial lies of commission and omission: Communicators often believe that omitting information is more ethical than telling a prosocial lie, whereas targets often believe the opposite.

This got me wondering whether my default norm of the wizard's code is something other people will even perceive as prosocial. Yes, indeed, I feel like not saying things is much more law-abiding than telling literal falsehoods. But if people feel just as wounded, or more wounded, then that policy isn't really benefiting anyone else. It's just letting me feel ethical and maybe being good for my own personal soul.

Robin commented, "Mention all relevant issues, even if you have to lie about them."

I don't think this is a bullet I can bite in daily practice. I think I still want to emit literal truths for most dilemmas short of hiding fugitives. But it's one more argument worth mentioning against trying to make an absolute wizard's code into a bedrock solution for interpersonal reliability.

Robin also published a blog post about "automatic norms" in general:

We are to just know easily and surely which actions violate norms, without needing to reflect on or discuss the matter. We are to presume that framing effects are unimportant, and that everyone agrees on the relevant norms and how they are to be applied.
In a relatively simple world with limited sets of actions and norms, and a small set of people who grew up together and later often enough observe and gossip about possible norm violations of others, such people might in fact learn from enough examples to mostly apply the same norms the same way. This was plausibly the case for most of our distant ancestors. They could in fact mostly be sure that, if they judged themselves as innocent, most everyone else would agree. And if they judged someone else as guilty, others should agree with that as well. Norm application could in fact usually be obvious and automatic.
Today however, there are far more people, and more intermixed, who grow up in widely varying contexts and now face far larger spaces of possible actions and action contexts. Relative to this huge space, gossip about particular norm violations is small and fragmented...
We must see ourselves as tolerating a lot of norm violation. We actually tell others about and attempt to punish socially only a tiny fraction of the violations that we could know of. When we look most anywhere at behavior details, it must seem to us like we are living in a Sodom and Gomorrah of sin. Compared to the ancient world, it must seem a lot easier to get away for a long time with a lot of norm violations...
We must also see ourselves as tolerating a lot of overeager busybodies applying what they see as norms to what we see as our own private business where their social norms shouldn’t apply.

This made me realize that the wizard's code of honesty I grew up with is, indeed, an automatic norm for me. Which meant I was probably overestimating and eliezeromorphizing the degree to which other people even cared at all, or would think I was keeping any promises by doing it. Again, I don't see this as a good reason to give up on emitting literally true sentences almost all of the time, but it's one more reason I feel more open to alternatives than I would've ten years ago. That said, I do expect a lot of people reading this also have something like that same automatic norm, and I still feel like that makes us more like part of the same tribe.


5: Counterargument: The problem of non-absolute rules.

A proposal like this one ought to come with a lot of warning signs attached. Here's one of them:

There's a passage in John M. Ford's Web of Angels, when the protagonist has finally killed someone even after all the times his mentor taught him to never ever kill. His mentor says:

"No words can prevent all killing. Words are not iron bands. But I taught you to hesitate, to stay your hands until the weight of duty crushed them down."

Surprise! Really the mentor just meant to try to get him to wait before killing people instead of jumping to that right away.

Humans are kind of insane, and there are all sorts of insane institutions that have evolved among us. A fairly large number of those institutions are twisted up in such a way that something explodes if people try to talk openly about how they work.

It's a human kind of thinking to verbally insist that "Don't kill" is an absolute rule, why, it's right up there in the Ten Commandments. Except that what soldiers do doesn't count, at least if they're on the right side of the war. And sure, it's also okay to kill a crazy person with a gun who's in the middle of shooting up a school, because that's just not what the absolute law "Don't kill" means, you know!

Why? Because any rule that's not labeled "absolute, no exceptions" lacks weight in people's minds. So you have to perform that the "Don't kill" commandment is absolute and exceptionless (even though it totally isn't), because that's what it takes to get people to even hesitate. To stay their hands at least until the weight of duty is crushing them down. A rule that isn't even absolute? People just disregard that whenever.

(I speculate this may have to do with how the human mind reuses physical ontology for moral ontology. I speculate that brains started with an ontology for material possibility and impossibility, and reused that ontology for morality; and it internally feels like only the moral reuse of "impossible" is a rigid moral law, while anything short of "moral-impossible" is more like a guideline. Kind of like how, if something isn't absolutely certain, people think that means it's okay to make up their own opinion about it, because if it's not absolutely certain it must not be the domain of Authority. But I digress, and it's just a hypothesis. We don't need to know exactly what is the buried cause of the surface craziness to observe that the craziness is in fact there.)

So you have to perform that the Law is absolute in order to make the actual flexible Law exist. That doesn't mean people lie about how the Law applies to the edge cases—that's not what I mean to convey by the notion of "performing" a statement. More like, proclaim the Law is absolute and then just not talk about anything that contradicts the absoluteness.

And when that happens, it's one more little chunk of insanity that nobody can talk about on the meta-level without it exploding.

Now, you will note that I am going ahead and writing this all down explicitly, because... well, because I expect that in the long run we have to find a way that doesn't require a little knot of madness that nobody is allowed to describe faithfully on the meta-level. So we might as well start today.

I trust that you, the reader, will be able to understand that "Don't kill" is the kind of rule where you give it enough force-as-though-of-absoluteness that it actually takes a deontology-breaking weight of duty to crush down your hands, as opposed to you cheerfully going "oh well I guess there's a crushing weight now! let's go!" at the first sign of inconvenience.

Actually, I don't trust that everyone reading this can do that. That's not even close to literally true. But most of you won't ever be called on to kill, and society frowns upon that strongly enough to discourage you anyway. So I did feel it was worth the risk to write that example explicitly.

"Don't lie" is more dangerous to mess with. That's something that most people don't take as an exceptionless absolute to begin with, even in the sense of performing its absoluteness so that it will exist at all. Even extremely honest people will agree that you can lie to the Gestapo about whether you are hiding any Jews in the attic, and not bother to Glomarize your response either; and I think they will mostly agree that this is in fact a "lie" rather than trying to dance around the subject. People who are less than extremely honest think that "I'm fine" is an okay way to answer "How are you?" even if you're not fine.

So there's still a very obvious thing that could go wrong in people's heads, a very obvious way that the notion of "meta-honesty", or any other code besides "don't say false things", could blow up. It's why the very first description in the opening paragraphs says "Don't lie when a normal highly honest person wouldn't, and furthermore…" and you should never omit that preamble if you post any discussion of this on your own blog. THIS IS NOT THE IDEA THAT IT'S OKAY TO LIE SO LONG AS YOU ARE HONEST ABOUT WHEN YOU WOULD LIE IF ANYONE ASKS. It's not an escape hatch.

If anything, meta-honesty is the idea that you should be careful enough about when you break the rule "Don't lie" that, if somebody else asked the hypothetical question, you would be willing to PUBLICLY DEFEND EVERY ONE OF THOSE EXTRAORDINARY EXCEPTIONS as times when even an unusually honest person should lie.

(Unless you were never claiming to be unusually honest, and your pattern of meta-honest responses to hypotheticals openly shows that you lie about as much as an average person. But even here, I'd worry that anyone who lets themselves be as wicked as they imagine the 'average' person to be, would be an unusually wicked person indeed. After all, if Robin Hanson speaks true, we are constantly surrounded by people violating what seem to us like automatic norms.)


6: Meta-honesty, the basics.

Okay, enough preamble, let's speak of the details of meta-honesty, which may or may not be a terrible idea to even talk about, we don't know at this point.

The basic formulation of meta-honesty would be:

"Be at least as honest as an unusually honest person. Furthermore, when somebody asks for it and especially when you believe they're asking for it under this code, try to convey to them a frank and accurate picture of the sort of circumstances under which you would lie. Literally never swear by your meta-honesty that you wouldn't lie about a hypothetical situation that you would in fact lie about."

My first horrible terminology for this was the "Bayesian code of honesty", on the theory that this code meant your sentences never provided Bayesian evidence in the wrong direction. Suppose you say "Hey, Eliezer, what were you doing last night?" and I reply "Staying at home doing the usual things I do before going to bed, why?" If you have a good mental picture of what I would lie about, you have now definitely learned that I was not out watching a movie, because that is not something I would lie about. A very large number of possibilities have been ruled out, and most of your remaining probability mass should now be on me having stayed home last night. You know that I wasn't on a secret date with somebody who doesn't want it known we're dating, because you can ask me that hypothetical and I'll say, "Sure, I'd happily hide that fact, but that isn't enough to force me to lie. I would just say 'Sorry, I can't tell you where I was last night,' instead of lying."

You have not however gained any Bayesian evidence against my hiding a fugitive marijuana seller from the Feds, where somebody's life or freedom is at stake and it's vital to conceal that a secret even exists in the first place. Ideally we'd have common knowledge of that, and hopefully we'd agree that it was fine to lie in that case to a friend who asks a casual-seeming question.
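The "Bayesian evidence" framing above can be made concrete with a toy calculation. The hypotheses, prior probabilities, and lying policy below are all invented for illustration (they are mine, not anything the example specifies); the point is only the shape of the update: hearing "I stayed home" eliminates the hypotheses I wouldn't lie about, while leaving the odds between "stayed home" and "hiding a fugitive" exactly where they started.

```python
# Toy model of what a listener learns from a meta-honest speaker.
# Numbers are made up for illustration.

# Listener's prior over what I was doing last night.
priors = {
    "stayed home":       0.60,
    "watched a movie":   0.25,
    "secret date":       0.10,  # I'd Glomarize, not lie
    "hiding a fugitive": 0.05,  # the one case where my code permits lying
}

# P(I say "I stayed home" | hypothesis), given my declared policy:
# truthful ordinarily, refuse-to-answer on a secret date,
# but willing to lie outright to conceal a fugitive.
likelihood = {
    "stayed home":       1.0,
    "watched a movie":   0.0,
    "secret date":       0.0,
    "hiding a fugitive": 1.0,
}

# Bayes' rule: posterior ∝ prior × likelihood.
evidence = sum(priors[h] * likelihood[h] for h in priors)
posterior = {h: priors[h] * likelihood[h] / evidence for h in priors}
```

Running this, "watched a movie" and "secret date" drop to zero, most of the probability mass lands on "stayed home", and the odds ratio between "stayed home" and "hiding a fugitive" is unchanged from the prior — i.e., the statement carried no Bayesian evidence against the one hypothesis I'd lie about.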

Let's be clear, although this is a kind of softening of deception, it's still deception. Even if somebody has extensively discussed your code of honesty with you, they aren't logically omniscient and won't explicitly have the possibility in mind every time. That's why we should go on holding ourselves to the standard of, "Would I defend this lie even if the person I was defending it to had never heard of meta-honesty?"

"Eliezer," you say, "if you had a temporary schizophrenic breakdown and robbed a bank and this news hadn't become public, would you lie to keep it from becoming public?"

And this would cause me to stop and think and agonize for a bit (which itself tells you something about me, that my answer is not instantly No or Yes). I do have important work to do which should not be trashed without strong reason, and this hypothetical situation would not have involved a great deliberate betrayal on my part; but it is also the sort of thing that you could reasonably argue an unusually honest person ought not to lie about, where lies do not in general serve the social good.

I think in the end I might reply something like "I wouldn't lie freely and would probably try to use at least technical truth or Glomarize, but in the end I might conceal that event rather than letting my work be trashed for no reason. I think I'd understand if somebody else had done likewise, if I thought they were doing good work in the first place. Except that obviously I'd need to tell various people who are engaged in positive-sum trades with me, where it's a directly important issue to them whether I can be trusted never to have mental breakdowns, and remove myself from certain positions of trust. And if it happened twice I'd be more likely to give up. If it got to the point where people were openly asking questions I don't imagine myself as trying to continue a lie. I also want to caveat that I'm describing my ethical views, what I think is right in this situation, and obviously enough pressure can make people violate their own ethics and it's not always predictable how much pressure it takes, though I generally consider myself fairly strong in that regard. But if this had actually happened I would have spent a lot more time thinking about it than the two minutes I spent writing this paragraph." And this would help give you an accurate picture of the sort of person that I am in general, and what I take into account in considering exceptions.

Insofar as you are practicing a mental discipline in being meta-honest, the discipline is to be explicitly aware of every time you say something false, and to ask yourself, "Would I be okay publicly saying, if somebody asked me the hypothetical, that this is a situation where a person ought to lie?"

I still worry that this is not the thing that people need to do to establish their inner pact with truth. Maybe you could pick some friends to whom you just never tell any kind of literal falsehood, in the process of becoming initially aware of how many false things you were just saying all the time… but I don't actually know if that works either. Maybe that's like trying to stop smoking cigarettes on odd-numbered days. It'd be something to notice if the experimental answer is "In reality, meta-honesty turns out not to work for practicing the respect of truth."

Meta-honesty should be for people who are comfortable, not with absolute honesty, but with not trying to appear any more honest than they are. This itself is not the ordinary equilibrium, and if you want to do things the standard human way and not forsake a well-tested and somewhat enforced social equilibrium in pursuit of a bright-eyed novel idealistic agenda, then you should not declare yourself meta-honest, or should let somebody else try it first.


7: Consistent object-level Glomarization in meta-level honest responses.

Glomarization can be workable when restricted to special cases, such as only questions about nuclear weapons and submarines. Meta-honesty is such a special case and, if we're doing this, we should all Glomarize it accordingly. In particular meta-questions are not to be used to extract object-level data, and we should all respect that in our questions, and consistently Glomarize about it in our answers, including some random times when Glomarization seems silly.

Some key responses that need to be standard:

  • "That question sounds too object-level."
  • "I think you're doing meta-honesty wrong."
  • "I think I'm supposed to Glomarize that sort of answer in general."
  • "I should answer a more abstract version of that."
  • "I worry that some of my counterfactual selves are not in a mutually beneficial situation in this discussion."

And if you clearly say that you "irrevocably worry" about any of these things, it means the meta-honest conversation has crashed; the other person is not supposed to keep pressing you, and if they do, you can lie. Ideally, this is something you should consistently do in any case where a substantial measure of your counterfactual selves as the other person might imagine them would be feeling pressured to the point of maybe meta-lying. That is, you should not only say "irrevocably worry" in cases where you actually have something to conceal, you should say it in cases where the discussion would be pressuring somebody who did have something to conceal and this seems high-enough-probability to you or to your model of the person talking to you.

For example: "Eliezer, would you lie about having robbed a bank?"

I consider whether this sounds like an attempt to extract object-level information from some of my counterfactual selves, and conclude that you probably place very little probability on my having actually robbed a bank. I reply, "Either it is the case that I did rob a bank and I think it is okay to lie about that, or alternatively, my reply is as follows: I wouldn't ordinarily rob a bank. It seems to me that you are postulating some extraordinary circumstance which has driven me to rob a bank, and you need to tell me more about this extraordinary circumstance before I tell you whether I'd lie about it. Or you're postulating a counterfactual version of me that's fallen far enough off the ethical rails that he'd probably stop being honest too."

Some additional statements that ought to be taken as praiseworthy:

  • "I only feel free to have a frank discussion about that if everyone in the room has agreed to abide by the meta-honesty code."
  • "I notice that I'm feeling interrogated, and should not try to give a code-abiding answer to that right now."
  • "It is either the case that this actually happened and I think it is okay to lie about it, or that my current quick guess is that I wouldn't lie in that case."
  • "Hold on, let me either generate a random number or pretend to generate a random number, such that if I'm actually generating a random number and it comes up as 0, I will try to seem more evasive than usual in this conversation even if I have nothing to actually hide."

This is not supposed to be a clever way to extract information from people and you should shut down any attempt to use it that way.
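The random-number bullet above is, in effect, a randomized-response trick: if innocent speakers sometimes act evasive on purpose, observed evasiveness stops being strong evidence of a real secret. A toy calculation (the model and numbers are my own illustration, not part of the essay) of what a listener can infer:

```python
def p_secret_given_evasive(p_secret: float, p_cover: float) -> float:
    """Posterior probability of a real secret, given observed evasiveness.

    Model: speakers with a secret are always evasive; innocent speakers
    act evasive anyway with probability p_cover ("cover noise").
    """
    p_evasive = p_secret + (1.0 - p_secret) * p_cover
    return p_secret / p_evasive

# With no cover noise, evasiveness is a dead giveaway.
no_cover = p_secret_given_evasive(0.01, 0.0)     # 1.0

# With 10% cover evasiveness, the same observation is weak evidence.
with_cover = p_secret_given_evasive(0.01, 0.10)  # ~0.092
```

This is why the code asks everyone to Glomarize "including some random times when Glomarization seems silly": the cover noise is what protects the counterfactual selves who actually have something to hide.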

"Harry," said HPMOR!Dumbledore, "I ask you under the code of meta-honesty (which we have just anachronistically acquired): Would you lie about having robbed the Gringotts Bank?"

Harry thought, Maybe this is about the Azkaban breakout, and said, "Do you in fact suspect me of having robbed a bank?"

"I think that if I suspected you of having robbed a bank," said Dumbledore, "and I did not wish you to know that, I would not ask you if you had robbed a bank. Why do you ask?"

"Because the circumstances under which you're invoking meta-honesty have something to do with how I answer," said Harry (who had suddenly acquired a view on this subject that some might consider implausibly detailed). "In particular, I think I react differently depending on whether this is basically about you trying to construct a new mutually beneficial arrangement with the person you think I am, or if you're in an adversarial situation with respect to some of my counterfactual selves (where the term 'counterfactual' is standardly taken to include the actual world as one that is counterfactually conditioned on being like itself). Also I think it might be a good idea generally that the first time you try to have an important meta-honest conversation with someone, you first spend some time having a meta-meta-honest conversation to make sure you're on the same page about meta-honesty."

"I am not sure I understood all that," said Dumbledore. "Do you mean that if you think we have become enemies, you might meta-lie to me about when you would lie?"

Harry shook his head. "No," said Harry, "because then if we weren't enemies, you would still never really be able to trust what I say even assuming me to abide by my code of honesty. You would have to worry that maybe I secretly thought you were an enemy and didn't tell you. But the fact that I'm meta-honest shouldn't be something that you can use against me to figure out whether I… sneaked into the girl's dorm and wrote in somebody's diary, say. So if I'm in that situation I've got to protect my counterfactual selves and Glomarize harder. Whereas if this is more of a situation where you want to know if we can go to Mordor together, then I'd feel more open and try to give you a fuller picture of me with more detail and not worry as much about Glomarizing the specific questions you ask."

"I suspect," Dumbledore said gravely, "that those who try to be honest at all will always be at something of a disadvantage relative to the most ready liars, at least if they've robbed Gringotts. But yes, Harry, I am afraid that this is more of a situation where I am… concerned… about some of your counterfactual selves. But then why would you answer at all, in such a case?"

"Because sometimes people are honest and have good intentions," answered Harry, "and I think that if in general they can have an accurate picture of the other person's honesty, everybody is on net a bit better off. Even if I had robbed a bank, for example, you and I would both still not want anything bad to happen to Britain. And some of my counterfactual selves are innocent, and they're not better off if you think I'm more dishonest than I am."

"Then I ask again," said Dumbledore, "under the code of meta-honesty, whether you would lie about having robbed a bank."

"Then my answer is that I wouldn't ordinarily rob a bank," Harry said, "and I'd feel even worse about lying about having robbed a bank, than having robbed a bank. And I'd know that if I robbed a bank I'd also have to lie about it. So whatever weird reason made me rob the bank, it'd have to be weird enough that I was willing to rob the bank and willing to lie about it, which would take a pretty extreme situation. Where it should be clear that I'm not trying to answer about having specifically robbed a bank, I'm trying to give you a general picture of what sort of person I am."

"What if you had been blackmailed into robbing the bank?" inquired Dumbledore. "Or what if things crept up on you bit by bit, so that in the end you found yourself in an absurd situation you'd never intended to enter?"

Harry shrugged helplessly. "Either it's the case that I did end up in a weird situation and I don't want to let you know about that, or alternatively, I feel like you're describing a very broad range of possibilities that I'd have to think about more, because I haven't yet ended up in that kind of situation and I'm not quite sure how I'd behave… I think I'd have in mind that just telling the Headmaster the truth can prevent big problems from blowing up any further, but there'd be cases extreme enough that I wouldn't do that either… I mean, the basic answer is, yes, there's things that would make me lie right to your face, but, I wouldn't do that just for having stolen candy from the kitchen, I don't think. I'd just be like 'I consistently Glomarize when people ask me if I've stolen candy from the kitchen.'"

"Would you lie to protect a fugitive from the law who you believed was being unjustly pursued?" said Dumbledore. "Say that Sirius Black was innocent after all, as the Quibbler would have it; would you hide him and then lie to Minerva about what you were doing on the night you hid him?"

"Yes," Harry said unhesitatingly. "That's something I think is clearly right and the sort of thing I would ordinarily do."

"What if you thought that somebody was being unjustly pursued by the law?" said Dumbledore. "Is that worth lying for?"

Harry frowned. "I think I maybe shouldn't answer questions about what in particular I think ought not to be illegal," Harry said, "but if the basic question is whether I'd lie to my friends to keep somebody who'd only used marijuana out of Azkaban, then I would. I don't think my friends should expect otherwise; they should want me to do that."

"We are not being meta-honest any more, only honest," said Dumbledore. "Were you involved in the breakout from Azkaban?"

Harry stared at Dumbledore a moment, and said, "I think I have to consistently not confirm or deny that, but my counterfactual and possibly actual self who wasn't involved would really like to know at this point what the heck you think might've happened."

If somehow the Gestapo has heard of meta-honesty and they ask you if you would lie to hide Jews in your attic, they should be expecting a law-abiding meta-honest Nazi citizen to Glomarize, abstract the question in such a way as to not reveal the object-level information that might be sought, or say "I irrevocably worry that some of my counterfactual selves are not in a positive-sum situation." That, I now proclaim, ought to be done by Nazis and resistance fighters alike, when you are faced with a question that might be meant to reveal object-level information about what happened.

"Eliezer," says the hypothetical Gestapo officer who has somehow heard about my meta-honesty code, "it happens that I'm a person who's heard of meta-honesty. Now, are you the sort of person who would lie about having Jews hidden in your attic?"

This hypothetical Gestapo officer has a gun. Most people asking you meta-honest questions won't have a gun. In fact I bet this will literally never happen until the end of the world. Let's suppose he has a gun anyway.

"I am the following sort of person," I reply. "If I were hiding the Führer in my attic to protect him from Jewish assassins, I'd lie about that to the assassins. It's clear you know about my code of meta-honesty, so you should understand that is a very innocent thing to say. But these circumstances and the exact counterfactual you are asking about make me nervous, so I'm afraid to utter the words I think you may be looking for, namely the admission that if I were the kind of person who'd hide Jews in his attic then I'd be the kind of person who would lie to protect them. Can I say that, in respect to your question as you mean it, I believe that is no more and no less true of me than it is of you?"

"My, you are fast on your verbal feet," says the Gestapo officer. "If somebody were less fast on their verbal feet, would you tell them that it was acceptable for a meta-honest person to just meta-lie to the Jewish assassins in order to hide the Führer?"

"If they didn't feel that their counterfactual loyal Nazi self would think that their counterfactual disloyal self was being pressured and clearly state that fact irrevocably," I say, "I'd say that, just like their counterfactual loyal self, they should make some effort to reveal the general limits of their honesty without betraying any of their counterfactual selves, but say they irrevocably couldn't handle the conversation as soon as they thought their alternate loyal self would think their alternate's counterfactual disloyal self couldn't handle the conversation. It's not as if the Jewish assassins would be fooled if they said otherwise. If the Jewish assassins do continue past that point, which is blatantly forbidden and everyone should know that, they may lie."

"I see," says the Gestapo officer. "If you are telling me the truth, I think I have grasped the extent of what you claim to be honest about." He turns to his subordinates. "Go search his attic."

"Now I'm curious," I say. "What would you have done if I'd sworn to you that I was an absolutely loyal German citizen, and that my character was such that I would certainly never lie about having Jews in my attic even if I were the sort of disloyal citizen who had Jews in his attic in the first place?"

"I would have detailed twice as many men to search your house," says the Gestapo officer, "and had you detained, for that is not the response I would expect from an honest Nazi who knew how meta-honesty was supposed to work. Now I ask you meta-meta-honestly, why haven't you said that you are irrevocably worried that I am abusing the code? Obviously I put substantial probability on you being a traitor, meaning I am deliberately pressuring you into a meta-conversation and trying to use your code of honesty against those counterfactual selves. Why didn't you just shut me down?"

"Because you do have a gun, sir," I say. "I agree that it's what the rules called for me to say, but I thought over the situation and decided that I was comfortable with saying that in general this was a sort of situation where that rule could be bent so as for me to not end up being shot—and I tell you meta-meta-honestly that I do believe the situation has to be that extreme in order for that rule to even be bent."

Really the principle is that it is not okay to meta-ask what the Gestapo officer is meta-asking here. This kind of detailed-edge-case-checking conversation might be appropriate for shoring up the edges of an interaction intended to be mutually beneficial, but absolutely not for storming in looking for Jews in the attic of a person who in your mind has a lot of measure on having something to hide.

But I do want to have trustworthy foundations somewhere.

And I think it's reasonable to expect that over the course of a human lifetime you will literally never end up in a situation where a Gestapo officer who has read this essay is pointing a gun at you and asking overly-object-level-probing meta-honesty questions, and will shoot you if you try to Glomarize but will believe you if you lie outright, given that we all know that everyone, innocent or guilty, is supposed to Glomarize in situations like that. Up until today I don't think I've ever seen any questions like this being asked in real life at all, even hanging out with a number of people who are heavily into recursion.

So if one is declaring the meta-honesty code at all, then one shouldn't meta-lie, period; I think the rules have been set up to allow that to be absolute. I don't want you to have to worry that maybe I think I'm being pressured, or maybe I thought you meta-asked the wrong thing, so now I think it's okay to meta-lie even though I haven't given any outward sign of that. To that end, I am willing to sacrifice the very tiny fraction of the measure of my future selves who will end up facing an extremely weird Gestapo officer. To me, for now, there doesn't seem to be any real-life circumstance where you should lie in response to a meta-honesty question—rather than consistently Glomarize that kind of question, consistently abstract that kind of question, consistently answer in an analogy rather than the original question, or consistently say "I believe some counterfactual versions of me would say that cuts too close to the object level." (It being a standard convention that counterfactuals may include the actual.)

I also think we can reasonably expect that from now until the end of the world, honest people should literally absolutely never need to evade or mislead at all on the meta-meta-level, like if somebody asks if you feel like the meta-level conversation has abided by the rules. (And just like meta-honesty doesn't excuse object-level dishonesty, by saying that meta-meta-honesty seems like it could be everywhere open and total, I don't mean to excuse meta-level lies. We should all still regard meta-lies as extremely bad and a Code Violation and You Cannot Be Trusted Anymore.)

If there's a meta-honest discussion about someone's code of honesty, and a discussion of what they think about the current meta-meta conditions of how the meta-honesty code is being used, and it sounds to you like they think things are fine… then things should be fine, period. If you ask, do they think that any pressure strong enough to potentially shake their meta-honesty is potentially around, do they think that the overall situation here would have treated any of their plausible counterfactual selves in a negative-sum way, and they say no it's all fine—then that is supposed to be absolute under the code. That ought to establish a foundation that's as reliable as the person's claim to be meta-honest at all.

If you go through all that and lie and meta-lie and meta-meta-lie after saying you wouldn't, you've lied under some of the kindest environments that were ever set up on this Earth to let people not lie, among people who were trying to build trust in that code so we could all use it together. You are being a genuinely awful person as I'd judge that, and I may advocate for severe social sanctions to apply.

Assuming this ends up being a thing, that is. I haven't run it past many people yet and this is the first public discussion. Maybe there's some giant hole in it I haven't spotted.

If anybody ever runs into an actual real circumstance where it seems to them that meta-honesty as they tried to use it was giving the essay-reading Gestapo too much power or too much information, maybe because they weren't fast enough on their verbal feet, please email me about it so I can consider whether to modify or backtrack on this whole idea. I will try to protect your anonymity under all circumstances up to and including the end of the world unless you say otherwise. The previous sentence is not the sort of thing I would lie about.


8: Counterargument: Maybe meta-honesty is too subtle.

I worry that the notion of meta-honesty is too complicated and subtle. In that it has subtleties in it, at all.

This concept is certainly too subtle for Twitter. Maybe it's too subtle for us too.

Maybe "meta-honesty" is just too complicated a concept to be able to make it be part of a culture's Law, compared to the standard-twistiness-compliant performance of saying "Always be honest!" and waiting for the weight of duty to crush down people's hands, or saying "Never say anything false!" and just-not-discussing all the exceptions that people think obviously don't count.

(But of course that system also has disadvantages, like people having different automatic norms about what they think are obvious exceptions.)

I've started to worry more, recently, about which cognitive skills have other cognitive skills as prerequisites. One of the reasons I hesitated to publish Inadequate Equilibria (before certain persons yanked it out of my drafts folder and published it anyway) was that I worried that maybe the book's ideas were useless or harmful without mastery of other skills. Like, maybe you need to have developed a skill for demotivating cognition, and until then you can't reason about charged political issues or your startup idea well enough for complicated thoughts about Nash equilibria to do more good than harm. Or maybe unless you already know a bunch of microeconomics, you just stare at society and see a diffuse mass of phenomena that might or might not be bad equilibria, and you can't even guess non-wildly in a way that lets you get started on learning.

Maybe meta-honesty contains enough meta, in that it has meta at all, that it just blows up in most people's heads. Sure, people in our little subcommunity tend to max out the Cognitive Reflection Test and everything that correlates with it. But compared to scoring 3 out of 3 on the CRT, the concept of meta-honesty is probably harder to live in real life—stopping and asking yourself "Would I be willing to publicly defend this as a situation in which unusually honest people should lie, if somebody posed it as a hypothetical?" Maybe that just gets turned into "It's permissible to lie so long as you'd be honest about whether you'd tell that lie if anyone asks you that exact question and remembers to say they're invoking the meta-honesty code," because people can't process the meta-part correctly. Or maybe there's some subtle nonobvious skill that a few people have practiced extensively and can do very easily, and that most people haven't practiced extensively and can't do that easily, and this subskill is required to think about meta-honesty without blowing up. Or maybe I just get an email saying "I tried to be meta-honest and it didn't work because my verbal SAT score was not high enough, you need to retract this."

If so, I'm not sure there's much that could be done about it, besides me declaring that Meta-Honesty had turned out to be a terrible idea as a social innovation and nobody should try that anymore. And then that might not undo the damage to the law-as-absolute performance that makes something be part of the Law.

But I'd outright lie to the Gestapo about Jews in my attic. And even to friends, I can't consistently Glomarize about every point in my life where one of my counterfactual selves could possibly have been doing that. So I can't actually promise to be a wizard, and I want there to exist firm foundations somewhere.

Questions? Comments?

154 comments (some truncated)

Perhaps the most insightful comment I ever read on Hacker News went something like,

One of the big problems for startup founders who are immigrants is not knowing what they're expected to lie about, and what they're absolutely forbidden to lie about. If you cook the books, you go to jail for fraud. But if you're very honest when a VC asks you, 'Who else is thinking of investing in you?' and you answer, 'It's only you, no one else is interested' — then you're never going to get investment. You're expected to lie and say, 'Oh, there's a lot of interest.'

I can't find the exact comment but I found that very insightful.

The first time I read this, I think my top-level personal takeaway was: 'Woah, this is complicated. I can barely follow the structure of some of these sentences in section 7, and I definitely don't feel like I've spent enough time meditating on my counterfactual selves' preferences or cultivating a wizard's metacognitive habits to be able to apply this framework in a really principled way. This hard-to-discuss topic seems like even more of a minefield now.'

My takeaways are different on a second read:

1. Practicing thinking and talking like a wizard seems really, really valuable. (See Honesty: Beyond Internal Truth.) Bells and whistles like "coming up with an iron-clad approach to Glomarization" seem much less important, and shouldn't get in the way of core skill-building. It really does seem to make me healthier when I'm in the mindset of treating "always speak literal truth" as my fallback, complicated meta-honesty schemes aside.

2. It's possible to do easier modified versions of the thing Eliezer's talking about. E.g., if I'm worried that I'm not experienced or fast enough on my feet to have a meta-hon...

The code of literal truth only lets people navigate anything like ordinary social reality to the extent that they are very fast on their verbal feet, and can respond to the question "How are you?" by saying "Getting along" instead of "Horribly" or with an awkward silence while they try to think of something technically true. (Because often "I'm fine" is false, you see. If this has never bothered you then you are perhaps not in the target audience for this essay.)

I think this is missing the important role of question-substitution in basic social encounters. When you ask "How are you?" and I respond with "I'm fine," the question I'm actually answering is not my physical or emotional state but instead the questions of "do you need assistance?" or "Is there anything I should know?" with a response that codes to "no". So if my knee is hurting, in a normal conversation I might respond with "I'm fine" because I expect that information to not be useful to them (and is a bid for a demonstration of care that I'm not interested in), whereas in cases where this might af...

The issue of defining "literal honesty" seems pretty subtle. Allowing reasonable stretching of "literal" to "my concept of what people meant by the question" is ... well, reasonable, but also not very consistent with drawing a clear line separating 'honest' from 'dishonest'. Another issue is that a definition of honesty like

"Don't say things that you believe to be literally false in a context where people will (with reasonably high probability) persistently believe that you believe them to be true."

seems to admit lying when everyone knows you are lying. I.e., someone whom everyone assumes to be a liar is "literally honest" no matter what they say! I take this to suggest that our definition has to include intent, not just expectation. But how to modify the definition to avoid further trouble is unclear to me.

Given the difficulties, it seems like one who wishes to adhere to 'literal honesty' had better err on the side of literalness, clarifying any issues of interpretation as they arise. Being very literal in your answers to "how are you" may be awkward in an individual case, but as a pattern, it sets up expectations about the sort of replies you give to questions.

On the object level, I disagree about the usual meaning of "how are you?" -- it seems to me like it is more often used as a bid to start a conversation, and the expected response is to come up with smalltalk about your day / what you've been up to / etc.

I think the universal-liar case is actually in line with the "bayesian honesty" component/formulation of the proposal. If one is known to universally lie, one's words have no information content, and therefore don't increase other people's bayesian probabilities of false statements. However, it seems this is not a behaviour that Eliezer finds morally satisfactory. (I agree with Rob Bensinger that this formulation is more practical in daily life.)

Here are my thoughts.

  1. Being honest is hard, and there are many difficult and surprising edge-cases, including things like context failures, negotiating with powerful institutions, politicised narratives, and compute limitations.
  2. On top of the rule of trying very hard to be honest, Eliezer's post offers an additional general rule for navigating the edge cases. The rule is that when you’re having a general conversation all about the sorts of situations you would and wouldn’t lie, you must be absolutely honest. You can explicitly not answer certain questions if it seems necessary, but you must never lie.
  3. I think this rule is a good extension of the general principle of honesty, and appreciate Eliezer's theoretical arguments for why this rule is necessary.
  4. Eliezer’s post introduces some new terminology for discussions of honesty - in particular, the term 'meta-honesty' as the rule instead of 'honesty'.
  5. If the term 'meta-honesty' is common knowledge but the implementation details aren't, and if people try to use it, then they will perceive a large number of norm violations that are actually linguistic confusions. Linguistic confusions are not strongly negative in most fields, merely a nuisan
...
Ben Pace
I don’t really stand by the last half of the points above, i.e. the last ~3rd of the longer review. I think there’s something important to say here about the relationship between common knowledge and deontology, but that I didn’t really say it and I said something else instead. I hope to get the time to try again to say it.

Also, not everyone may be familiar with this “Glomarization” thing, so here’s a Wikipedia link:

Promoted to curated, here are my thoughts:

I think Robby's comment captures a lot of my thoughts on this post. This was the third time I read this post, and I think it was the first time that I started getting any grasp on the core concepts of the post. I think there are two primary reasons for this:

1. The concepts are indeed difficult, and are recursive and self-referential in a way that requires a good amount of unpacking and time to understand them

2. The focus of the post shifts very quickly from "here is a crash-course introduction to the wizard's code" to "here is a crash-course introduction to meta-honesty" to "here is a discussion about whether meta-honesty is good" and "here is an introduction to the considerations around meta-meta-honesty".

I think it's good to have a post that tries to give some kind of overview over the considerations around meta-honesty and rational honesty in general, and that that is better than only having a single educational introductory post that can't give a sense of the bigger picture.

But I do think that the next natural step after this high-level discussion, if we think meta-honesty i...

Yeah, my feeling on re-reading this post is that it would have worked well as a sequence, since it breaks down into a bunch of parts that are important to consider and digest in their own right. (And since it would have benefited from more background, motivation, exercises, examples, etc.)

Also, to give a personal +1 to bounties and to this particular goal, I'll give another $40 to whoever collects Oliver's bounty, as judged by Oliver.

I happened to be re-reading this and figured it might be nice to raise this to conscious attention, and (possibly?) raise the bounty if no one has made any progress on this. (Or, alternately, give people some affordance to state their happy price for doing it and see what the supply/demand tradeoffs are like)

In practice, I've found that it's possible to keep a lot of information secret without the need to either lie, or do lots of extra Glomarizing to avoid the act of Glomarizing giving too much away. Rather than have a stated, deterministic solution to each possible question or problem, I do what seems practical given the situation and who is asking.

One tactic I've found necessary is to just not talk about entire topics, or not write entire posts, that would put me in a position where I'd be backed into a corner and Glomarizing would be too suspicious to pull off without giving the game away, and of course not talking about which posts/topics those are. The alternative, to do sufficient Glomarizing 'in the open,' would cut off a lot more discussion/information on net and also be more socially costly.

In general, there's a temptation to do things that are game-theoretically robust - where if they could see your source code and decision algorithm, you'd still be all right, or at least do as well as possible given the circumstances. This is of course hugely important to various scenarios important for AI, where you actually do face such circumstances. But in reality, it's usually right to do things that are hugely exploitable if they could see what you're up to - e.g. to not Glomarize 'enough' even though that opens you up to problems.

(If you feel this is a compromising question, feel free to not answer) What are the costs that you've seen come with the "avoid an entire topic" approach? For myself, I can imagine some topics that contain sensitive information, but that I'd also feel incredibly constrained to not talk about. I'm wondering if you don't experience many costs, or if you feel that's whatever costs you incur are just the price of keeping information secret.

Thanks for explicitly giving me the out not to answer, I think that's 'doing it right' here.

Not being able to talk about things really sucks! Especially because the things you're actually thinking about a lot, and are the most interesting to you, are more likely to include information you can't share, for various reasons.

On the flip side, there are also topics one can't talk about because of worry that it would expose information about one's opinions rather than secret facts. This can be annoying, but it's also a good way to avoid things that you should, for plenty of other reasons, know better than to waste one's time on!

As someone who uses this strategy as their default: it's really hard. I avoid talking irl about rationality and being trans. The latter is pretty easy since it's not really a big part of who I am. But avoiding any hint of rationality and maintaining a mask of normality is exhausting. It's not just not answering questions, it's about not creating situations that lead to the questions. It is said in the Sequences that if you tell one lie, the truth is ever after your enemy. That's an exaggeration but not by much. AI, aging, genetics, all sorts of things are dangerous topics due to their proximity to my weirdness. I have to model the reactions to everything I say one or two steps ahead and if I get it wrong I have to evade or misdirect. This has gotten a lot harder since I started studying rationality and had my head stuffed full of exciting concepts that are difficult to explain and sparkly enough to be difficult to think past. It should be obvious from this that I don't practice honesty in general, but I usually answer a direct question with honesty to mitigate the costs somewhat. Less visible costs are that I'll never meet a rationalist in real life (barring intentional meetups). I get to practice the virtue of argument a lot less... although being cut off from people has some serious advantages as well. There's probably others, but what's the alternative? Not everyone can be Yudkowsky and I just want to live my life in peace.

My question is: there seems to be a good deal of context missing. What was the motivation for this post? What conversation context was it taken from? It’s difficult to interpret it, without that information.

Some context from Eliezer's Honesty: Beyond Internal Truth (in 2009):

[...] What I write is true to the best of my knowledge, because I can look it over and check before publishing.  What I say aloud sometimes comes out false because my tongue moves faster than my deliberative intelligence can look it over and spot the distortion.  Oh, we're not talking about grotesque major falsehoods - but the first words off my tongue sometimes shade reality, twist events just a little toward the way they should have happened...
From the inside, it feels a lot like the experience of un-consciously-chosen, perceptual-speed, internal rationalization.  I would even say that so far as I can tell, it's the same brain hardware running in both cases - that it's just a circuit for lying in general, both for lying to others and lying to ourselves, activated whenever reality begins to feel inconvenient.
There was a time - if I recall correctly - when I didn't notice these little twists.  And in fact it still feels embarrassing to confess them, because I worry that people will think:  "Oh, no!  Eliezer lies without even thinking!  He's a pathological liar!"  For they ha
...
Said Achmiz
I see, thanks. For the benefit of others reading this, then, here’s what I consider to be the best presentation of the opposite view (the one Eliezer mentions, but rejects, in the first linked post): Paul Christiano’s “If we can’t lie to others, we’ll lie to ourselves”.
Seconded. A lot of recent postings have had this problem...they seem to start in the middle, or be reports of conversations where a lot of idiosyncratic vocabulary was developed.
Matt Goldenberg
I was going to make this same comment. Without context, seems like a lot of fixing something that ain't broke.
Maybe what is going on here is that you are satisfied with your brain's current ability to make ethical choices, but Eliezer isn't, and his efforts to improve have yielded some thoughts worth putting on the public internet to try to help others who are also dissatisfied with their brain's current ability to make ethical choices.
Matt Goldenberg
Maybe, or maybe there's a different context entirely. As Said says, there really wasn't much context to this at all.
The original FB comment version of this post came with the same (lack of) context as it did here, and my impression was that this was something close to rholerith's take. (I also think that this part of some ongoing thoughts that Local Validity was also exploring, but am not sure)

There's something I've seen some rationalists try for, which I think Eliezer might be aiming at here, which is to try and be a truly robust agent.

Be the sort of person that Omega (even a version of Omega who's only 90% accurate) can clearly tell is going to one-box.

Be the sort of agent who cooperates when it is appropriate, defects when it is appropriate, and can realize that cooperating-in-this-particular-instance might look superficially like defecting, but avoid falling into a trap.

Be the sort of agent who, if some AI engineers were whiteboarding out the agent's decision making, they would see that the agent makes robustly good choices, such that those engineers would choose to implement that agent as software and run it.

Not sure if that's precisely what's going on here but I think is at least somewhat related. If your day job is designing agents that could be provably friendly, it suggests the question of "how can I be provably friendly?"

This is very close to what I've always seen as the whole intent of the Sequences. I also feel like there's a connection here to what I see as a bidirectional symmetry of the Sequences' treatment of human rationality and Artificially Intelligent Agents. I still have trouble phrasing exactly what it is I feel like I notice here, but here's an attempt:

As an introductory manual on improving the Art of Human Rationality, the hypothetical perfectly rational, internally consistent, computationally Bayes-complete superintelligence is used as the Platonic Ideal of a Rational Intelligence, and the Sequences ground many of Rationality's tools, techniques, and heuristics as approximations of that fundamentally non-human ideal evidence processor.

Or, in the other direction:

As an introductory guide to building a Friendly Superintelligence, the Coherent Extrapolated Human Rationalist, a model developed from intuitively appealing rational virtues, is used as a guide for what we want optimal intelligent agents to look like, and the Sequences as a whole are about taking this more human grounding, and justifying it as the basis on which to guide the development of AI into something that works properly, and something that we see as Friendly.

Maybe that's not the best description, but I think there's something there and that it's relevant to this idea of trying to use rationality to be a "truly robust agent". In any case I've always felt there was an interesting parallel with how the Sequences can be seen as "A Manual For Building Friendly AI" based on rational Bayesian principles, or "A Manual For Teaching Humans Rational Principles" based on an idealized Bayesian AI.
I really like this phrasing. Previously, I've just had a vague sense that there are ways to have a lot more integrity such that many situations get a huge utility bump, which can only be achieved through very clear thinking. Your comment helped me make that a lot more concrete.
You must have missed the recent news. Short version is that Peter Thiel got angry at Eliezer's post about Trump, and he decided to send no more money to MIRI. To secure money for further AI research a few rationalists from Berkeley attempted to rob a bank; things got messy, their attempt at "acausal negotiation" about the hostages failed (predictably, duh); the only good news is that no one got killed. Now Eliezer is trying to lay some groundwork to mitigate the PR damage, in my humble opinion not very convincingly. (ROT13: abcr, whfg xvqqvat.)
Said Achmiz
The code of literal truth only lets people navigate anything like ordinary social reality to the extent that they are very fast on their verbal feet, and can respond to the question "How are you?" by saying "Getting along" instead of "Horribly" or with an awkward silence while they try to think of something technically true.

If you are trying to be unusually honest as a matter of policy, there are some things it is worth lying for under some circumstances. This is not one of them. Quoting Calvin and Hobbes: "I don't know which is worse: that everyone has his price, or that the price is always so low."

This comment thread contributed to a substantial personal update for me over the weekend. I noticed ways in which I was out of integrity with myself. I've moved a lot closer to something like a radical honesty practice, and it has worked out pretty well so far.

I stopped blocking my perception of my own suffering, and noticed that my mind-body is full of grief and fatigue. Naturally, noticing this caused it to start showing up in my body language, and I also started talking about things that were bad for me. It turned out that while this was sometimes upsetting for the people around me, it also allowed us to negotiate in better faith than before.

I think I'd been suppressing this in part because it seemed like the people around me couldn't handle it. This still might turn out to be the case sometimes, but I myself have sufficient privilege to be able to safely handle other people not being able to handle it, so I may as well not destroy my soul :)

I feel physically better.

Thank you for holding me to account. Jessica, I know you didn't explicitly target your intervention at me, but your comments here were sufficiently interpretable for someone trying to learn from them to apply them to their personal situation anyway.

Benquo, this is really great to hear. This is a shift I went through gradually throughout 2017 and I think it's really important.
Despite being mostly unusually honest, my dad used to respond to this with "can't complain", which meant you knew he was a lying liar. I assure you, he can complain. What he can't do, easily, is break social convention or do slightly socially awkward things, even when the costs of not doing so are very high. When I see this pattern of obvious lying (as opposed to saying something content-free when it's plausible there's no content to share), I presume that such a person is terrified of being slightly socially awkward.

I don't see what's so hard about finding a socially acceptable set of Exact Words that is both true and fails to give evidence for false things given the circumstances. Not only won't I lie in response to such questions, my responses contain information. The whole point of this pattern is to break the pattern when called for, exactly as much as is called for.
It sounds like "can't complain" was, if construed locally rather than globally, a self-fulfilling prophecy.
I assure you the prophecy was rarely fulfilled.
Said Achmiz
Meanwhile, it seems to me that this is one of the few things about which it is most clearly worth lying, almost all the time. There are many situations when the justifiability of a lie may seriously be questioned, but this sort of situation seems unusually clear-cut.

Why does it seem unusually clear-cut to you that everyone should take part in a ritual in which one person acts like they care about the other's emotional state by asking about it, the other one lies about it in order to maintain the narrative that Things Are Fine (even in cases where things aren't fine and that would be important information to someone who actually cared about them), then sometimes they switch roles?

This is my preferred, in-depth answer to some of the implied questions here.

While I still endorse my description of what's going on there (and thanks for linking it!), in hindsight it seems like I'm describing something that has some pretty substantial costs - as I mentioned in another thread it literally prevents me from actually asking the question "how are you?".

It does make sense to exercise some care here, as one of the effects of this social ritual is to compel disprivileged people to participate in creating a shared narrative in which they're fine, everything is fine, can't complain or else I'll be socially attacked, how are you? Asking such people not to lie in response to "how are you?" may sometimes not be a reasonable request.

Said Achmiz
I’m not sure where you got this really quite bizarre interpretation of my comment, but I suppose I had better clarify: Whether you should ask about the well-being[1] of people you interact with is, of course, entirely up to you, and your own personal considerations surrounding whether to engage in certain rituals, follow certain patterns of social interaction, etc. etc., and in any case is quite beyond the scope of this discussion.

What I am saying, however, is that when you are asked such a question, it is entirely permissible—indeed, usually both expected by the asker, and prudent and right for the one asked—to lie, freely and without the slightest shred of guilt. (Indeed, if you think that routinely giving replies like “Horrible!” to a so-called “question” like “How are you?” even constitutes honesty, then I daresay you are confused about just what exactly is going on in said interaction.)

[1] I am also not sure why you suggest that the “How are you?” question is an inquiry about emotional state, but that’s neither here nor there.
To be clear, by "everyone should participate" I meant "forall x. x should participate" not "everyone as a whole should participate"; sorry for the ambiguous wording. But in any case: what is there here to be worth lying about? Avoiding being slightly awkward? I have often answered this question by actually talking about things that have been going on in my life lately, and that pretty much always goes well. And I don't even think answering "Horribly!" would cause serious negative consequences (e.g. scapegoating). Maybe this is partially due to my privilege level, but I just don't see what is so bad about answering the question honestly, if you are already pretty committed to being unusually honest.

I think "How are you?" is a question and that answering "Horribly!" to that question would be honest if, in fact, it is going horribly, while "good" would not be honest. Why? (a) If you look at the syntax, that is what is actually going on. (b) The common ritual probably started with people actually being concerned about each other's state, then got corrupted once people started lying and asking when they weren't actually concerned. (c) The current state of affairs actually makes it hard to ask someone about their state; it seems optimized against creating clarity in an important way, and it seems appropriate for someone trying to be unusually honest to break the ritual by creating clarity.

(good point about emotional state vs. state in general, that seems right)
It definitely looks like privilege to me. Almost all the time I am asked this question, the kinds of things that are bothering me in my life are not the kinds of things that it is OK to talk about.

Update: I moved to Berkeley last week and noticed a huge difference in how the rationalist/EA community deals with these sorts of conversations and how the rest of the world does. Yesterday I was talking to someone I had barely met and they asked "how are you doing?" I said "you just opened a whole can of worms" and we ended up having an interesting discussion, including about how the conversational norms are different here from elsewhere. In general, I think people in this community are both more likely to give an honest answer to such questions, and less likely to ask them if they aren't interested in an honest answer.

Said Achmiz
I have noticed this myself, in my interactions with rationalist communities. In my experience, the latter fact (“less likely to ask … if they aren’t interested in an honest answer”) makes rationalist gatherings/spaces feel a lot less welcoming and friendly than their “normal-person” analogues. (This is part of a general trend of failing to perform politeness norms—either due to ignorance thereof, or active refusal, or some combination of both causes. It is quite unfortunate, and makes participation in rationalist gatherings/spaces a less pleasant experience than I wish it were.) (And, of course, the former fact—“more likely to give an honest answer to such questions”—makes it more difficult to interact with rationalist-type folks for other reasons, which have been discussed elsethread.)
Yeah, there are definitely both upsides and downsides. It certainly makes me feel more welcome, though I can see that many people would have the opposite experience. Maybe the important thing is that people know what they are getting into.
It's worth noting: I fall pretty cleanly in the "smalltalk is a useful skill and social lubrication ritual" camp, and I think it's pretty achievable to get the best of both worlds here.

The exchange "how are you" --> "you just opened up a whole can of worms" --> [Whatever Comes Next] is actually pretty reasonable. Person A showcases that they're at least somewhat interested in interacting. Person B makes a bid for "I'd like to have a bit of a heart-to-heart as opposed to a low-key-professional-interaction". Person A now has the ability to say either: "Oh, I'm happy to open up a can of worms", or, "Oh man, hope you're okay. [ "I'm not sure I can dive into that right now" / "I have to get going in a few minutes but interested in the medium-version if that exists" / etc]

This seems pretty close to how SmallTalkAsSocialRitual is supposed to work, with the main difference between rationalist and typical-society versions being that the Regular Society version would replace some of the direct-question-ness with subtler facial expressions.
Said Achmiz
That is, actually, what I assumed you meant, so consider my comments to stand unchanged.

Well, all I can tell you is that if we’re casual acquaintances, or coworkers, or similar, and I greet you with “How are you?” and you respond by talking about things that have been going on in your life lately, that will make me quite a bit less likely to want to interact with you at all henceforth. My sense is that this is a very common reaction. The “privilege” framing may not be all that useful here… let us say, rather, that you quite likely live in a… strongly-selected-for-unusual-traits social bubble.

This is an exceedingly naive view of language. You wouldn’t base your behavior on a sociological just-so-story which you haven’t actually verified… would you?

To the contrary; it does no such thing. What it does, rather, is enforce structures of social relationships, against forces which would otherwise erode them, to the detriment of those parties who are already the less-powerful ones in said relationships.

Relatedly, when asking someone something, it is important to consider not just whether you desire some information, but whether they wish for you to have it. For bonus points, also consider how patterns in the possession and exchange of information relate to social power relations, and what patterns of interaction have what game-theoretic consequences for said relations. (I am, of course, only pointing at certain clusters of knowledge and understanding, while providing no details, for the simple reason that providing those details is the work of books, not of single forum comments… yet saying nothing at all would be worse than indicating, at least, the existence of relevant facts.)

[This is written as a moderator, and is a suggestion to Said Achmiz, jessicata, and others, posted here because this is currently the highest-placed comment on the page that follows a particular pattern.]

Not paying attention to the semantic content of this comment, but rather its structure, notice that it is a series of quotes, often of a single sentence, followed by similarly short replies. While this is a standard technique in forum arguments, I claim that this is mostly for undesirable reasons (like it being optimized for "scoring points"), and I have found that it's not a particularly helpful method of discussion.

My suggestion (and it is only a suggestion) is that you try an approach where you share your understanding of your interlocutor's whole point or position with a paraphrase, attempt to identify the most fruitful part of the disagreement to work on, and then devote the remainder of the comment to that point. This keeps discussions focused on moving forward, doesn't give an edge to the party with more attention to spare to the discussion, makes it harder to talk past one another repeatedly, and makes it easier to notice when core points are simply dropped.

I made a proposal for a moderator tool that seems like it might have been helpful to this thread, partly in response to your bracketed text, and I'd be curious to hear your thoughts.

Ben Pace
Please write more proposals like this.

Another question to ask, with regard to launching into unprompted explanations of one's personal life, is whether the other person actually wants that information. Like it or not, most people subscribe to the Copenhagen Interpretation of Ethics, which means that by telling people of your problems, you are implicitly making your problems their problems (else why would you bother sharing?).

If I said, "Hi, how are you," and your response was a 5-minute-long explanation of how your aunt died, your car broke down, and your dog needs surgery, my reaction would be awkward silence, not because I have no sympathy for your plight, but because I would be wondering whether there is any obligation for me to step in and help.

I'm confused because it does seem like you think everyone should participate, i.e. you endorse people lying about their state if asked about it. (I didn't mean "everyone should go out of their way to participate" but rather "everyone should participate at all" e.g. when pressed to by someone else) I'm actually fine with this as a filter.

Anyway, if someone's code of honesty is unable to actually resist the pressure of "if I were honest then it would be slightly awkward and some people would talk to me less" then it is doing very little work. I don't see why anyone would want such a code of "honesty" except to lie about how honest they are.

The just-so story seems true based on my social models, and I would bet on it if it were possible. That's enough to base my behavior on.

My main argument here is that (a) saying you're doing well when you're not doing well is literally false and (b) in context it's part of a pattern that optimizes against clarity, rather than something like acting that is clearly tagged as acting. I don't actually see a counterargument to (b) in the text of your comment.
Said Achmiz
Ok, back up. What exactly are you talking about, here? “Participate” in what? What am I “participating” in when I respond to “How are you?” with something other than a true accounting of my current state? Originally, you asked about “participation” in the ritual you described. But—as I alluded to in my response—there is a clear difference between the two halves of this “ritual”! Are you treating them as a single unit? If so—why? If not, then I am hard-pressed to parse your remarks. Please clarify.

You seem to assume, here, that any “code of honesty” must use the same concept of “honesty” as you do. You may reasonably disagree with my understanding of what constitutes honesty, but it is tendentious to suggest that, in fact, I am using the same concept of honesty as you are, except that I am being dishonest about it. (In other words, you speak as if I share your values but am failing to live up to them, while hypocritically claiming otherwise. The obvious alternative account—one which I, in fact, suggested upthread—is that my values simply differ from yours.)

This is nothing more than a circular restatement that you believe said just-so-story to be true. We already know that you believe this.

Perhaps, but all this means is that the concept of “literal falsehood” which you are using is inadequate to the task of modeling human communication. (Once again, one person’s modus ponens…) The problem with such naive accounts of “truth” in communication is that they do not work—in a very real and precise sense—to predict the epistemic state of actual human beings after certain communicative acts have been undertaken. That should signal to you that something is wrong with your model.

You’re going to have to unpack “clarity” a good bit—as well as explaining, at least in brief, why you consider it to be a desirable thing—before I can comment on this (beyond what I’ve already said, which I do not see that you’ve acknowledged).
If you should participate in the ritual as the B role, then you should participate in the ritual at all. This seems like a straightforward logical consequence? Like, "if you should play soccer as defense, then you should play soccer."

What does "honesty" mean to you, and what is it for? Does honesty ever require doing things that are slightly awkward and cause fewer people to want to talk to you?

It seems like you were implying that there was something illegitimate about me acting based on my social models? If you don't think this then there is no conflict here.

I agree that you should not update on someone saying "X" by proceeding to condition your beliefs on "X." You at least have to take pragmatics and deception into account. I think this is a case of deception in addition to pragmatics, rather than pragmatics alone. You would not expect people saying "fine" if they were not fine from a model like the ones on this page where agents are trying to inform each other while taking into account the inferences others make; you would get it if there were pressure to deceive.

"Clarity" means something like "information is being processed in a way that is obvious to all parties." For example, people are able to ask about what state each other are in, and either receive a true answer or a refusal to provide this information. When things aren't fine, this quickly becomes obvious to everyone. And so on.

This is often desirable for a bunch of reasons. For example, if I can track what state my friends are in, I can think about what would improve their situations. If I know things aren't fine more generally, then I can investigate the crisis and think about what to do about it.

This is not always desirable. For example, if someone were doing drugs and the police were questioning them about it, then it would probably be correct for them to optimize for unclarity by lying or misdirecting (assuming they can get away with it). But optimizing against clarity is pretty much the
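For what it's worth, the kind of model being gestured at here (a "rational speech acts" style setup, where a speaker chooses utterances so that a literal listener recovers the true state) can be sketched in a few lines. This is an illustrative toy version, not the code from the linked page; the states, utterances, and rationality parameter are all assumptions made up for the example:

```python
# Toy "informative speaker" model, rational-speech-acts style.
# All specifics here are illustrative assumptions, not taken from the linked page.
from math import exp

states = ["fine", "not_fine"]       # possible states of the speaker
utterances = ["fine", "not_fine"]   # things the speaker can say

def literal(utterance, state):
    """Literal semantics: an utterance is true only of the matching state."""
    return 1.0 if utterance == state else 0.0

def listener(utterance):
    """Literal listener: P(state | utterance), with a uniform prior over states."""
    scores = {s: literal(utterance, s) for s in states}
    total = sum(scores.values())
    return {s: v / total for s, v in scores.items()}

def speaker(state, alpha=4.0):
    """Informative speaker: soft-maximizes the chance the listener recovers
    the true state; assigns zero weight to literally false utterances."""
    scores = {}
    for u in utterances:
        p = listener(u)[state]
        scores[u] = exp(alpha * p) if p > 0 else 0.0
    total = sum(scores.values())
    return {u: v / total for u, v in scores.items()}

print(speaker("not_fine"))  # {'fine': 0.0, 'not_fine': 1.0}
```

The point: a speaker who is only trying to inform never says "fine" when things are not fine, because that utterance is literally false. So if everyone says "fine" regardless of their state, that pattern is not predicted by informativeness plus pragmatics alone; you need some additional pressure (e.g. to deceive, or to preserve the Things Are Fine narrative) to get it.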
For saying "fine" when greeted with "how are you?" or "how's it going?" to be a case of deception in addition to pragmatics, it would need to be the case that the person saying "fine" expects to be understood as saying that their life is in fact going well. I don't think people generally expect that.

(Though there's something kinda a bit like that that people maybe do expect. If my life is really terrible at the moment then maybe my desire for sympathy might outweigh my respect for standard conventions and make me answer "pretty bad, actually" instead of "fine"; so when I don't do that, I am giving some indication that my life isn't going toooo badly; so if it actually is but I still say "fine", maybe I'm being deceptive. But that's only the case in so far as, in fact, if my life were going badly enough then I would be likely not to do that.)

Having this convention isn't (so it seems to me) "optimizing against clarity" in any strong sense. That is: sure, there are other possible conventions that would yield greater clarity, but it's not so clear that they're better that it makes sense to say that choosing this convention instead is "optimizing against clarity". (For comparison: imagine that someone proposes a different convention: whenever two people meet, they exchange bank balances and recent medical histories. This would indeed bring greater clarity; I don't think most of us would want that clarity; but it seems unfair to say that we're "optimizing against clarity" if we choose not to have it.)
Almost no one expects marketers to actually tell the truth about their products, and yet it seems pretty clear that marketing is deceptive. I think this has to do with common knowledge: even though nearly everyone knows marketing is deceptive, this isn't common knowledge to the point where an ad could contain the phrase "I am lying to you right now" without it being jarring. The convention is optimized for preventing people from giving information about their state that would break the narrative that Things Are Fine. People's mental processes during the conversation will actually be optimizing against breaking this narrative even in cases where it is false. See Ben's comment here.
You may be right about marketing and common knowledge; if so, then I suggest that the standard "how are you? fine" script is common knowledge; everyone knows that a "fine" answer can be, and likely will be, given even if the person in question is not doing well at all.

I agree that when executing the how-are-you-fine script people are ipso facto discouraged from giving information about their state that contradicts Things Are Fine. That's because when executing that script, no one is actually giving any information about their state at all. If you actually want to find out how someone's life is going, that isn't how you do it; you ask them some less stereotyped question and, if they trust you sufficiently, they will answer it.

Again, if the how-are-you-fine script were taken as a serious attempt to extract (on one side) and provide (on the other) information about how someone's life is going, then for sure it would be deceptive. But that's not how anyone generally uses it, and I don't see a particular reason why it should be.
I was going to write a longer response but this thread covers what I wanted to say pretty well.
Said Achmiz
You have made this sort of assertion several times now; I’d like to see some elaboration on it. What sorts of social contexts do you have in mind, when you say such things? On what basis do you make this sort of claim?
Person A and B are acquaintances. A asks B "how are you?" B is having serious problems at work, will probably be fired, and face serious economic consequences. B says "fine."

Why did B say "fine" when B was in fact not fine? Suppose B said "I'm going to lose my job and be really poor for the near future." Prediction: this will be awkward. Why would this be awkward? Hypothesis: it is awkward because it contradicts the idea that things are fine. While this contradiction exists in the conversation, A and B will feel tension.

Tension can be resolved in a few ways. A could say "oh don't worry, you can get another job," contradicting the idea that there is a problem in an unhelpful way that nevertheless restores the narrative that things are fine. A could also say "wow that really sucks, let me know if you need help" agreeing that things aren't fine and resolving the tension by offering assistance. But A might not want to actually offer assistance in some cases. A could also just say "wow that sucks;" this does not resolve the tension as much as in the previous case, but it does at least mean that A and B are currently agreeing that things aren't fine, and A has sympathy with B, which ameliorates the tension.

Compare: rising action in a story, which produces tension that must be resolved somehow.
Said Achmiz
I see. The account you present is rather abstract, and seems to be based on a sort of “narrative” view of social interactions. I am not sure I understand this view well enough to criticize it coherently; I also am not sure what motivates it. (It is also not obvious to me what could falsify the hypothesis in question, nor what it predicts, etc. Certainly I would appreciate a link or two to a more in-depth discussion of this sort of view.)

In any case, there are some quite obvious alternate hypotheses, some of which have been mentioned elsethread, viz.:

* The “Copenhagen interpretation of ethics”
* Guarding against a disadvantageous change in power relations
* A simple desire for privacy

All of these alternate hypotheses (and similar ones) make use only of simple, straightforward interests and desires of individuals, and have no need to bring in abstract “narrative” concepts.
Said Achmiz
I see, thanks. This is not how I would normally use the word “clarity” (which is why I said “it does no such thing” in response to your claim that the norm in question “optimizes against clarity”). That having been said, your usage is not terribly unreasonable, so I will not quibble with it. So, taking “clarity” to mean what you described…

…I consider this sort of “clarity” to not be clearly desirable, even totally ignoring the sorts of “adversarial/zero-sum” situations you allude to. (In fact, it seems to me that a naive, unreflective dedication to “clarity” of this sort is particularly harmful in many categories of potentially-positive-sum interactions!)

This is a topic which has been much-discussed in the rationalist meme-sphere (and beyond, of course!) over the last decade; I confess to being surprised that you appear to be unaware of what’s been said on the subject. (Or are you aware of it, but merely disagree with it all? But then it seems to me that you would, at least, not have been at all surprised by any of my comments…) I do not, at the moment, have the time to hunt for links to relevant writings, but I will try to make some time in the near future.
Said Achmiz
Given the clarification in this subthread, let me now go ahead and respond to this bit: Indeed, you certainly would not; the problem, however, lies in the assumption that someone responding “Fine” to “How are you?” is trying to inform the asker, or that the asker expects to be informed when asking that question. In any case, this is a point we’ve covered elsethread.
Said Achmiz
This is very bizarre logic, to be frank. The entire conception of such social interactions as coherent “rituals” that both the asker and the asked are willing “participants” in, qua ritual, is quite strange, and does not accord with anything I said, or any of my understanding of the world.

That question is the genesis of quite a long discussion. I hardly think this is the time and place for it.

That is certainly not out of the question.

I don’t know about “illegitimate”, but basing your social models on unverified just-so-stories is epistemically unwise.

What do you mean by “deception”, here? If a casual acquaintance greets me with “How are you?” and I respond with “Fine, you?”—in a case when, in fact, a monster truck has just run over my favorite elephant—do you consider this an instance of “deception”? If so, do you view “deception” as undesirable (in some general sense) or harmful (to the said casual acquaintance)?

That page seems to be some sort of highly technical discussion, involving code in a language I’ve never heard of. Would you care to summarize its core ideas in plain language, or link to such a summary elsewhere? Failing that, I have no comment in response to this.

(rest of your comment addressed in a separate response)
Re: "ritual," it seems like "social script" might have closer to the right connotations here.

My main point here is that, if you are trying to build a reputation as being unusually honest, yet you lie because otherwise it would be slightly awkward and some people would talk to you less, then your reputation doesn't actually count for anything. If someone won't push against slight awkwardness to tell the truth about something only a little important, why would I expect them to push against a lot of awkwardness to tell the truth about something that is very important?

By definitions of "honesty" commonly used in American culture, being unusually honest usually requires doing things that are awkward and might cause people to talk to you less. For example, in the myth about George Washington chopping down a cherry tree, it is in fact awkward for him to admit that he chopped down a cherry tree, and he could face social consequences as a result. But he admits it anyway, because he is honest. (Ironically this didn't actually happen, but this isn't that important if we are trying to figure out what concepts of honesty are in common usage)

I would count saying "fine" when you are not fine to be a form of deception, one which is usually slightly harmful to both participants, but only slightly. For someone who is not attempting to be unusually honest as a matter of policy, this is not actually a big deal. It might be worth saying "fine" to minimize tension. But the situation is very different for someone attempting to be unusually honest as a matter of policy. This type of person is trying to tell the truth almost all the time, even when it is hard and goes against their local incentives. There may be some times when they should lie, but it should have to be a really good reason, not "it would be slightly awkward if I didn't lie." If someone is going to lie whenever the cost-benefit analysis looks at least as favorable to lying as it does in the "saying you are fine when y
Said Achmiz
I assume you mean, infer that not all the apples are red? In any case, thanks for the summary. It sounds like it’s simply the Gricean maxims / the concept of implicature, which is certainly something I’m familiar with.
Whoops, thanks for the correction (edited comment).
Said Achmiz
I don’t really know that this makes your comments about it any more reasonable-sounding, but in any case this sub-point seems like a tangent, so we can let it go, if you like.

I just don’t think that this identification of “honesty” with “parsing spoken sentences in the most naively-literal possible way and then responding as if the intended meaning of your interlocutor’s utterance coincided with this literal reading” is very sensible. If someone did this, I wouldn’t think “boy, that guy/gal sure is unusually honest!”. I’d think “there goes a person who has, sadly, acquired a most inaccurate understanding, not to mention a most unproductive view, of social interactions”.

Suppose you are asked a question, where all of the following are true:

1. Your interlocutor neither expects nor desires for you to take the question literally and answer it truthfully.
2. You know that you are not expected to, and you have no desire to, take the question literally and answer it truthfully.
3. Your interlocutor would be harmed by you taking the question literally and answering it truthfully.
4. You would be harmed by you taking the question literally and answering it truthfully.

Do you maintain that, in such a case, “honesty” nevertheless demands that you do take the question literally and answer it truthfully? If so, then this “honesty” of yours seems to be a supremely undesirable trait to have, and for one’s friends and acquaintances to have. (I maintain the scare quotes, because I would certainly not assent to any definition of “honesty” which had the aforesaid property—and, importantly, I do not think that “honesty” of this type is more predictive of certain actually desirable and prosocial behaviors, of the type that most people would expect from a person who had the as-generally-understood virtue of honesty.)

I would be interested to hear why you think this. It seems incorrect to me.

Once again, you are relying on a very unrealistic characterization of what is taking p
These are not necessarily mutually exclusive explanations. Sometimes the point of a social transaction is to maintain some particular social fiction.
I don't make this identification, given that I think honesty is compatible with pragmatics and metaphor, both of which are attempts to communicate that go beyond this. I would identify honesty more with "trying to communicate in a way that causes the other person to have accurate beliefs, with a significant preference for saying literally true things by default."

Depends on the situation. If it's actually common knowledge that the things I'm saying are not intended to be true statements (e.g. I'm participating in a skit) then of course not. Otherwise it seems at least a little dishonest. Being dishonest is not always bad, but someone trying to be unusually honest should avoid being dishonest for frivolous reasons. (Obviously, not everyone should try to be unusually honest in all contexts.)

If you're pretty often in situations where lying is advantageous, then maybe lying a lot is the right move. But if you are doing this then it would be meta-dishonest to say that you are trying to be unusually honest.

I think saying false things routinely to some extent trains people to stop telling truth from falsity as a matter of habit. I don't have a strong case for this but it seems true according to my experience.

This is a pretty severe misquote. Read what I wrote.
Said Achmiz
Most of your comment seems to indicate that we’ve more or less reached the end of how much we can productively untangle our disagreement (at least, without full-length, top-level posts from one or both of us), but I would like to resolve this bit: Well, first of all, to the extent that it’s a quote (which only part of it is), it’s not a misquote, per se, because you really did write those words, in that order. I assume what you meant is that it is a misrepresentation/mischaracterization of what you said and meant—which I am entirely willing to accept! (It would simply mean that I misunderstood what you were getting at; that is not hard at all to believe.) So, could you explain in what way my attempted paraphrase/summary mischaracterized your point? I confess it does not seem to me to be a misrepresentation, except insofar as it brackets assumptions which, to me, seem both (a) flawed and unwarranted, and (b) not critical to the claim, per se (for all that they may be necessary to justify or support the claim).
Agreed that further engagement here on the disagreement is not that productive. Here's what I said: I am not saying that, if someone says they are fine when they are not fine, then necessarily they will lie about important things. They could be making an unprincipled exception. I am instead saying that, if they lied whenever the cost-benefit analysis looks at least as favorable to lying as in the "saying you are fine when you are not fine" case, then they're likely going to end up lying about some pretty important things that are really awkward to talk about.
I think that Said is arguing that they're making a *principled* exception. Vaniver's comment makes a decent case for this.
Said Achmiz
Yes, this is correct. The exception is entirely principled (really, I’d say it’s not even an exception, in the sense that the situation is not within the category of those to which the rule applies in the first place).
Said Achmiz
I see. It seems those assumptions I mentioned are ones which you consider much more important to your point than I consider them to be, which, I suppose, is not terribly surprising. (I do still think they are unwarranted.) I will have to consider turning what I’ve been trying to say here into a top-level post (which may be no more than a list of links and blurbs; as I said, there has been a good deal of discussion about this stuff already).
This norm actually prevents me from asking people how they are. I literally can't ask the question in those words. I can say the words, but they will be parsed as a social nicety, not as a literal question. Instead I have to participate in the expensive dance I described in that old blog post Raemon linked.
Said Achmiz
To reiterate: As an n=1 data point that, I suppose, you may feel free to ignore… I can report that—despite myself being on the spectrum, and most of my friends being “nerds” of some description—I really do not have this supposed difficulty of being unable to ask people whom I care about and who I am close enough to that they are willing to share personal details with me questions about how their life is really going. Consider the possibility that if someone “mistakes” your supposed “question about their life” for a mere greeting, then that is not because these gosh-darn normie norms are getting in the way of Honesty™—but rather, it is because this person is not interested in baring their souls to you, and is using this very convenient and useful conversational norm to deflect your question by treating it as a mere greeting (or similar contentless conversational filler), making use of the plausible deniability the norm provides to avoid any awkwardness and the potential for loss of face on either side.
This doesn't seem like a response to what I actually said. If you took me to be implying something else, maybe you can explain what that is?
Said Achmiz
Eh? I just reread your comment, three times. I don’t understand your objection. I seem to be responding directly to what you said.
To be a bit more direct, this seems like it begs the question: This seems to conflate two different levels of abstraction: That does in fact seem like a person motivated not to disclose information, lying in a socially approved way in order not to disclose that information. I'm not sure how to characterize that, if not as getting in the way of honesty. Not just honesty between the two of us, but also between other pairs where one or the other party doesn't know if they're in the same position or not.
Said Achmiz
Not at all. I know perfectly well who the described people are, and who they are not, on a great deal of evidence other than whether I can ask them how they are and get an answer. I would not characterize it as “getting in the way of honesty”. I would only make this characterization if both parties were fully willing to be “honest” (i.e., straightforwardly communicate using the literal meaning of words), but were impeded or outright prevented from doing so due to norms like this. Whereas in cases such as I describe, one party has no desire at all to cooperate with the other party (by truthfully answering the asked question); the norm, then, does not “get in the way of honesty”, but rather serves to enable the desired evasion. Once again: a norm (or, indeed, anything else) can only properly be said to be “getting in the way of honesty” if “honesty” is intended, but prevented. Where it is not intended, saying that the norm is “getting in the way” is misleading, at best.
It gets in the way of honesty in something like the way liars get in the way of communication, or spam gets in the way of email. Liars aren't trying and failing to communicate the truth, but they're making it harder for truthtellers to be believed. Spam emails aren't trying to give me important information, but they're making it more expensive for me to read important emails.
Ah, this finally clarified the discussion for me. I still don’t think the problem is that bad, because it’s fairly easy for me to say ‘how are you actually?’ and it pretty much seems to work.
Also, remember that we're actually still dealing with the aftermath of a minor discourse disaster in which I accidentally cast a scapegoating spell (I really am sorry, Duncan!) against a person when trying to vividly criticize a policy proposal. (You correctly noted that I was using words in a way that were going to predictably generate adverse side effects.) I think the total cost of things like this is way higher than you're noticing, if you add up the additional interpretive labor burden, foregone discourse, and demon threads. Not saying there's an easy solution, or that we're not getting important nice things from the status quo, but the costs of this situation really are quite high.
I agree that in any one case it doesn't cost much - when you think of it - to actually ask the question. But the need to do that means that costs of really asking "how are you?" scale linearly with number of such interactions, and there's a strong affordance for asking the question the generally-recognized-as-fake way. This means that parts of your brain that attend to verbal narratives are getting trained on a bunch of experiences where you ask people how they are and they tell you they're fine (and vice versa). This plausibly leads to some systematic errors.
Said Achmiz
These analogies, however, can hardly be apt, given that the one who asks “How are you?” does so knowingly (he can simply say something else, or nothing at all!), and also does not expect a truthful answer to the (literally-interpreted) question. What are the analogous aspects of the “liars” or “spam” scenarios?
Someone with an email account generally knows they will receive some spam. (You could instead refuse to look at your email, and then you'd never read spam!) Someone who lets people tell them things generally knows they will be lied to from time to time.
Said Achmiz
This is not analogous, because whereas spam is not the desired and expected sort of received email, and a lie is not the desired and expected sort of received utterance, “Fine” (or similar phatic speech act) is precisely the expected response to “How are you?”. In other words, an (untruthful) answer of “Fine” (or similar) is not—unlike spam, or lies—a bad and wrong thing, that you nonetheless tacitly accept as an occasional cost-of-doing-business (much as you might accept that some apples you purchase may occasionally be bruised—regrettable, but that’s life). Rather, it is simply how the interaction is supposed to go. I am perplexed by your persistent inability to grasp this point. Once again: if we’re casual acquaintances, I greet you with an ordinary “How are you?”, and you respond by telling me about your life for five minutes, I will consider this to be defection on your part.
You're right that it's not a perfect analogy. However, to a spammer, sending a spam email and occasionally getting a response from a naive or gullible person is definitely how the process is supposed to go. They have a different agenda than I, a normal reader of emails, do, and theirs interferes with mine. Likewise, people who lie about how they are, or punish others for not lying, have one agenda, and I have another, and theirs interferes with mine.

A more precise analogy would be VCs who won't fund startups that won't exaggerate their prospects or performance in standard ways. People working on a Y Combinator startup actually told me this (I'm not just guessing here); they didn't initially think of it as lying, but confirmed that a third party who took their representations literally would definitely be systematically misled into overvaluing the company. Cf. Ben Kuhn's post here.
Said Achmiz
No, this is still not analogous. It would only be analogous if the receiver of the spam email also viewed receiving spam as “how the process is supposed to go”. Let us distinguish two cases. In the first case, “How are you?” is a greeting, and “Fine” is a reply. The former is not a question, and the latter is not an answer, and consequently it is not, and cannot be, a lie. In the second case, the asker really does intend to ask how the other person is; but the target has no desire to answer. In that case, “Fine” is, indeed, a lie. It is, however, a lie which the target has every right to tell (and any norm which condemns such lies is a bad one, and should be opposed). We can indeed analogize the second scenario to the “spam email” case. But it’s the asker who is the spammer in the analogy, not the target! That is: the asker is attempting to have an interaction which their target has no desire to have. The target, meanwhile, is acting in a way which is entirely right and proper, a way in which they have every right to act. (No comment on the startups thing; I have insufficient knowledge of the startup world to have serious opinions about how things go there.)
I suppose another way of thinking about this might be that in contexts where there is a sufficiently strong expectation that one will say certain words as part of a social ritual, with implications that are at best very indirectly related to the literal meaning of those words, "lie" is a type error. On this model, we could just say that "How are you?" handshakes are using actor norms rather than scribe norms.

What I'm saying is that it's not at all just a chance coincidence that the actor norms happen to use words that sound like they mean something specific in scribe dialect. The scribe-dialect meaning functions as a sort of jumping-off point for the formation of a customary social action. This has the important side effect of preventing people from unambiguously using those words in scribe-dialect. The accumulated effect of all such transformations is a huge drag on efficiency of communication about anything there's not already an established custom for.
Said Achmiz
Indeed, it certainly is not a chance coincidence; as I explained elsethread, that the handshake sounds like a question allows it to serve the additional, and highly useful, function of granting someone plausible deniability for deflecting actual prying/questioning with non-answers in a socially acceptable way. (My comments about “power relations” say more on this.)
I haven't said they're acting wrongly; I've said that they're lying in a socially sanctioned way. If you don't think these are distinct claims, why not?
I wonder how much of the problem is exactly this. Claiming that someone is lying is, by default, claiming that they are doing something wrong. So if something isn't wrong, it must not be lying - thus people say that things 'aren't really lying' rather than biting the bullet and saying that lying is OK in that situation. This does seem to break down in sufficiently clear circumstances (e.g. the Gestapo searching for Jews in the attic), but even then I think there's a strong instinctual sense in which people doing this don't consider it lying.
Also, it seems to me as though when people evaluate the "Jews in the attic" hypothetical, "Gestapo" isn't being mapped onto the actual historical institution, but to a vague sense of who's a sufficiently hated adversary that it's widely considered legitimate to slash their tires. In Nazi Germany, this actually maps onto Jews, not the Gestapo. It maps onto the Gestapo for post-WWII Americans considering a weird hypothetical. To do the work of causing this to reliably map onto the Gestapo in Nazi Germany, you have to talk about the situation in which almost everyone around you seems to agree that the Gestapo might be a little harsh but the Jews are a dangerous, deceptive adversary and need to be rooted out. Otherwise you just get illusion of transparency.
Related: arguments ostensibly for a policy of universal "honesty" or "integrity," on the basis of "adopt the policy you'd be rewarded for if people could inspect the policy directly," tend to conflate lying with saying socially disapproved-of things. In fact people will punish you for lying when you're supposed to tell the truth, and for telling the truth when you're supposed to lie, and largely reward you for conforming with shared fictions.
Said Achmiz
Note this comment, where I clearly distinguish between the case of “not actually lying” and “lying, but lying is perfectly OK in this circumstance”.
Weakman. What about simply "Horrible!"?
Said Achmiz
Hardly. Same.
Where did Jessica propose an unencouraged five-minute monologue? "Horribly!" usually takes far less time to pronounce.
Said Achmiz
In the linked comment.
I can't find anything in the linked comment that says that.
Said Achmiz
Is your quibble that this does not literally specify a duration of exactly five minutes? How long do you think it takes to “actually [talk] about things that have been going on in [one’s] life lately”? Is it four minutes? Three minutes? Is five right out? Might it, in fact, sometimes take six minutes, or even seven?
To be clear, I usually just talk about one thing, and then that jumps off into some other discussion. Sorry for the confusing wording.
Maybe we're talking past each other. What do you think my position is, and what about it seems like it reflects a failure to grasp that point?
I'm actually just saying this norm imposes substantial costs by impeding communication. You indicated that you would punish people for answering the literal question honestly: I'm pointing out that a norm of punishing such behavior prevents me from actually asking the question in the most straightforward terms available. This substantially increases the cost of this sort of communication. It seems like you're assuming a follow-up argument along the lines of: therefore, it doesn't make sense that someone might locally want to follow the norm or be protected by it. But I'm actually not saying that. I'm just saying that punishing people for taking "how are you?" literally prevents some communication.
Said Achmiz
Yes. I understand what you’re saying. What I would like you to either acknowledge as correct, or clearly state your disagreement with, is the proposition that this result you describe constitutes the said social norm working as intended.
Yes, that's the social norm working as intended, insofar as "intent" is a relevant paradigm here.

This was the post that got the concept of a "robust agent" to click into place in my mind. It also more concretely got me thinking about what it means to be honest, and how to go about that in a world that sometimes punishes honesty.

Since then, I've thought a lot about meta-honesty as well as meta-trust (in contexts that are less about truth/falsehood). I have some half-finished posts on the topic I hope to share at some point.

This also had some concrete impacts on how I think about the LessWrong team's integrity, which made its way into several conversations that (I'd guess?) made their way into habryka's post on Integrity, as well as my Robust Agency for People and Organizations.

I prefer not to lie, but there are so many cases where the weight of projected futures is overwhelmingly in favor of lying that I can't call it a rule, or give much moral weight to it.

Lying has costs (it's unpleasant, if found out, reduces trust, etc.). Truth-telling has costs (hurt feelings, punishments, etc.). Silence (including Glomar's response) has a cost (much the same as both lying and truth-telling). Weighing costs and benefits of actions (including communication and signals to other entities) is what we do.

Any sane decision theory will choose "lie" in some inputs, "truth" in others, and "silence" in still others.

Note: I do subscribe to the (rejected by you) notion that

rationalists ought to practice lying, so that they could separate their internal honesty from any fears of needing to say what they believed.

Belief is different from communication, which is different from signaling/manipulation. They are all mixed up in different proportions in different contexts, and trying to generally solve for one without acknowledging the others is likely to lead to pain.

I also think that the idea of "honesty" AND by extensio... (read more)

I'm pretty sure you are correct that honesty is a sort of signaling thing, but I do not find it possible to "join in the signaling when it is useful" - it seems to me that evidence as to the honesty / dishonesty of a person usually accumulates slowly, so you more-or-less have to pick a strategy and stick to it. (My personal experience is that I have a hard time getting people to believe the things I say even when I'm ~100% honest, and that my persuasiveness goes downhill rapidly if I dial that back.)

In the usual situation where the gestapo questions you, I think you are correct. However, the hypothetical was unusual in that:

1) The gestapo agent is fluent in meta-honesty
2) The gestapo agent knows that you firmly adhere to a code of meta-honesty
3) #1 and #2 are common knowledge between you and the gestapo agent

Together, these (as Eliezer notes, very unlikely) requirements mean that not "playing the meta-honesty game" by directly lying is in fact a strong tell of object-level dishonesty - why would you break your code of protecting your counterfactual selves if you were not hiding *actual* Jews? (Or at least nervous because of the proximal authority figure.)

Again, I agree that in reality, this falls apart - for instance, without #1 your response reads as prevarication, and without #3 you'd likely lie and be caught lying. (It is interesting that, unless I'm missing something, you don't have to assume #2 - if the agent doesn't know that you're meta-honest, you don't get punished for that strategy; you just don't get the benefit from your long history of meta-honest conduct.)
Yes, intentional signaling is hard, and the easiest way to do it is to just be honest most of the time. I follow and recommend this (though I do NOT recommend truthful-but-misleading linguistic games in most cases, and don't make much moral distinction between that and lying. It is more deniable, so more convenient signaling). But I don't hesitate to diverge when it's clearly positive-value. To be clear, I recommend meta-lie-ing to the gestapo as well. Claim that I've gone further down the road since they read my blogs and taken a vow of absolute object-level truth. Claim that I've discovered meta-meta-truth, which prevents withholding information even if not asked, and confess some other minor crime instead. Whatever it takes to get them to leave.

My own honesty pledge would be summarised as something like this:

1) I will try hard to not mislead you, unless the circumstances are extreme (Jews in the attic).

2) Consequently, I will lie in vacuous social interactions ("How was my play?" "It was fine"), but will tell only the truth if pressed.

3) I may, and will, refuse to answer your question, for valid reasons or just randomly. This may not involve explicit Glomarizing; but I will explicitly Glomarize if pressed.

4) I won't meta-lie.

On point 3), I think semi-inconsistent Glomarizing is almost as good as carefully strategic Glomarizing - indeed, it may be better, if it makes you less predictable.

Wasn't your old rule officially "don't lie to someone unless you would also feel good about slashing their tires given the opportunity?" Or something very close to that? That already solves the standard Kantian problems.

This chunk felt like the biggest difference between meta-honesty and "tire slash":

Harry shook his head. "No," said Harry, "because then if we weren't enemies, you would still never really be able to trust what I say even assuming me to abide by my code of honesty. You would have to worry that maybe I secretly thought you were an enemy and didn't tell you.

If I'm following the old rule, you probably want to know in what situations I'd feel good slashing your tires. If I actually felt okay slashing your tires, I'd probably also be invested in making you falsely believe I wouldn't slash your tires. This makes it hard to super soundly, within one's honesty code, let someone know when you would or wouldn't be lying to them.

If I'm following meta-honesty, it seems like I can say, "I wouldn't lie to you about being on your side unless XYZ doomsday scenario", and that claim is as sound as my claim to be meta-honest. Now, if I say I'm on your side (not going to slash tires / lie), and you trust my claim to be meta-honest, you can believe me with whatever probability you assign to us not currently being in a doomsday scenario.

Rob Bensinger
The quotation is from Black Belt Bayesian:

I don't recommend this post for the Best-of-2018 Review.

It's an exploration of a fascinating idea, but it's kind of messy and unusually difficult to understand (in the later sections). Moreover, the author isn't even sure whether it's a good concept or one that will be abused, and in addition worries about it becoming a popularized/bastardized concept in a wider circle. (Compare what happened to "virtue signaling".)

one that will be abused, and in addition worries about it becoming a popularized/bastardized concept in a wider circle. (Compare what happened to "virtue signaling".)

This is a terrible rationale! Our charter is to advance the art of human rationality—to discover the correct concepts for understanding reality. I just don't think you can optimize for "not abusable/bastardizable if marketed in the wrong way to the wrong people" without compromising on correctness.

Concepts like "the intelligence explosion" or "acausal negotiation" are absolutely ripe for abuse (as we have seen), but we don't, and shouldn't, let that have any impact on our work understanding AI takeoff scenarios or how to write computer programs that reason about each other's source code.

And likewise "virtue signaling." Signaling is a really important topic in economics and evolution and game theory more generally. If we were doing a Best-of-2014 review and someone had written a good post titled "Virtue Signaling", I would want that post to be judged for its contribution to our collective understanding, not on whatever misuse or confusion someone, somewhere might subsequently have attached to the same two-word phrase

... (read more)
You still end up with a shot foot. People tend to confuse solving problems with apportioning blame -- I call it the "guns don't kill people" fallacy.

(Because often "I'm fine" is false, you see. If this has never bothered you then you are perhaps not in the target audience for this essay.)

This does bother me, but I’ve come to the conclusion that “How are you?” usually isn’t really a question - it’s a protocol, and the password you’re supposed to reply with is “Fine.” Almost no-one will take this to mean that you actually are fine, in my experience - they will take it to mean that you are following the normal rules of conversation, which is true. It’s much like how I can tell jokes, use idio... (read more)

Yosarian T
Yeah, this is correct. Also, I think "I'm fine" is generally literally true in the narrow sense, since it's literally true that, for example, I am not in urgent need of medical attention at this moment. If I were literally bleeding to death and someone asked me how I was and I said "I'm fine", people would take that to be a falsehood in some sense. But if I'm physically healthy but emotionally upset, and someone asked me how I was and I said "fine", people don't consider that a lie, because it isn't one, in the narrowest sense; I am "fine" in one sense of the word. Which is also why it so often becomes the default answer, because it's almost never a lie, in at least the very narrow "wizard's rule" sense.

This is probably the post I got the most value out of in 2018. This is not so much because of the precise ideas (although I have gotten value out of the principle of meta-honesty, directly), but because it was an attempt to understand and resolve a confusing, difficult domain. Eliezer explores various issues facing meta-honesty – the privilege inherent in being fast-talking enough to remain honest in tricky domains, and the various subtleties of meta-honesty that might make it too subtle a set of rules to coordinate around.

This illustration of "how to contend w

... (read more)

Given how people actually act, a norm of "no literal falsehoods, but you can say deceptive but literally true things" will encourage deception in a way that "no deception unless really necessary" will not. "It's literally true, so it isn't lying" will easily slip to "it's literally true, so it isn't very deceptive", which will lead to people being more willing to deceive.

It's also something that only Jedi, certain religious believers, autists, Internet rationalists, and a few other odd groups would think is a good idea. "It isn't lying because what I said was literally true" is a proposition that most people see as sophistry.

Eliezer mostly talks about the idea that 'No literal lies' isn't morally necessary, but I take it from the "your sentences never provided Bayesian evidence in the wrong direction" goal that he also wouldn't consider this morally sufficient.

I tend to separate the topics of "I prefer to be honest" and "I prefer others to be honest". Both are true, but I approach them very differently. For myself, I pretty much set the default but allow that I'll deviate if I think it's long-term more valuable to do so.

For others, I try to approach it as "permission to be honest". I let people know that I prefer the truth, and I will do my best not to punish them for delivering it efficiently. This is similar to Crocker's Rules not being automatically symmetric... (read more)

One of my favorite posts, that encouraged me to rethink and redesign my honesty policy.

Used as research for my EA/rationality novel, I found this interesting and useful (albeit very meta and thus sometimes hard to follow).

I request that we stop using the Nazis as an example of a go-to fantasy adversary like vampires or zombies. The Gestapo was an actual institution that did real things for particular reasons. "Jews in the attic" shouldn't be parsed as a weird hypothetical like Kant's "murderer at the door" - it's a historical event. You can go to the library and read a copy of Anne Frank's diary.

On another post there was recently a demon thread in which I'm partially at fault, but an important contributing factor was that I was trying to point out specif... (read more)

Upvoted, and I agree with this concern, though I also think I'd have had a harder time digesting and updating on Eliezer's example if he'd picked something more fantastical. Using historical examples, even when a lot of the historical particulars are irrelevant, helps remind my brain that things in the discussed reference class actually occur in my environment.

I agree that historical examples can be helpful. I suspect these can be even more helpful if people vary the examples so they don't wear down into tropes, and check whether the details plausibly match. My reply to Zvi here seems relevant:

It seems to me as though when people evaluate the "Jews in the attic" hypothetical, "Gestapo" isn't being mapped onto the actual historical institution, but to a vague sense of who's a sufficiently hated adversary that it's widely considered legitimate to "slash their tires." In Nazi Germany, this actually maps onto Jews, not the Gestapo. It maps onto the Gestapo for post-WWII Americans considering a weird hypothetical.

To do the work of causing this to reliably map onto the Gestapo in Nazi Germany, you have to talk about the situation in which almost everyone around you seems to agree that the Gestapo might be a little scary but the Jews are dangerous, deceptive fantasy villains and need to be rooted out. Otherwise you just get illusion of transparency.

I felt bad about using it as the example in my comment, feeling the OP should have picked a different example, but did it anyway because the OP did it. Agreed this was an error, we should use Nazis if and only if we actually mean Nazis, and find a better go-to example. Thoughts on what this should be? Kant's literal 'murderer at the door' feels arbitrary and lame to me.
Something that would capture the feeling I think most people have in the Nazis example would be that a soldier from a foreign occupying army is looking for a peaceful dissident hiding in your attic. Genuinely not sure if that's actually a great example for a thing like this.
To be clear, I'm not proposing a bright-line rule against this, just saying that I get the sense that it's an example being selected because it's a well-worn groove to pattern-match "enemy" to "Nazi", not because it was particularly illuminating in this case.

Eliezer discusses the fact that replying “I’m fine” to “How are you?” is literally false. In case anyone’s interested, one answer I've taken to using in response to "How are you?" is “High variance," which is helpfully vague about the direction of the variance.

This is definitely among the top posts that have stuck with me. My instincts push very strongly toward wanting to always be maximally honest, but of course that's not perfectly practical. This post works to recover a principled relationship to truth-telling and honesty even in the face of the real-world necessity to sometimes not maximally promote truth.

I'm afraid this would not work for me, as too much information would be "leaking through the side channels". By that I mean that while I can probably do the reasoning and give non-revealing replies in writing, anyone good at noticing emotions, small movements, delays, etc. would probably be able to learn what I'm trying or not trying to hide on the object level quite easily.

(This may mean that if you are too honest a person at the S1 level, it's plausible you cannot use some strategies at the S2 level.)



Allow me an excursion which is not meant to subsume Eliezer Yudkowsky under Immanuel Kant or vice versa. It is intended to depict what I regard as a related thought process, and point out where I see people often getting sidetracked with regards to what's actually the issue (to my understanding).

Back in the philosophy seminar on Kant's prohibition on lying, I felt everyone was missing the point, and that this (to my understanding) was it:

Sometimes there is no "right thing" you can easily choose. Sometimes your choice is between the bad and the worse. ... (read more)

Ummm... if the feds are questioning you about a potential criminal act, Glomarization is almost always the best answer, because the 5th Amendment gives you a right to do it. And yes, they will take that to mean you did the thing, but they can't legally do anything with that. So worst case is they bs around that, dig a little harder to find evidence on you they probably would have found anyway, and pretend they were always going to try that hard to find the evidence. But the practical reality is the FBI usually only asks you questions they know the ans... (read more)

It’s a human kind of thinking to verbally insist that “Don’t kill” is an absolute rule, why, it’s right up there in the Ten Commandments. Except that what soldiers do doesn’t count, at least if they’re on the right side of the war. And sure, it’s also okay to kill a crazy person with a gun who’s in the middle of shooting up a school, because that’s just not what the absolute law “Don’t kill” means, you know!

Why? Because any rule that’s not labeled “absolute, no exceptions” lacks weight in people’s minds. So you have to perform that the “Don’t kill” com

... (read more)
Rob Bensinger
"Murder" is plausibly a better translation of the original Hebrew, but "kill" is still more common in the English-speaking world. I'm also not seeing how it weighs against Eliezer's point if some absolute rules are worded so as to guard against loopholes, and other absolute rules aren't. Foundational moral rules that count as counter-examples to Eliezer's generalization should look more like "instructions that explicitly encourage people to use their personal judgment about when and how to apply the rule and subjectively factor in idiosyncratic contextual information". I don't think that choosing a pejorative word instead of a more neutral word achieves this. E.g., if "first, do no harm to your patient" were "first, do no harm to your patient unless it looks like a good idea" then Eliezer's generalization would look less plausible; but changing it to "first, do no violence to your patient" or "first, commit no sins against your patient" would count as little or no counter-evidence even though it's more loophole-resistant and builds in stronger evaluative/normative content and connotations.
It absolutely counts as counter-evidence to me. Words have meanings. "Harm" is not the same as "violence". Neither is it the same as a "sin". The first meaning says, "Do not do anything to harm your patient, regardless of intent. If you do something well intentioned, but with bad results, you are still morally at fault." The second means, "Do not take intentional action to harm your patient." The third means, "Do your best to act in your patient's best interest, but you will not be held morally at fault if bad results occur." Those are all different, and the latter two are way looser in my opinion than, "Do no harm."

If everyone in town magically receives the same speedup in their "verbal footwork", is that good for meta-honesty? I would like some kind of story explaining why it wouldn't be neutral.

Point for yes: 
Sure seems like being able to quickly think up an appropriately nonspecific reference class when being questioned about a specific hypothetical does not make it harder for anyone else to do the same.

Point against: 

The code of literal truth only lets people navigate anything like ordinary social reality to the extent that they are very fast on their v

... (read more)
  • Most people, even most unusually honest people, wander about their lives in a fog of internal distortions of reality. Repeatedly asking yourself of every sentence you say aloud to another person, "Is this statement actually and literally true?", helps you build a skill for navigating out of your internal smog of not-quite-truths. For that is our mastery.

I think some people who read this post ought to reverse this advice. The advice I would give to those people is: if you're constantly forcing every little claim you make through a literalism filter, you mig... (read more)

I'm curious how you feel about this response.

If you actually follow the advice about Glomarization, it is no longer improbable that you will be interrogated by someone who has read the rationalist literature on the subject and thought through the consequences. Investigators do their homework, and being committed enough to Glomarize frequently enough to do the intended work will stick out like a sore thumb when your associates are interviewed, and will immediately send the investigator out to read the literature.

Now maybe most investigators aren’t anywhere near this thorough, but if you are facing an investigator who doesn’t even bother looking into your normal behavior, your Glomarization is irrelevant anyway.

...I theoretically ought to answer “I can’t confirm or deny what I was doing last night” because some of my counterfactual selves were hiding fugitive marijuana sellers from the Feds. ...

This seems easy to fix in principle. If, conditioned on the info that's known, or that probabilistically might be known to your asker, your counterfactual selves were especially likely to hide fugitives, you ought to say "I can’t confirm or deny"; otherwise, you can be truthful, and accept the consequence that some negligible fraction of your counterfactual se... (read more)
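The conditional rule this comment proposes can be sketched as a tiny decision function. This is purely illustrative: the threshold value and the probabilities passed in are hypothetical assumptions, not anything from the original post, and in practice "the probability your asker would assign to a counterfactual concealer" is not something you can compute precisely.

```python
# Sketch of the conditional-Glomarization rule described above: Glomarize only
# when, given what the asker plausibly knows, the probability that a
# counterfactual self in your situation would need to conceal something is
# non-negligible. The 0.05 threshold is an arbitrary illustrative assumption.

def respond(p_concealing_counterfactual: float, threshold: float = 0.05) -> str:
    """Return a reply policy given the asker-visible probability that a
    counterfactual self in this situation would have something to hide."""
    if p_concealing_counterfactual >= threshold:
        return "I can't confirm or deny."
    return "answer truthfully"

# An everyday question, where counterfactual concealers are rare:
print(respond(0.001))  # answer truthfully
# A question that itself signals suspicion:
print(respond(0.3))    # I can't confirm or deny.
```

The design choice here matches the comment's point: instead of Glomarizing unconditionally (which is unworkable in practice), you accept that a negligible fraction of counterfactual selves get exposed whenever the conditional probability falls below the threshold.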

Regarding meta-honesty:

I'm going to flip the usual jargon on its head and say that I "agree connotatively, but disagree denotatively".

Meta-honesty - that is, "honesty about honesty" - is, like many meta-concepts, interesting to think about, but I don't quite understand why it needs to be formulated as some sort of "code". As you've presented it here, this "meta-honesty code" seems largely intractable in normal communication, and comes across as an overly-complicated way of simply refusing to hold up "Do not lie"... (read more)

Rob Bensinger
There's nothing inconsistent about saying that some action class A is a subset of B, that all actions in A are impermissible, and that some actions in B are permissible. So I don't understand what inconsistency you're pointing to here. Maybe your point is that "lie" feels like a natural category in a way that "meta-lie" doesn't, so basing your clear bright moral lines around the latter category feels unduly arbitrary?
You've actually hit the nail right on the head and put my thoughts into words I couldn't quite find, thank you. Any moral code that contains non-absolute rules (in this case, "Don't lie, except when...") will of course require some amount of arbitrariness to distinguish it from the infinite range of other possibilities, but given the amount of difficulty the prohibition on "meta-lies" introduces if you decide to also uphold the prohibition on gathering object-level information, it definitely feels excessively arbitrary. Really, the whole thing would work just fine if we were to pick just one of those restrictions: either don't gather object-level information (but be free to meta-lie), or don't meta-lie (but be okay with gathering object-level information). Dealing with both is, as far as I'm concerned, intractable to the point of uselessness.