The Belief Signaling Trilemma

by Scott Garrabrant, 1 min read, 20th Sep 2013, 49 comments

One major issue for group rationality is signaling with dishonest beliefs. Beliefs are used as a signal to preserve your reputation or to show membership in a group. It happens subconsciously, and I believe it is the cause of most of the issues of both religions and political parties. It also happens on Less Wrong, most commonly through not sharing beliefs that you think people will disagree with.

First, let's identify the problem. This is mostly from the viewpoint of ideal agents as opposed to actual humans. You are part of a community of rationalists. You discuss lots of issues, and you become familiar with the other members of your community. As a result, you start to learn which members of the community are the smartest. Of course, your measure of the intelligence of the members is biased towards people who say things that you agree with. The members who say things that you agree with build up more reputation in your mind. This reputation makes you more likely to trust other things that this person says. I am also a member of this community. I have opinions on many things, but there is one issue that I think really does not matter at all. On this issue, most of the community believes X, but I believe not X. By signaling belief in X, I increase my reputation in the community, and will cause other people to take more seriously my views on other issues I think are more important. Therefore, I choose to signal belief in X.

What is happening here is that:

(A) People are assigning reputation based on claims, and judging claims based partially on other beliefs signaled by the same person.

(B) People want their claims to be taken seriously, and so take actions which will preserve and improve their reputation.

Therefore,

(C) People signal beliefs that they believe are false because they are shared by the community.
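The step from (A) and (B) to (C) can be illustrated with a toy sketch. It assumes a hypothetical reputation rule in which every listener adds one point for a claim they agree with and subtracts one for a claim they disagree with; the function names, the scoring rule, and the community size are illustrative assumptions, not part of the argument itself.

```python
def reputation_change(signaled_belief: bool, community_beliefs: list[bool]) -> int:
    """Hypothetical scoring rule: +1 per listener who agrees, -1 per listener who disagrees."""
    return sum(1 if signaled_belief == b else -1 for b in community_beliefs)


def best_signal(community_beliefs: list[bool]) -> bool:
    """Signal that maximizes reputation, ignoring what the speaker privately believes."""
    return max((True, False), key=lambda s: reputation_change(s, community_beliefs))


# Assumed setup: eight of ten members believe X (True);
# the speaker privately believes not-X (False).
community = [True] * 8 + [False] * 2
private_belief = False

honest_gain = reputation_change(private_belief, community)       # signaling not-X
strategic_signal = best_signal(community)                        # what a reputation-maximizer says
strategic_gain = reputation_change(strategic_signal, community)  # signaling X
```

Under these assumptions the honest signal costs reputation while the strategic signal gains it, so an agent who cares only about being taken seriously on other issues signals X regardless of what they privately believe.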

Signaling honest beliefs is kind of like cooperating in a prisoner's dilemma. It helps the community push towards reaching what you believe are valid conclusions, at the cost of your own reputation. It is possible for us to decide as a community that we want to cooperate, especially with tools such as anti-kibitzer. However, there is more than one way to do that. I think there are three options. I think they are all theoretically possible, but I think they are all bad.

(1) We can agree to stop assigning reputation based on beliefs.

This option is bad because there is a loss of information. People who made the right choice on one issue are more likely to make the right choice on other issues, and we are ignoring this correlation.

(2) We can agree to always report honest beliefs even though we know it will cost us reputation.

This option is bad because it encourages self-deception. If you commit to honestly report beliefs, and you can gain more reputation by reporting belief in X, you may trick yourself into thinking that you believe X.

(3) We can allow dishonest reporting of beliefs to continue.

This option is bad because it causes a bias. The community will get a source of evidence biased towards their current beliefs.

Which option do you prefer? Am I missing a fourth option? Is one of the choices obviously the best or obviously the worst? Should we combine them somehow? Am I modeling the problem entirely wrong?

(1) We can agree to stop assigning reputation based on beliefs.

I don't think that this is a realistic option, even ignoring the loss of information. Sure, if David Icke signed up on Less Wrong, I could claim that I was assigning his beliefs an equal amount of reputation as everyone else's... but I wouldn't actually be doing that, and neither would anyone else here who claimed to.

(2) We can agree to always report honest beliefs even though we know it will cost us reputation.

This one doesn't seem realistic either. We can certainly try to encourage people to always report their honest beliefs, but there's no way you can enforce that, and many people will keep hiding beliefs if they think that revealing those beliefs would hurt them. Note that this would also force many people to use pseudonyms, since any unpopular opinions that are posted here will also be visible to Google, which might hurt them in real life.

I think that it's important to signal that we don't expect anyone to have psychologically unrealistic levels of altruism or community commitment. People are going to act according to their own self-interest here, and that's perfectly fine as long as it doesn't actively hurt the community. Associating with LW shouldn't make anyone feel actively uncomfortable, and compulsory radical honesty is going to make a lot of people feel uncomfortable. Keeping some of your beliefs as secrets is fine; everyone here has their right to mental privacy.

Self-censorship of publicly expressed beliefs also has value in improving the signal-to-noise ratio. If someone here really did believe that shape-shifting reptilian aliens controlled all the major governments, we wouldn't want them to regularly bring up this position.

I agree that it would be hard to enforce 2. This really is like a prisoner's dilemma. 1 however is very possible to do in an online community by not posting usernames with comments.

1 however is very possible to do in an online community by not posting usernames with comments.

As a result you get the quality of discussion that exists on 4chan and in venues such as YouTube comments, where nobody knows each other.

By signaling belief in X, I increase my reputation in the community, and will cause other people to take more seriously my views on other issues I think are more important. Therefore, I choose to signal belief in X.

See, this kind of comment makes me conclude I'm a weirdo from another planet. Feigning beliefs for approval is something I suppose I've engaged in a time or two, and to the best of my recollection, quickly felt the need for a shower. And on an internet forum? What the hell is the point? So that people can respect some fictitious creature who isn't me?

I certainly don't do this here. Why would I play the part of someone with views I disagree with, for approval from others exactly where I believe they are wrong? How peculiar. How distasteful. Ugh. I'm glad I was born from a different species.

Me, I want the respect of people I respect, the people I think are right. Why would I pander to the wrong? Ugh.

Why would I play the part of someone with views I disagree with, for approval from others exactly where I believe they are wrong? How peculiar. How distasteful. Ugh. I'm glad I was born from a different species.

Me, I want the respect of people I respect, the people I think are right. Why would I pander to the wrong? Ugh.

I know what you mean, and can't think of any times I've done this off the top of my head (although I have stayed silent or rephrased in order to ensure conversations continued.)

However.

Respect can be instrumentally useful. Don't be too quick to declare some Dark Art taboo, you may need it someday.

(Not saying you were, of course.)

I'm all Dark Arts as instrumental rationality, in theory.

In practice, there are certain activities I just find revolting. The bile rises. Catering to nitwittism in others is one of those activities. And I don't wish to send a signal to those who might matter to me that I'm both dishonest and a slave to the opinions of the herd, or to them. People who would be attracted to that signal are not people I wish to attract.

Rationalists should win. If your goals are better fulfilled by not playing with the herd, then don't play with the herd. If your goals are better fulfilled by playing with the herd, then play with the herd.

With regard to signalling, the problem is that people can't read your thoughts. They see that you do X, but they don't know why you do X. They can often make a wrong hypothesis. It's not like when you do X, there is a bubble above your head saying "I am doing X because of Y". For example you may refuse to participate in some group activity for ethical or philosophical reasons, but others may think you simply lack the required skills. So there is a risk that these signals will be weaker or different than you imagine.

Also, being a "slave to the opinions of the herd" has unnecessarily bad connotations. To balance this, the word "minister" originally meant "servant"; and the Catholic Pope calls himself "Servant of the servants of God". So, some kinds of servitude are actually positions of great power. If for some instrumental purposes you want to lead the herd, you must understand the herd and avoid violating their assumptions, because if you act too strangely, the herd will not follow. There is an NLP technique called "pacing and leading", which means you start by following and then gradually switch to leading, and if you manage not to break the mood, people will follow to preserve the perceived status quo.

I am not trying to convince you to do things you find revolting. If you really find them revolting, you would probably do them wrong even if you tried. Just saying that from a different point of view, they don't have to be revolting.

The comment shouldn't make you feel like a weirdo. I made the comment, and I fully agree with your opinions. I don't think I have ever feigned beliefs for approval, and in the rare cases where I keep quiet and don't express opposing beliefs, I feel horrible. However I do think this is a huge issue for the rest of the world, and less wrong has a lot of opinions of the form "X is obvious and if you don't believe X you are crazy" which makes me worry that this might be an issue for some here. Also, your comment mostly just says 3 is bad. I think the more interesting question is probably 1 or 2.

Feigning beliefs for approval is something I suppose I've engaged in a time or two, and to the best of my recollection, quickly felt the need for a shower

I agree - there is rarely conscious feigning of belief for want of benefits. People who do this are actually "evil" mutants. A neurotypical human would feel guilty; even neurotypical humans who don't actually care about truth feel guilty when they lie.

in the rare cases where I keep quiet and don't express opposing beliefs, I feel horrible. However I do think this is a huge issue for the rest of the world

The rest of the world are not evil mutants - see here. They behave as if they are, but they are operating using similar cognitive machinery as yourself. So here are some alternative explanations for the behavior:

1) There is often withholding beliefs for fear of social repercussions. Lies by omission are less guilt inducing.

2) There is often being sincerely convinced (or sincerely uncertain about something you would be certain about) for the wrong reason - because your social group believes it. This is most common for beliefs that have no practical consequences, or beliefs for which the practical consequences are non-obvious.

3) There is often belief-in-belief (thinking that you believe in accordance with your social group, but then not acting that way and failing to see the contradiction). Basically self-deception: there is an instinctive imperative not to lie; it doesn't matter if we make that an explicit norm.

It's hard to actually notice when you are doing these things.

But I think you're still correct that changing the social incentive structures would change the behavior.

The rest of the world are not evil mutants

I disagree. Oh, they're not so evil, but they may as well be a different species.

You know how Haidt has different moral modalities, with different people putting consistently different weights to those modalities? I think the same thing occurs with truth modalities. For some people, the truth is whatever is socially useful. To such a creature, values that I hold dear such as honesty and knowledge are simply alien and irrelevant issues. To me, they are Clippy, looking to turn me into a paperclip. And the bad news for me is that on this planet, I'm the weirdo alien from another planet, not them.

I'm saying that people do value honesty, but can't pursue it as a value effectively because of faulty cognitive machinery, poor epistemic skills, and a dislike of admitting (even to themselves) that they are wrong. I think that the Lesswrong / rationalist / skeptic community tends to be comprised of folks with superior cognitive machinery and epistemic skills in this dimension. When people say "I value honesty" they believe that they are speaking honestly, even if they aren't entirely sure what truth means.

As I see it, you're saying that people do not value honesty and purposefully choose to ignore it in favor of other, more instrumental values. And you extend this trait to the Lesswrong / rationalist / skeptic community as well. When people say "I value honesty", in their mind they know it to be a lie but do not care. If they were to ever say "I consider truth to be whatever is socially useful", in their mind they would believe that this is an honest statement.

Both our hypotheses explain the same phenomenon. My mental disagreement flowchart says that it is time to ask the following questions:

0) Did I state your point and derive its logical implications correctly? Do you find my point coherent, even if it's wrong?

1) Do you have evidence (anecdotal or otherwise) which favors your hypothesis above mine?

(My evidence is that neurotypical humans experience guilt when being dishonest, and this makes being dishonest difficult. Do you dispute the truth of this evidence? Alternatively, do you dispute that this evidence increases the likelihood of my hypothesis?)

2) Do you stake a claim to parsimony? I do, since my hypothesis relies entirely on what we already know about biases and variations in the ability to think logically.

1) There is often withholding beliefs for fear of social repercussions. Lies by omission are less guilt inducing.

This is exactly the phenomenon that I was trying to say is a big problem for a lot of people. I do not think that the rest of the world directly lies that much, but I do think they lie by omission a lot because of social pressure.

When I talk about dishonest signalling with beliefs, I am including both lies by omission like in 1, and subconscious lies like 3. 2, however, is an entirely different issue.

less wrong has a lot of opinions of the form "X is obvious and if you don't believe X you are crazy"

This strikes me as a problem of presentation more than anything else. I've had computer science professors whose lecture style contained a lot of "X is obvious and if you don't believe X you are crazy" -- which was extremely jarring at first, as I came into a CS graduate program from a non-CS background, and didn't have a lot of the formal/academic background that my classmates did. Once I brought myself up to speed, I had the methods I needed to evaluate the obviousness and validity of various Xs, but until then, I sure didn't open my mouth in class a lot.

In the classes I TAed, I strove for a lecture style of "X is the case and if you don't understand why then it's my job to help you connect the dots." That was for 101-level classes, which LW is, at least to my impression, not; if the Sequences are the curriculum for undergraduate rationality, LW is kind of like the grad student lounge. But not, like, a snooty exclusive one, anyone's welcome to hang out and contribute. Still, the focus is on contribute, so that's at least perceived social pressure to perform up to a certain standard -- for me it's experientially very similar to the grad school experience I described.

We're talking about opinions here rather than theorems, and there's a distinction I want to draw between opinions that are speculations about something and opinions that are ... personal? ... but I'm having trouble articulating it; I will try again later, but wanted to point out that this experience you describe generalizes beyond LW.

I don't think this is a very big problem: an additional factor is that people are uncertain about what the community believes and what expressed beliefs harm reputation. They may expect others to agree with them (the "silent majority"), or at least to be open to being convinced.

People also vary in what kind of reputation they care about (some love being the Daring Rebel Speaking Truth To Power); so between dissenters that don't know they're dissenters, and those who don't care, enough different points of view should be introduced.

The comments on this post went in a direction that I was not intending. When I wrote this post, I wasn't thinking about Less Wrong in particular, but instead I thought that this was a universal phenomenon. I was trying to make the point that under any circumstances you must use at least one of the three options to some extent, and that this was sad because all three have their own unique downsides.

In a typical situation, the usual strategy is to have a few close friends you are honest with, and with all other people don't ruin the illusion that you share the group beliefs. Do the minimum that is required to avoid suspicion. Pick your battles carefully.

Paul Graham wrote about this: What You Can't Say, Re: What You Can't Say, Labels.

(2) We can agree to always report honest beliefs even though we know it will cost us reputation.

I don't think reputation is the only concern when it comes to revealing beliefs. If I make an argument to convince people of A I don't have to also say B.

Even if I don't believe that an AGI will be discovered in the next 100 years I can still participate in a discussion about AI safety and assume for the sake of the discussion that an AGI will be discovered in the next 100 years.

It's interesting that you think the problem with 2 is that it encourages self-deception. I would think the problem would be the loss of reputation.

Anyway, of the three you presented, I choose 2. For as long as I can remember, I've been speaking honestly about my beliefs, which is part of the reason my reputation hasn't been very high throughout much of my life. I still think 2 is the best option, though I'm attempting to change how I express my beliefs to reduce the cost to my reputation.

It's interesting that you think the problem with 2 is that it encourages self-deception.

By the way, so did Eliezer:

I worry that Radical Honesty would selectively disadvantage rationalists in human relationships. Broadcasting your opinions is much easier when you can deceive yourself about anything you'd feel uncomfortable saying to others. I wonder whether practitioners of Radical Honesty tend to become more adept at self-deception, as they stop being able to tell white lies or admit private thoughts to themselves. I have taken a less restrictive kind of honesty upon myself - to avoid statements that are literally false - and I know that this becomes more and more difficult, more and more of a disadvantage, as I deceive myself less and less.

I think that if there is ever a vow of honesty among rationalists, it will be restricted in scope. Normally, perhaps, you would avoid making statements that were literally false, and be ready to accept brutal honesty from anyone who first said "Crocker's Rules". Maybe you would be Radically Honest, but only with others who had taken a vow of Radical Honesty, and who understood the trust required to tell someone the truth.

The risk is probably not conscious insincerity, but simply our "elephants" betraying us. Whenever you honestly say an unpopular thing, you feel the social punishment, and you are conditioned against the whole causal chain that resulted in this event. Some parts of the chain, such as deciding to speak openly, you consciously refuse to change. So your "elephant" may change some previous part, such as believing something that is against the group consensus, or noticing that your beliefs are different and you should comment on that.

To prevent this, we would also need a policy to not punish people socially for saying unpopular things. But if we make this a general rule, we could die by pacifism. So... perhaps doing this only in specific situations, with people already filtered by some other process?

As an individual the biggest problem is the loss of reputation. However, if the community as a whole agrees to follow strategy 2, that should cancel out.

I don't want to propose a solution without some idea of how common a problem this is, because that affects which one is optimal.

How many beliefs (if drawing lines between separate beliefs is an issue, call them separate if you have separate reasons for disagreeing with the perceived popular opinion on LessWrong on them) do you hold but claim not to hold for the purpose of signaling epistemic credibility on LessWrong? (Don't rationalize "I only agreed under pressure that one time." If, most of the times you have been asked what you believe, you lied, include that belief in this count.) [pollid:554]

How many beliefs do you hold, and not lie about, but also avoid discussing on LessWrong for the same purpose? (Beliefs you do not discuss because they are rude is a separate issue) [pollid:555]

Make sure you don't accidentally uncheck the anonymity thing.

I try to never truly lie, but I do sometimes not say all that I think, or abstain from discussion, or say a watered-down version of what I believe, because I hold a belief that will cost me reputation in a given community, and that applies on LW too.

But it's hard to quantify that as a number, because beliefs are interconnected. Usually in a given community, most of the beliefs I don't expose too openly are linked together into a set of entangled core beliefs. Like transhumanism in my family circle. Obviously I won't say on LW ;)

I'm glad you said "a watered-down version of what I believe." I've used that strategy on LW to avoid having my arguments dismissed on account of guilt-by-association with some pet peeve of my audience. I will go out of my way to avoid sounding like a postmodernist, say, even on a particular issue whereupon what some postmodernist has said is dead right and exactly to the point.

Upvoted for giving a specific example of an issue where these problems occur.

I think the data from this poll will be biased towards showing that belief signaling is not an issue, because I think a lot of it is subconscious. People think they actually believe the beliefs they use as signals. I think this is the primary problem of religion.

Using "Karma" as a stand in for Epistemic credibility, I definitely have thoughts/beliefs which I then avoided discussing on Less Wrong, but the way I viewed that was that I simply didn't have time to properly research/structure my comments on them to seem good enough to gain points or stay neutral.

(As a side note, I had other thoughts/beliefs about this post that I felt I didn't have time to properly research/structure: Edit: And I ended up posting them below this line.)

But I suppose another way of thinking about this is "I'm afraid this belief might be wrong AND seem over confident, and other people will judge me for that. Let me try to make sure this post is more solid and has all necessary caveats before saying anything." I don't think that's the same thing. But I guess it is possible that I'm lying to myself about that somehow?

For instance, here is a belief I would have ordinarily censored in regards to this, If I weren't already kind of being Meta:

"This hurts me. I tend to incorrectly interpret this as being worried about being wrong at all times and I'm continually on the lookout for harsh reprimands or worse when none should be reasonably expected (or they should be expected but should be brushed off), to the point where I recently had to get my doctor to boost my dose of anti-anxiety pills."

But NOW I'm afraid that I'm making that more true by acknowledging it out loud, whereas, if I DIDN'T post that, my thoughts on the matter would be slightly different?

And honestly, at this point I would ordinarily think "I'm lost in my own head. I should discard most of this as gibberish and move on."

Except that the entire point of this was trying to honestly consider my own thought processes, so I probably should post this even if it seems stream of consciousnessy.

...So when my thoughts are like that, it's really hard to boil them down to a number.

There are beliefs that I generally don't talk about on Less Wrong, but only because the contexts in which it would be relevant to bring them up are rare.

I can't think of any beliefs which I avoid talking about here because holding them would violate LW norms or signal low status in this context. However, I can think of several beliefs which I avoid talking about because they belong to domains that are considered harmful or disruptive to bring up.

Typically, those domains hold that status because we've tried talking about them and it hasn't ended well.

[This comment is no longer endorsed by its author]

(Beliefs you do not discuss because they are rude is a separate issue)

What do you mean by rude?

Beliefs where the transgression that reduces status is saying them, not believing them. (Distinct from saying them being used as evidence of the transgression of believing them). If someone read your mind and discovered them, you would not have committed the transgression with a rude belief, but you would with a stupid belief.

A third category (which I think you might think the belief you're trying to classify fits into) is where it's believing it that is the transgression, not because it's believed to be wrong, losing you epistemic credibility, but because the belief is repulsive, losing you another kind of status. If losing status besides epistemic credibility is your motivation for avoiding discussing this belief, do not include it. However, if you believe that people's disgust for your belief will cause you to lose epistemic credibility in their eyes as a side effect, and that is your motivation (by this I mean it's a necessary condition) for not discussing it, include it.

(4) See people who are willing to honestly report beliefs contrarian to the group in which they are expressed as high status.

I don't know what you mean by this.

If you have a problem that people lose reputation for advocating out-group beliefs, then it makes sense to reward people for voicing those beliefs.

So will you reward me for posting my highly contrarian belief that the earth is flat? ;)

So will you reward me for posting my highly contrarian belief that the earth is flat? ;)

Given my priors I wouldn't think that you were honestly reporting a belief if you claimed that the earth is flat.

I'm glad you brought this up - I hadn't known what to call it, but I also sensed an issue along these lines. (What I observed was that I feel social pressure not to deviate too far from what most members of the site believe. This does not necessarily mean anyone is trying to exert this social pressure, or even that they are doing anything in particular other than voicing similar beliefs.)

I think this will take more than just agreeing on a policy for assigning reputation or voicing opinions, though. My inner homo erectus will try to protect me from reputation damage regardless of whether reputation damage could actually occur, and there are probably also unconscious factors influencing how I perceive others' social status.

But maybe we can hack this somehow? How can we artificially influence our feelings about social status to counterbalance our natural tendencies? (Eating a cookie immediately after saying things I was afraid to say is something I've tried in the past, in a different context. That doesn't seem like it would work as a community-wide norm, but an individual could try doing it themselves.)

I absolutely agree that if this model is correct, it is mostly subconscious, and the solutions I suggested were from a very non-human agent standpoint, but I think the 3 solutions are all possible even for humans, especially in an online community.

A fourth option, specific to online communities, could be something like this: allow members to post their comments anonymously, under the following conditions:

Only users with enough karma can post anonymously. (This is to prevent abuse by trolls.)

You pay some karma for posting anonymously, a constant amount that does not depend on whether people agree or disagree with you. (To prevent abuse by serious members.)

If you comment anonymously again in the thread below your original anonymous comment, you don't pay karma again, and the website shows that this is the same anonymous person, for example by using "Anonymous#1". (So, for example, people can ask for clarifications and you can explain freely.)

The anonymous comments can also be upvoted and downvoted, and hidden if they are downvoted too heavily, but their karma is not reflected on their user. Also, the social effect is that other people don't know who you are, even if they downvote you.
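A minimal sketch of how the rules above might fit together, assuming a hypothetical karma threshold and flat fee (the constants, class names, and method names are all illustrative and are not taken from any real forum software). As a simplification, the alias is tracked per thread rather than per reply chain.

```python
from dataclasses import dataclass, field

MIN_KARMA_TO_POST_ANON = 100  # assumed threshold that keeps trolls out
ANON_FEE = 5                  # assumed flat fee, independent of later votes


@dataclass
class User:
    name: str
    karma: int


@dataclass
class Thread:
    # Maps a real user name to a per-thread alias such as "Anonymous#1".
    aliases: dict[str, str] = field(default_factory=dict)
    # Each comment stores only the alias, so votes cannot reach the real account.
    comments: list[tuple[str, str]] = field(default_factory=list)

    def post_anonymously(self, user: User, text: str) -> str:
        if user.name not in self.aliases:
            # First anonymous comment in this thread: check karma and charge the fee once.
            if user.karma < MIN_KARMA_TO_POST_ANON:
                raise PermissionError("not enough karma to post anonymously")
            user.karma -= ANON_FEE
            self.aliases[user.name] = f"Anonymous#{len(self.aliases) + 1}"
        alias = self.aliases[user.name]
        self.comments.append((alias, text))
        return alias
```

In this sketch a follow-up comment by the same user in the same thread reuses the alias and costs nothing further, which is what lets readers ask for clarification and get an answer from the same anonymous person.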

I don't know how useful this would be. The feeling of being downvoted could still be unpleasant. But it would be some balance between the ability to write anonymously, and possible troll abuse.

Can't pretty much anybody comment anonymously with the system as it is, without the automatic karma penalty but otherwise behaving as above, by creating an account called "anonymous#1" (or whatever version of that hasn't been taken, or "sockpuppet17", or whatever)?

Sure, if it's too far downvoted the comment will be hidden and the account silenced, but the comment has still been made anonymously. And granted, this doesn't allow for a continuing conversation if the initial comment is heavily downvoted, but I think that's considered a feature here.

I considered making a throwaway recently, but account setup now requires email verification, and I'm sufficiently paranoid that this was enough of an obstacle to stop me from doing so.

Given how easy it is to create throw away (Edit: e-mail) accounts, this is a trivial inconvenience.

Can you explain what you're trying to express with this comment?

Something like mailinator should suffice for LW throwaway anonymity.

I have a suspicion that your option number 2 is already baked pretty deep into actual humans' psychologies.

Of course, your measure of the intelligence of the members is biased towards people who say things that you agree with.

This isn't actually necessary for the trilemma. Suppose a lot of people hold belief X for irrational reasons, whereas initially rationalists are split between X and ~X. Then holding belief ~X is evidence that you are of above-average rationality. Note that this is true even for rational observers who believe X.
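The evidential claim can be checked with Bayes' rule. The numbers below are purely illustrative assumptions: 20% of the population are rationalists, rationalists are split evenly between X and ~X, and 90% of everyone else holds X.

```python
def posterior_rationalist(prior: float,
                          p_belief_given_rationalist: float,
                          p_belief_given_other: float) -> float:
    """Bayes' rule: P(rationalist | stated belief)."""
    joint_rationalist = prior * p_belief_given_rationalist
    joint_other = (1 - prior) * p_belief_given_other
    return joint_rationalist / (joint_rationalist + joint_other)


PRIOR = 0.2                # assumed share of rationalists in the population
P_NOT_X_GIVEN_RAT = 0.5    # rationalists split evenly between X and ~X
P_NOT_X_GIVEN_OTHER = 0.1  # assumed: most others hold X for irrational reasons

after_not_x = posterior_rationalist(PRIOR, P_NOT_X_GIVEN_RAT, P_NOT_X_GIVEN_OTHER)
after_x = posterior_rationalist(PRIOR, 1 - P_NOT_X_GIVEN_RAT, 1 - P_NOT_X_GIVEN_OTHER)
```

With these assumed numbers, hearing ~X raises the probability that the speaker is a rationalist from 0.2 to roughly 0.56, while hearing X lowers it to roughly 0.12. The observer's own opinion about X does not enter the calculation.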

[Option 2] is bad because it encourages self-deception.

Only if each person cares about their own reputation, which is the case for “actual humans” but not for “ideal agents”.

We don't have enough coordination to implement (1) or (2). (3) is going to happen, but we can mitigate the bias by keeping in mind that the consensus we see may be fake (and how much it might be fake). There's still the problem of potentially worthwhile opinions never coming to our attention. Maybe if your real opinion is worthwhile, you should use a sock puppet account to post it? (Email addresses are easy to get.)

On the other hand, maybe you expect to lose epistemic credibility for your real opinion because it is actually not worthwhile, in which case encouraging you to post it on a sock puppet account is a bad idea. Maybe there should be a "secret opinions thread" where people posted what they really believed with sock puppet accounts. Or we could just use another site like Reddit.

I want to add that I think that all three of these options are legitimate enough to consider, and I think that there are people who follow all of them. I don't always succeed, but I feel a moral push and try very hard to live my life using strategy 2, at least in situations that allow for an actual debate between the sides. However, I know people who I believe would say they think everyone should follow strategy 1, and I think that most people follow strategy 3 most of the time without thinking about it at all.