Companion to "The Hostile Telepaths Problem" (by Valentine)
Epistemic status: This is my own work, though I asked Valentine for feedback on an early draft. I'm confident that the mechanisms underlying the Hostile/Friendly Telepath dynamic are closely related. The friendly-telepath problems themselves seem well-established. I am less confident, or at least can't make as strong a claim, about the relation to selfhood/the self.
Valentine's "Hostile Telepaths" is about what your mind will do when you have to deal with people who can read your emotions and intentions well enough to discover when you are not thinking what they want you to think, and you have to expect punishment for what they find. In such a scenario, your mind will make being-read less dangerous in one of multiple ways, for example, by warping what you can discover in yourself and know about your own intentions.
If that doesn't seem plausible to you, I recommend reading Valentine's post first. Otherwise, this post mostly stands on its own and describes a different but sort of symmetric or opposite case.
"Telepathy," or being legible for other people, isn't only a danger. It also has benefits for collaboration. As in Valentine's post, I mean by "telepath" people who can partly tell if you are being honest, whether you are afraid, whether you will stick to your commitments, or in other ways seem to know what you are thinking. I will show that such telepathy is part of a lot of everyday effective cooperation. And beyond that, I will ask: What if a lot of what we call "having a self" is built, developmentally and socially, because it makes cooperation possible by reducing complexity? I will also argue that Valentine's Occlumency, i.e., hiding your intentions from others (and yourself), is not only a defensive hack. It can also function as a commitment device: it makes the conscious interaction between such telepaths trustworthy.
Two sides of one coin
In the hostile-telepath setting, the world contains someone who can infer your inner state and who uses that access against you.
That creates pressure in two directions: You can reduce legibility to them by using privacy, misdirection, strategic silence, or other such means. Or you can reduce legibility even to yourself when self-knowledge is itself dangerous. If I can't know it, I can't be punished for it.
Valentine's post is mostly about the second move: the mind sometimes protects itself by making the truth harder to access.
But consider the flip side: Suppose the "telepath" is not hunting for reasons to punish you. Suppose they're a caregiver, a teammate, a friend, a partner, or someone trying to coordinate with you. Then, being legible is valuable. Not because everything about you must be transparent, but because the other person needs a stable, simple, efficient way to predict and align with you:
Is he hungry or scared?
Does she mean yes?
Will he be there at 7 PM?
Is this a real preference or a polite reflex?
A big part of "becoming a person" is becoming the kind of being with whom other people can coordinate. At first, caregivers, then peers, then institutions.
The interface
If you squint, a self is an interface to a person. Here is a nice illustration by Kevin Simler that gives the idea:
But Kevin Simler is talking about the stable result of the socialization process (also discussed by Tomasello[1]). The roles/masks we are wearing as adults, whether we are aware of them or not. I want to talk about the parts prior to the mask. The mask is a standardized interface, but the pointy and gooey parts are also a lower-level interface.
The origin of the interface
I have a four-month-old baby. She can't speak. She has needs, distress, curiosity, but except for signals of comfort or discomfort, she has no way to communicate, much less to negotiate.
[Image: my 4-month-old daughter, slightly GPT-edited for anonymity]
I can't coordinate with "the whole baby." I can't look into its mind. I can't even remember all the details of its behavior. I can only go by behavior patterns that are readable: different types of crying, moving, and looking. New behaviors or ones I already know (maybe from its siblings).
Over time, the baby becomes more legible. And babies are surprisingly effective at it[2]. But how? Not by exposing every internal detail, but by developing stable handles that I or others can learn:
A baby shows distinguishable signals: I can recognize different cries and movements when it is tired vs hungry vs in pain vs bored (the latter is especially common).
It develops increasingly consistent preferences. Not so much for toys yet, but it never took the pacifier, for example. This includes getting bored by or being interested in specific things.
Anticipation and collaboration begin to show: If I diaper it, it will be calm.
Eventually, children will be able to make and hold commitments: "I will do it."
So I interpret its behaviors in a form that is legible to me. I have a limited number of noisy samples and can't memorize all details, so I compress into something I can handle ("it likes bathing in the evening"), with all the issues that brings ("oops, it likes bathing when X and Y, but not Z", with many conditions I don't know).
Vygotsky[3] describes how this interpretation of children's behavior gives rise to the interface through mutual reinforcement.
From the outside, the baby starts to look like "a person." From the inside, I imagine, it starts to feel like "me."
Mandatory constraints
It is in the interest of the baby to be legible, and so it is for people. If other people know our interface, we can cooperate more effectively. It also seems that more information and less noise would give a clearer reading and fewer errors when compressing each other's communication. This may sound like an argument for radical transparency or radical honesty: if legibility is good, why not make everything transparent to others and also to yourself? But consider: what happens if you could edit everything on demand? The interface stops being informative. What makes coordination possible is constraint. A partner can treat a signal as evidence only if it isn't infinitely plastic. Examples:
If you could instantly rewrite fear, then "I'm scared" becomes less diagnostic and more negotiational.
If you could rewrite attachment on command, then "I care about you" is no longer a commitment.
If you can always generate the perfect emotion on cue, then sincerity becomes hard to distinguish from performance.
So some opacity doesn't just hide - it stabilizes. It makes your actions real in the boring sense: they reflect constraints in the system, not only strategic choices.
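One way to make "a signal is evidence only if it isn't infinitely plastic" concrete is a toy Bayesian update. This is only a sketch under assumptions of my own: the function name `posterior`, the binary "scared / not scared" state, and all the probabilities are made up for illustration and are not measurements of anything.

```python
def posterior(prior: float, p_signal_if_true: float, p_signal_if_false: float) -> float:
    """Bayes' rule for a binary state: P(state is real | signal observed)."""
    evidence = prior * p_signal_if_true + (1 - prior) * p_signal_if_false
    return prior * p_signal_if_true / evidence


prior = 0.2  # assumed prior that the person really is scared

# Constrained signal: hard to produce unless the state is real.
constrained = posterior(prior, p_signal_if_true=0.9, p_signal_if_false=0.1)

# Freely editable signal: equally easy to produce in either state.
plastic = posterior(prior, p_signal_if_true=0.9, p_signal_if_false=0.9)

print(f"constrained signal: {constrained:.2f}")
print(f"plastic signal:     {plastic:.2f}")
```

With these made-up numbers, the constrained signal moves the listener well above the prior, while the freely editable one leaves the listener exactly where they started: "I'm scared" carries no information if it can be produced equally well in either state.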
All things being equal, we prefer to associate with people who will never murder us, rather than people who will only murder us when it would be good to do so - because we personally calculate good with a term for our existence. People with an irrational, compelling commitment are more trustworthy than people compelled by rational or utilitarian concerns (Schelling's Strategy of Conflict)
-- Shockwave's comment (emphasis mine). See also Thomas C. Schelling's "Strategy of Conflict".
So opacity to ourselves not only functions as a defence against hostile opponents but also enables others to cooperate with us. As long as we don't know why we consistently behave in a predictable way, we can't systematically change it. Opacity enables commitment.
But not all types of opacity or transparency are equal. When people argue about "transparency," they often conflate at least three things:
Opacity to others: Privacy, boundaries, and omission (strategic information hiding).
Opacity to self: You don't fully know what's driving you; the processes and inputs that give rise to your thoughts are at least partly hidden.
Opacity to will: You can't rewrite your motivations on demand.
What is special about cooperation here is that you can want more legibility at the interface while still needing less legibility in the implementation.
A good team doesn't need to see every desire. It needs reliable commitments and predictably named states.
Failure modes
Valentine discusses some ways self-deception can go wrong; for example, it can mislabel the problem as a “personal flaw” or show up as akrasia, mind fog, or distraction. We should also expect the reverse direction, the coordination interface, to have failure modes. Which ones can you think of? Here are four illustrations for common failure modes. Can you decode which ones they are before reading on?
Performative transparency
In A non-mystical explanation of insight meditation... Kaj writes:
I liked the idea of becoming "enlightened" and "letting go of my ego." I believed I could learn to use my time and energy for the benefit of other people and put away my 'selfish' desires to help myself, and even thought this was desirable. This backfired as I became a people-pleaser, and still find it hard to put my needs ahead of other people's to this day.
You become legible, but the legibility is optimized for being approved of, not for being true.
The tell is if your "self" changes with the audience; you feel managed rather than coordinated.
Rigidity dressed up as authenticity
In When Being Yourself Becomes an Excuse for Not Changing Fiona writes:
a friend breezed into lunch nearly an hour late and, without a hint of remorse, announced, “I’m hopeless with time.” Her casual self-acceptance reminded me of my younger self. From my current vantage point, I see how easily “that’s just who I am” becomes a shield against any real effort to adapt.
Sometimes "this is just who I am" is a boundary. Sometimes it's a refusal to update. Everybody has interacted with stubborn people. This is a common pattern. But you can tell whether it is adaptive by whether it leads to better cooperation. Does it make you more predictable and easier to cooperate with, or does it mainly shut down negotiations that might (!) overwhelm you?
The rule of equal and opposite advice applies here. There is such a thing as asserting too few boundaries. For a long time, I had difficulties asserting my boundaries. I was pretty good at avoiding trouble, but it didn't give people a way to know and include my boundaries in their calculations - and in many cases where I avoided people and situations, they would have happily adapted.
The internal hostile telepath
In Dealing with Awkwardness, Jonathan writes:
Permanently letting go of self-judgement is tricky. Many people have an inner critic in their heads, running ongoing commentary and judging their every move. People without inner voices can have a corresponding feeling about themselves - a habitual scepticism and derisiveness.
So you can develop an inner critic: an internalized role, maybe of a parent or teacher, that audits feelings and demands that you "really mean it."
Then you're surveilled from the inside. I think this shows up as immature defenses[4], numbness or theatricality.
Dependency on mirrors
In What Universal Human Experiences are You Missing without Realizing it? Scott Alexander recounts:
Ozy: It took me a while to have enough of a sense of the food I like for “make a list of the food I like” to be a viable grocery-list-making strategy.
Scott: I’ve got to admit I’m confused and intrigued by your “don’t know my own preferences” thing.
Ozy: Hrm. Well, it’s sort of like… you know how sometimes you pretend to like something because it’s high-status, and if you do it well enough you _actually believe_ you like the thing? Unless I pay a lot of attention _all_ my preferences end up being not “what I actually enjoy” but like “what is high status” or “what will keep people from getting angry at me”
A benevolent friend can help you name what you feel. But there's a trap: outsourcing selfhood and endorsed preferences.
Do you only know what you want after someone (or the generalized "other people") wants it too?
If you want to balance the trade-offs between being legible and opaque, what would you do? Perhaps:
Be predictable without being exposed.
Be legible in commitments; be selectively private about details.
Prefer "I don't know yet" over a confident story that doesn't fit.
Some contexts are structurally hostile; don't try to win them with more openness.
Use rituals that don't demand instant inner conformity. Sometimes "I'm sorry" is the appropriate response, even if the emotion is lagging.
What does the latter mean?
"I like Italian food" is legible but doesn't help with choosing a restaurant. Sometimes you have to tell more. But often the details just add noise.
What I can't answer is the deeper question: What is a stable way of having a "self as an interface" that is stable under both coordination pressure (friendly telepaths) and adversarial pressure (hostile telepaths)? Bonus points if you can also preserve autonomy and truth-tracking.
[1] Tomasello (2014), A Natural History of Human Thinking, page 4:
how humans put their heads together with others in acts of so-called shared intentionality, or "we" intentionality. When individuals participate with others in collaborative activities, together they form joint goals and joint attention, which then create individual roles and individual perspectives that must be coordinated within them (Moll and Tomasello, 2007). Moreover, there is a deep continuity between such concrete manifestations of joint action and attention and more abstract cultural practices and products such as cultural institutions, which are structured - indeed, created - by agreed-upon social conventions and norms (Tomasello, 2009). In general, humans are able to coordinate with others, in a way that other primates seemingly are not, to form a "we" that acts as a kind of plural agent to create everything from a collaborative hunting party to a cultural institution.
[2] Csibra & Gergely (2009), Natural Pedagogy, Trends in Cognitive Sciences, pages 148–153:
We propose that human communication is specifically adapted to allow the transmission of generic knowledge between individuals. Such a communication system, which we call ‘natural pedagogy’, enables fast and efficient social learning of cognitively opaque cultural knowledge that would be hard to acquire relying on purely observational learning mechanisms alone. We argue that human infants are prepared to be at the receptive side of natural pedagogy (i) by being sensitive to ostensive signals that indicate that they are being addressed by communication, (ii) by developing referential expectations in ostensive contexts and (iii) by being biased to interpret ostensive-referential communication as conveying information that is kind-relevant and generalizable.
[3] Vygotsky (1978), Mind in Society, page 56:
We call the internal reconstruction of an external operation internalization. A good example of this process may be found in the development of pointing. Initially, this gesture is nothing more than an unsuccessful attempt to grasp something, a movement aimed at a certain object which designates forthcoming activity. The child attempts to grasp an object placed beyond his reach; his hands, stretched toward that object, remain poised in the air. His fingers make grasping movements. At this initial stage pointing is represented by the child's movement, which seems to be pointing to an object - that and nothing more.
When the mother comes to the child's aid and realizes his movement indicates something, the situation changes fundamentally. Pointing becomes a gesture for others. The child's unsuccessful attempt engenders a reaction not from the object he seeks but from another person. Consequently, the primary meaning of that unsuccessful grasping movement is established by others. Only later, when the child can link his unsuccessful grasping movements to the objective situation as a whole, does he begin to understand this movement as pointing. At this juncture there occurs a change in that movement's function: from an object-oriented movement it becomes a movement aimed at another person, a means of establishing relations. The grasping movement changes to the act of pointing.
[4] See Defence mechanism. A writeup is in Understanding Level 2: Immature Psychological Defense Mechanisms. I first read about mature and immature defenses in Aging Well.