Löbian emotional processing of emergent cooperation: an example

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Epistemic status: my opinion based on a mix of math, reflection, and speculation; not backed up by any systematic psychological studies.

Summary: Since my 2019 paper generalizing Löb's Theorem, a couple dozen people have asked me if the way humans naturally cooperate might be well-described by Löb's Theorem.  In short, my answer is probably sometimes, and in this post I'll try using an example to convey what that means.  Importantly, Löb's Theorem is not a theorem when applied to real-world humans and emotions — i.e., when its hypotheses are met, its conclusion is only sometimes true.  Nonetheless, the reasoning pattern in its proof (I claim) sometimes genuinely occurs at the level of intuition in real people, whether or not they know any math or Löb's Theorem.

Introduction

There are at least two real-world patterns that could reasonably be called Löbian cooperation in humans, which I'll name here:

1. Functionally Löbian cooperation.  Sometimes people become aware that they're anticipating (predicting) cooperation from each other, and then that anticipation causes them to cooperate, rendering the anticipation itself valid.  In this pattern, the fact that anticipation of cooperation will cause cooperation is analogous to the hypothesis (main assumption) of Löb's Theorem, and the fact that the cooperation in fact emerges is analogous to the conclusion of Löb's Theorem.  I call this pattern "functionally" Löbian because its input and output resemble the input (hypothesis) and output (conclusion) of Löb's Theorem.

2. Procedurally Löbian cooperation.  Sometimes the mental procedure a person follows to anticipate and decide upon cooperation can resemble an entire proof of Löb's Theorem, as I'll describe below.  In other words, instead of just the hypothesis and conclusion of Löb's Theorem matching reality, the structure in the intermediate steps of the proof also matches reality, at least somewhat.  I call this "procedurally" Löbian cooperation, and it's a special case of functionally Löbian cooperation because it demands a stronger analogy between the theorem and the real world.  Illustrating how this might work constitutes the bulk of the content in this post.

What functionally Löbian cooperation feels like

For those who recognize the symbols involved, Löb's Theorem says that if $\vdash \Box C \to C$ then $\vdash C$.  I don't plan to use these symbols with their normal meanings in the rest of this post, so don't worry if you don't recognize them.

In words, functionally Löbian cooperation happens when anticipation of future or unobserved cooperation causes present cooperation.  So if you're interacting with someone, and you feel like they're probably going to be nice to you in the future, and that fact makes you decide to be nice to them now, I call that functionally Löbian cooperation.

What procedurally Löbian cooperation feels like

Most human cooperation is probably not procedurally Löbian, and maybe not even functionally Löbian.  However, I'm confident that human cooperation is sometimes procedurally Löbian, and I can even point to experiences of my own that fit the bill.  To explain this, I'll be talking a lot more about feelings, because I think most unconscious processing is carried out by and/or experienced as feelings.  I'll write

• Feeling("Pigs can probably fly.")

for the feeling that pigs can probably fly.  Such a feeling can be true or false, according to whether it correctly anticipates the real world.

In procedurally Löbian cooperation, part of the mental process will involve first feeling something uncertain to do with cooperation, then believing it, and then feeling like that belief "checks out".  To abbreviate this, when $X$ denotes a feeling, I'll write $\Box X$ for the meta-feeling that "$X$ checks out".  The process of a feeling "checking out" might be different from person to person.  Just assume the person has some way of checking over a feeling $X$ and then feeling like $X$ has been verified or validated in some way (or not).  For each step of the procedurally Löbian cooperation process below, I'll speculate about an unconscious logical operation that feels pretty analogous to the feeling, for me.

Here goes...

1. Conscious experience:
I'm hanging out with a group, and I'm wondering if we're going to end up doing some cooperative thing together.  Let

$C$ = Feeling("We're all going to do the cooperative thing together.")

Here, $C$ is a feeling with some content described by the string inside the Feeling(), and that content means that $C$ can also be viewed as a true or false anticipation about the world.  I wonder if it's true.

2. Conscious experience:
The group is starting to feel pretty wholesome to me now, and I have a feeling $\Psi$ that feels like it's about the group.  The feeling $\Psi$ itself feels kind of humble, sort of like it's leaving a lot of room for disagreement with it, by hypothesizing about its own validity rather than directly asserting it.  Roughly speaking, it feels like:

$\Psi$ = Feeling("If this feeling ($\Psi$) checks out, then we're all going to do the cooperative thing.")

I'm not confident that $\Psi$ is true, but I'm confident that I'm "hearing it correctly" when I introspect on it.  To be very clear about how it feels to be confident in my introspection but not confident in my assessment of the world, let me define:

$B$ = Feeling("$\Psi$ says that if $\Psi$ checks out then $C$.")

I believe the feeling $B$, while I fleetingly experience but do not believe the feeling $\Psi$. Make sense?

Corresponding unconscious belief (B):

$\vdash \Psi \leftrightarrow (\Box\Psi \to C)$

"$\Psi$ says that if $\Psi$ checks out then $C$."  Feel free to imagine an "=" sign here in place of the "$\leftrightarrow$" symbol for the purpose of this post, if that feels more intuitive.

Analogy to Löbian cooperation:
The feeling $\Psi$ corresponds to the modal fixed point from Löb's Lemma.

3. Conscious experience:
If I introspect on the feeling $\Psi$, it feels like "there's not much to it" in terms of content, other than the anticipation of cooperation.  So if the feeling checks out, so does the anticipation of cooperation.

Corresponding unconscious belief:

$\vdash \Box\Psi \to \Box C$

"If $\Psi$ checks out then $C$ checks out."

Analogy to Löbian cooperation:
This is the forward conclusion of Löb's Lemma.

4. Conscious experience:
If it checks out to me that we're probably going to cooperate, I'll feel good about the other people in a way that makes me feel like I'm part of a team, which makes me want to do my part and cooperate.  I think the other people in this particularly wholesome-feeling group work the same way.  So if we can just get into a good vibe with each other where we expect cooperation, then we'll do it.

Corresponding unconscious belief:

$\vdash \Box C \to C$

"If $C$ checks out, then $C$."

Analogy to Löbian cooperation:
This corresponds to the starting assumption of Löb's Theorem.

5. Conscious experience:
I'm starting to feel pretty optimistic about cooperation now, in that the feeling $\Psi$ is resonating with me a lot more.  Before it was just a tentative thing that was being considered, but now it feels true.

Corresponding unconscious processing:
3 and 4 combine to form $\Box\Psi \to C$, which is just $\Psi$.

Analogy to Löbian cooperation:
This corresponds to lines 3 and 4 of this short proof of Löb's Theorem.

6. Conscious experience:
It really feels like my $\Psi$ feeling checks out now, which I guess means we're going to cooperate.  Cool!

Corresponding unconscious processing:
$\vdash \Box\Psi$, which combines with $\Psi$ to make $C$.

Analogy to Löbian cooperation:
This corresponds to lines 5 and 6 of this short proof of Löb's Theorem.
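
For readers who want the unconscious steps in one place, here is a rough sketch of how steps 2 through 6 chain together.  It writes $C$ for the feeling that we're all going to cooperate, $\Psi$ for the humble self-referential feeling from step 2, and $\Box X$ for "$X$ checks out".  It assumes the usual rules of provability logic (necessitation, distribution of $\Box$ over implication, and $\Box X \to \Box\Box X$), which this post does not spell out.  The row labels follow the numbered experiences above and may not match the line numbering of the short proof referred to in the analogies.

```latex
\begin{align*}
&\text{(2)}\quad \vdash \Psi \leftrightarrow (\Box\Psi \to C)
  && \text{fixed point: what } \Psi \text{ says} \\
&\text{(3)}\quad \vdash \Box\Psi \to \Box C
  && \text{from (2), via necessitation, distribution, and } \Box\Psi \to \Box\Box\Psi \\
&\text{(4)}\quad \vdash \Box C \to C
  && \text{hypothesis of L\"ob's Theorem} \\
&\text{(5)}\quad \vdash \Box\Psi \to C, \text{ hence } \vdash \Psi
  && \text{chain (3) with (4), then read (2) right to left} \\
&\text{(6)}\quad \vdash \Box\Psi, \text{ hence } \vdash C
  && \text{necessitation on } \Psi \text{, then modus ponens with (5)}
\end{align*}
```

Read as feelings rather than formulas, each row is only an analogy: the "rules" are whatever lets one feeling lead to the next for a given person.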

What do I take from this example?

As mentioned at the outset, because human intuition and reasoning are neither perfectly logical nor perfectly self-referential, Löb's Theorem is not a theorem for humans: when its hypotheses are (roughly) met, its conclusion needn't be (even roughly) true.

Nonetheless, from examining instances like the example above I've concluded that Löbian cooperation is a more commonplace and arguably more mundane phenomenon than one might have otherwise guessed, especially from how technically involved the original proof of Löb's Theorem in Peano Arithmetic is.  When I'm hanging out with a group of friends or family who feel pretty wholesome to me (often people who are not very math-y), I've often experienced the emergence of normal everyday feelings of working-together or being-part-of-a-group that I now think, in retrospect, could be well described by the steps of a proof of Löb's Theorem like in the example above.  In general, I think my "System 1" (non-deliberate) processing of cooperation in a group setting could be considered

• functionally Löbian around 40% of the time, and
• procedurally Löbian around 10% of the time.

I'm not sure what to think the percentages are for other people on average; possibly more than for me.  I'm innately somewhat disagreeable, and procedurally Löbian cooperation feels like it engages my capacity for agreeableness or going-with-the-flow or something, such that I could easily imagine it happening for a larger fraction of other people's cooperative experiences than of mine.

Examples of non-Löbian cooperation

Not every emergence of cooperation is Löbian.  Below are three exemplars of categories of cooperation that I consider non-Löbian, at least viewed from the perspective of the mental processes of the individual people involved.  There may be more such categories.  What these examples have in common is that the cooperation is not based on anticipated or verified cooperation from the other person.

Non-example 1: altruism irrespective of cooperation from the counterparty

Alice sees Bob is hungry, and feels bad about that, so she gives him some food, expecting nothing in return.  Bob later sees Alice has dropped her wallet while walking in his neighborhood, and returns it to her without stealing anything.  Arguably, this is a cooperative interaction between Alice and Bob, but Alice's act of kindness was not based on any anticipation or implicit verification of cooperation from Bob.  So, at least for Alice I'd say this was not Löbian cooperation.

Non-example 2: dutiful cooperation irrespective of cooperation from the counterparty

Imagine two people who encounter each other in a setting where rules dictate that they're supposed to cooperate.  For instance, consider two farm inspectors, Alice and Bob, who have been assigned to each other as partners.  Alice's job involves picking up Bob for inspections each morning.  She does this "because it's her job".  Even when Bob is cranky, or seems like he's otherwise being a jerk, she picks him up.  Perhaps even if she found out he's being investigated for embezzling money and isn't actually doing his part of the job correctly, she might still keep doing her job and picking him up for work every day.

Again, Alice's cooperation isn't based on an anticipation or implicit verification of cooperation from Bob, so it's not Löbian.  Arguably if Alice expected Bob to do something really bad immediately upon boarding her car, she wouldn't pick him up.  That would be more like an instance of Löbian defection (defection on the basis of anticipated defection), but still not quite, because the anticipation of Bob's defection doesn't come true.  Löb's Theorem is only a good fit for describing a situation when there's some kind of self-fulfilling prophecy involved.

Non-example 3: tit-for-tat in an iterated prisoner's dilemma

In an iterated prisoner's dilemma, in each round of the game two people write either C or D, for cooperate or defect.  "Tit-for-tat" is a strategy where you write C on the first round, and after that you just copy what your opponent wrote in the previous round.  When following a tit-for-tat strategy, what you know is that the opponent cooperated in the previous round, and you cooperate on the basis of that, not on the basis of cooperation in the present or future rounds.

If you really wanted to force an application of Löb's Theorem here, for Player 1 in round 2 you could define $C$ to be the statement "The opponent cooperated last round, and I cooperate this round", but I don't think that's very natural or worth delving into here.
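
To make the contrast concrete, here is a minimal sketch of the tit-for-tat strategy described above.  The function names and the fixed list of opponent moves are made up for illustration.  The point to notice is that the decision only ever reads the opponent's past moves, never an anticipation of present or future ones.

```python
from typing import List


def tit_for_tat(opponent_history: List[str]) -> str:
    """Write 'C' on the first round, then copy the opponent's previous move."""
    if not opponent_history:
        return "C"
    return opponent_history[-1]


def play(opponent_moves: List[str]) -> List[str]:
    """Play tit-for-tat against a fixed, hypothetical sequence of opponent moves."""
    my_moves = []
    for round_index in range(len(opponent_moves)):
        # Only rounds strictly before the current one are visible to the strategy.
        my_moves.append(tit_for_tat(opponent_moves[:round_index]))
    return my_moves


if __name__ == "__main__":
    print(play(["C", "D", "D", "C"]))  # ['C', 'C', 'D', 'D']
```

Nothing in `tit_for_tat` refers to what the opponent is expected to do next, which is why I count it as non-Löbian.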

Ambiguously Löbian cooperation

Sometimes I think cooperation emerges in a way that's maybe-Löbian and maybe-not. Here's an example:

• I want to be a bit nicer to Bob than Bob is to me.  I currently sense that Bob is being 5/10 nice to me, so I start being 6/10 nice to Bob.  Then Bob starts being 7/10 nice to me, so I raise to 8/10, then him to 9/10, then me to 10/10, then him to 10/10.

Was this more like Löb, or more like tit-for-tat?  It started with a sense that Bob is "being nice" in a continual sense, which means to some degree my niceness was based on anticipated continued niceness from him (that's not conditioned on my behavior).  So you could argue that the start of this process was Löbian (being nice in response to anticipated-future-niceness) and maybe even involved a $\Psi$-like feeling that became more solid through a series of steps resembling the actual proof of Löb's Theorem.

On the other hand, my sense of Bob's niceness came from his past behavior, so you could argue that I'm just engaging a tit-for-tat strategy here.

In reality, I think both could be true: escalating niceness can be implemented by a Löb-like anticipation-of-niceness process, or a tit-for-tat reciprocation process, or both in parallel.
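
As a rough illustration of the reciprocation reading only, the escalation in the example can be sketched as a rule where each person responds one notch nicer than the niceness they last sensed, capped at 10/10.  That Bob follows the same rule is an assumption made for this sketch, and the function names are made up.  The sketch doesn't settle the Löb-versus-tit-for-tat question, since the same numbers could come from anticipated niceness instead of observed niceness.

```python
from typing import List


def one_notch_nicer(sensed_niceness: int, cap: int = 10) -> int:
    """Respond one point nicer than the niceness just sensed, up to the cap."""
    return min(cap, sensed_niceness + 1)


def escalate(bob_start: int = 5, cap: int = 10) -> List[int]:
    """Alternate responses (Bob, me, Bob, ...) until the level stops changing."""
    levels = [bob_start]
    while len(levels) < 2 or levels[-1] != levels[-2]:
        levels.append(one_notch_nicer(levels[-1], cap))
    return levels


if __name__ == "__main__":
    print(escalate())  # [5, 6, 7, 8, 9, 10, 10]
```

The printed list alternates between Bob's level and mine, starting from Bob's 5/10 and ending with both of us at 10/10.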

Recap

I gave an example of a plausible-to-me cooperative emotional process that is roughly analogous to a proof of Löb's Theorem.  I dubbed such processes procedurally Löbian, and I refer to any cooperation based on anticipated future cooperation from a counterparty as functionally Löbian cooperation.  I think functionally Löbian cooperation is probably fairly common among humans, though I don't know how common.  Speaking for myself, I think my cooperative instincts are functionally Löbian around 40% of the time, and procedurally Löbian around 10% of the time.

Thanks to Alex Zhu, Anjali Gopal, and Scott Garrabrant for discussions and feedback on the ideas in this post.
