Epistemic status: conjecture, needs critical feedback, but fits with a lot of established facts.

Over the last six years, I spent much of my spare time figuring out love. The main output of this process is a poem – I’m proud of it, but poems are weird, so here I write down my main result in prose.

There are forms of love that The Selfish Gene has already explained convincingly, like love for next of kin or love between mating and child-rearing partners, and reciprocal altruism. So those are not what I needed to figure out. They're important, because they provide existing physiology to repurpose. But they don’t explain why and how you can love your favorite artist, your fellow soldiers, your best friend, good colleagues, or Scott Alexander. And what about love towards non-sentient things such as your country, your favorite sports team, or mythologized characters such as Jesus? Is the use of the word “love” in these cases mere metaphor or confusion, or are they actually comparable phenomena on a biological level?

So this is about a mechanism of love that works independently of the ones more directly involved in reproduction such as sexual attraction, romantic infatuation or the joy of childrearing. The ontological question of whether it's all the same love in the end, or whether there are various types of love that go well together, doesn't strike me as productive. I should say I think that when you "love" a cake with whipped cream on it you're doing something different from the following.


My most salient confusion about love was why the hell showing vulnerability is so central to it. Maybe “vulnerability is attractive” seems like a normal idea to many, but it doesn't explain why that would be the case, and I found the idea deeply suspect for decades because it isn't how I grew up. But I had to concede that every time I tried it, it worked disconcertingly well. Every time I was vulnerable to someone, I liked them a lot better afterwards, and as far as I could tell they liked me better too. What? Why? Isn't love just about attraction, compatible personalities, shared projects and aligned values? None of those seem obviously enhanced by vulnerability. What’s going on? The following is how I answered this confusion.

When our brains evolved in groups of social apes on the savannah, we cooperated intensely. Many species do this, but unlike other species that do it on autopilot, we did it flexibly, depending on the specific level of trust between the individuals involved.

Imagine a typical Tuesday on the African savannah a million years ago. You're one of a troop of four hungry humans noisily rushing a group of antelopes towards a cliff, where you know their superior speed won't help them escape you anymore. One of you spots a sabertooth prowling behind you. What is the optimal strategy here? You can chase away the sabertooth if you all shout and throw a lot of stones in its general direction, but then the antelopes will get away. You can keep chasing the antelopes, but then you run the risk the sabertooth will get close and savage your troop. What you should do is split up: one or two of you need to distract the sabertooth and signal to it that it has been spotted, while the rest of you need to get those antelopes. But this is hard to do. The cost in personal risk has to be distributed unevenly! Either team can fail at its task, and once you stop all watching each other, either team can defect and claim an honest mistake. The two tasks need different traits and skills, and you might not be in agreement who has got how much of what. But if you start a group discussion of who does which task, you won't get the antelopes either. So you need to arrive at a tactical solution promptly. The human way to do this takes trust between your troop members: whatever way your troop is organized, at some point some of you have to go along with someone else's quick decision.

Evolution builds us for our environment, and the environment of the individual that most strongly determines reproductive success is not the savannah, but the group. I agree with theorists like Robin Hanson and Kevin Simler who think an arms race developed: everyone required the trust of others, but sometimes abusing the trust of others was advantageous, so the abilities to conceal motives, and to infer them despite attempts at concealment, reinforced each other’s fitness value. After a couple of thousand generations, and millions of situations like the above, the evolutionary environment had changed. It is now a group of people who you need to trust you, and who are now more or less able to tell whether they should. How do you adapt to that? You need to be honestly trustworthy.

The simple way to be trustworthy is to be harmless and unable to deceive anybody. But that doesn't work because it sets you up for exploitation. So your trustworthiness needs to be selective – and that's what love does. You're trustworthy to someone who thinks that you love them. And you’re robustly trustworthy to someone who correctly knows that you genuinely love them.


Love is an answer, but it is a costly and risky one: it constrains how you can act towards those you love and towards the things they care about, and the distress of loved ones is contagious. So you can be exploited by those you love! Less costly or risky methods of becoming trustworthy should be evolutionarily favored, if they are comparably effective.

Deception is cheaper in many ways. And it works some of the time, which is how the whole arms race of trustworthiness signaling and judging trustworthiness got started. But deception gets riskier the more true information the person being deceived (the deceivee?) has about the deceiver. As you get “seen through” more and more, there comes a point where your deception doesn't work reliably anymore. The person whose trust you want holds too many pieces of information that might create a contradiction and expose the deception. So if you want the trust of someone who knows you very well, you have to genuinely have good intentions towards them, because they can tell whether you do. So that's where the love reaction happens.

Love as a method for trust seemed to me like a logical-enough evolutionary foundation. And if we're going to apply any rigor to this hypothesis, it necessarily comes with a question: how is this method operationalized and implemented? To this question, "showing vulnerability" is the answer. I put that in scare quotes because I think the term is imprecise and misleading.


The effect of the love reaction is not complicated to implement – it just reuses the circuitry that is already in place to bias behavior towards close family. (So when people speak of their best friends as “family”, or of loving someone “like a brother”, I think that is actually accurate.) The difficulty is in the trigger. Loving too little or too much is a suboptimal strategy that ends up incurring reproductive cost on average, so it is selected against. What ends up being selected for?

If love were an intelligently designed system, I think it should quantify the amount of true information shared with a person, especially personal and emotional stuff that is directly connected to motivational structures. People who hold a lot of such information can already exploit you anyway, so the added cost of exploitation risk is lowest, and they’re probably relevant, so their trust in you has the highest value.

What evolution actually built approximates this more or less:

  • It weighs personal topics that relate to motivations, things that make one’s future behavior predictable and therefore exploitable, much more highly than small talk.
  • It weighs information shared with intense emotionality more highly, which makes sense because strong emotions are hard to fake and should be expected to give good information.
  • It weighs embarrassing or shameful information more highly. And I suspect this might be at least partially operationalized by weighing more highly information shared with few people. This would usually be a sufficient approximation because embarrassing things tend to be the ones shared with the fewest people. My suspicion comes from two types of exceptional cases:
    • When something shameful about me is known to lots of people, I don’t feel that this makes me love all of them more.
    • When I share something that’s not shameful, but also not widely known (yet) because it is new, I find myself loving more the first few people who know about it.
  • Clearly, people who have demonstrated interest in oneself are loved more readily. Which makes sense since such people are evidently spending more of their attention on oneself and will be building better models that allow them better insight. But this might be confounded with reciprocity or some other mechanism.
  • Dominant or prestigious people are also more lovable – perhaps because they're expected to be able to get lots of additional information from third persons? But dominance and prestige have lots of effects, so maybe this happens via romantic infatuation or sexual attraction, rather than via having intimate knowledge.

This allows, but does not require, reciprocity. If someone gives you important personal information, it implies they love you to some degree, which means you can trust them with your own important personal information, but it doesn’t mean you have to – unrequited love is also possible.
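
The weighting described in the list above can be sketched as a toy scoring function. This is only an illustration of the conjecture, not a claim about how the brain computes anything: the names `Disclosure`, `disclosure_weight` and `love_trigger_score`, and every numeric weight, are assumptions made up for the example. It covers only the first three bullets (personal relevance, emotional intensity, and how few people know), and leaves out demonstrated interest and prestige.

```python
from dataclasses import dataclass


@dataclass
class Disclosure:
    """One piece of true information shared with a listener (hypothetical model)."""
    personal: bool              # relates to motivations, as opposed to small talk
    emotional_intensity: float  # 0.0 (flat) to 1.0 (very intense)
    audience_size: int          # how many people know this, including the listener


def disclosure_weight(d: Disclosure) -> float:
    """Weight of a single disclosure; all constants are arbitrary assumptions."""
    base = 3.0 if d.personal else 1.0         # personal topics count more than small talk
    emotion = 1.0 + d.emotional_intensity     # intense emotion is hard to fake
    exclusivity = 1.0 / max(d.audience_size, 1)  # fewer knowers, higher weight
    return base * emotion * exclusivity


def love_trigger_score(disclosures: list[Disclosure]) -> float:
    """Total trigger strength towards one listener: the sum of disclosure weights."""
    return sum(disclosure_weight(d) for d in disclosures)
```

In this sketch, a secret told with strong emotion to one other person outweighs both small talk and the same shameful fact known to a thousand people, which matches the two exceptional cases in the list. Whether the real weighting is multiplicative, additive, or something else entirely is left open.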


Both love and honesty are very big concepts, so if they are as tightly linked as I hypothesize, this link should have consequences in a lot of places. In other words, there are huge numbers of falsifiable predictions. I can't list them exhaustively, but here are some.

Arthur Aron’s “36 questions to fall in love” should survive independent replication. I don’t know if such a replication has been done, because I can’t find one on Google among the many accounts of people who tried this informally. All of those I’ve seen say these questions work.

The most loving people should be the most honest, and the least loving people should be the most dishonest. I don't think this is controversial in general, but the implication here is that the correlation between these two traits (or between good quantifications thereof) should be very high.

Like all heuristics built by evolution, this one too can misfire in maladaptive ways. Modern technology allows one-way communication, where we can instinctively feel the presence of someone we're watching on a screen who doesn't even know we exist. Clearly we can end up loving such people, because they feel like they've been around a lot, and we may have been very emotional "in front of" them. And we can feel they have shared important personal things with us personally, although they were talking to a camera and didn’t know we exist, which can feel like they should be loving us too.

If someone knowing important personal things about us is a specific stimulus, it should have an associated superstimulus. And this one does! What is a superstimulus for this trigger? Obviously, it would be an omniscient being that knows absolutely everything about oneself. And people who believe in such characters do love them quite a lot!

This love reaction can be triggered not only by individuals who (we feel) know us, but also by groups and institutions who (we feel) know us. This is unsurprising because many of the adaptations we have for handling individuals, such as status and dominance, are also applied to groups. The brain instinctively treats groups and organizations as a kind of person (I like to call it an "interperson") and so it can also love them like it can love a person.

(Maybe part of the love that first generation immigrants tend to have for their new countries is because the immigration process gives the new country a lot of true information about them?)

This theory of love for non-kin describes it as an involuntary mechanism over which you don’t have direct conscious control, which matches common sense. But it also says you can consciously control it indirectly, similarly to how you don’t have direct control over your salivary glands, but you do have control over whether you imagine seeing, touching, smelling and tasting a juicy lemon in great detail until your salivary glands do produce saliva. So if you want to love someone, tell them important personal stuff that makes your motivations transparent. And if you want to stop loving someone, keep important secrets from them.

(Now that you know this, you’re personally responsible for who you love – sorry!)

To clarify, I’m saying it is hard not to love someone who (you feel) knows you well, but I’m not saying you can only love someone who (you feel) knows you well. There are other triggers of love: the pursuit of sexual reproduction and the protection of other copies of one’s own genes are the obvious examples, but there likely are more. Maybe you just love someone because you were on MDMA when you first met, or something.

I think if this theory is true, it strengthens the case for moral realism – maybe someone better versed in meta-ethics can check?

I’m more confident in a much weaker claim: this seems relevant to the problem of AI alignment. AGI will be able to deceive us because it will be smarter than us, so as humans we want it to love us so that we can trust it, because that is our instinct. Our instinct might be right: if this is part of how love works evolutionarily, it is one way to build trust that has solid evidence of working. It might be true that a general intelligence that is thoroughly “seen through” should want to honestly love those who see through it, as long as it needs their trust. And this might extend to the next AGI generation, as long as the step change isn’t too big for the previous generation to see through the next. Maybe artificially intelligent systems that model each other can converge on a functionally homologous method for building trust between each other, which might in the best case scenario then also extend to us?


3 comments

I think a big aspect of ingrained behaviors like love is about shaping the behaviors of the evolved agents to 'higher average utility behavior patterns'. For love of non-kin / non-mates, this is probably something like "this social organism will tend to do better when it chooses to remain in close social proximity to allies." So it's not just about gaining the trust and assistance of community members, it's also about making sacrifices to stay near enough to them that they could help you if you found yourself in need. Community building is about finding ways of credibly signalling contracts of mutual support, and building histories of repeated instances of mutual support. Adversity can bring a group closer together by giving them the chance to demonstrate supportive behaviors for each other, increasing the expectation that such will happen again in the future.

In the ancestral environment, population densities were very low. My understanding is that almost everyone in your band would be at least somewhat related, or have an ambiguous degree of relatedness, and would be someone you'd rely on again and again. How often do we think interactions with true non-relatives actually happened? 


I'm not sure there's anything that needs to be explained here except "evolution didn't stumble upon a low-cost low-risk reliable way for humans to defect against non-relatives for personal advantage as much as a hypothetical intelligently designed organism could have." Is there?

Well, of course there are no true non-relatives: even the sabertooth and the antelopes are distant cousins. The question is how much you're willing to give up for how distant a cousin. Here I think the mechanism I describe changes the calculus.

I don't think we know enough about the lifestyles of cultures/tribes in the ancestral environment, except we can be pretty sure they were extremely diverse. And all cultures we've ever found have some kind of incest taboo that promotes mating between members of different groups.