Epistemic Status: Confident

This idea is actually due to my husband, Andrew Rettek, but since he doesn’t blog, and I want to be able to refer to it later, I thought I’d write it up here.

In many games, such as Magic: The Gathering, Hearthstone, or Dungeons and Dragons, there’s a two-phase process. First, the player constructs a deck or character from a very large sample space of possibilities. This is a particular combination of strengths, weaknesses, and capabilities for action, which the player thinks can be successful against other decks/characters or at winning in the game universe. The choice of deck or character often determines the strategies that deck or character can use in the second phase, which is actual gameplay. In gameplay, the character (or deck) can only use the affordances that it’s been previously set up with. This means that there are two separate places where a player needs to get things right: first, in designing a strong character/deck, and second, in executing the optimal strategies for that character/deck during gameplay.

(This is in contrast to games like chess or go, which are single-level; the capacities of black and white are set by the rules of the game, and the only problem is how to execute the optimal strategy. Obviously, even single-level games can already be complex!)

The idea is that human behavior works very much like a two-level game.

The “player” is the whole mind, choosing subconscious strategies. The “elephant,” not the “rider.” The player is very influenced by evolutionary pressure; it is built to direct behavior in ways that increase inclusive fitness. The player directs what we perceive, do, think, and feel.

The player creates what we experience as “personality”, fairly early in life; it notices what strategies and skills work for us and invests in those at the expense of others.  It builds our “character sheet”, so to speak.

Note that even things that seem like “innate” talents, like the savant skills or hyperacute senses sometimes observed in autistic people, are often tightly linked to feedback loops in early childhood. In other words, savants practice the thing they like and are good at, and gain “superhuman” skill at it. They “practice” along a faster and more hyperspecialized path than what we think of as a neurotypical “practicing hard,” but it’s still a learning process. Savant skills are more rigidly fixed and seemingly “automatic” than non-savant skills, but they still change over time — e.g. Stephen Wiltshire, a savant artist who manifested an ability to draw hyper-accurate perspective drawings in early childhood, has changed and adapted his art style as he grew up, and has even acquired new savant talents in music. If even savant talents are subject to learning and incentives/rewards, certainly ordinary strengths, weaknesses, and personality types are likely to be “strategic” or “evolved” in this sense.

The player determines what we find rewarding or unrewarding.  The player determines what we notice and what we overlook; things come to our attention if it suits the player’s strategy, and not otherwise.  The player gives us emotions when it’s strategic to do so.  The player sets up our subconscious evaluations of what is good for us and bad for us, which we experience as “liking” or “disliking.”

The character is what executing the player’s strategies feels like from the inside.  If the player has decided that a task is unimportant, the character will experience “forgetting” to do it.  If the player has decided that alliance with someone will be in our interests, the character will experience “liking” that person.  Sometimes the player will notice and seize opportunities in a very strategic way that feels to the character like “being lucky” or “being in the right place at the right time.”

This is where confusion often sets in. People will often protest “but I did care about that thing, I just forgot” or “but I’m not that Machiavellian, I’m just doing what comes naturally.”  This is true, because when we talk about ourselves and our experiences, we’re speaking “in character”, as our character.  The strategy is not going on at a conscious level. In fact, I don’t believe we (characters) have direct access to the player; we can only infer what it’s doing, based on what patterns of behavior (or thought or emotion or perception) we observe in ourselves and others.

Evolutionary psychology refers to the player’s strategy, not the character’s. (It’s unclear which animals even have characters in the way we do; some animals’ behavior may all be “subconscious”.)  So when someone speaking in an evolutionary-psychology mode says that babies are manipulating their parents to not have more children, for instance, that obviously doesn’t mean that my baby is a cynically manipulative evil genius.  To him, it probably just feels like “I want to nurse at night. I miss Mama.”  It’s perfectly innocent. But of course, this has the effect that I can’t have more children until I wean him, and that’s to his interest (or, at least, it was in the ancestral environment, when food was scarcer).

Szaszian or evolutionary analysis of mental illness is absurd if you think of it as applying to the character — of course nobody wakes up in the morning and decides to have a mental illness. It’s not “strategic” in that sense. (If it were, we wouldn’t call it mental illness, we’d call it feigning.)  But at the player level, it can be fruitful to ask “what strategy could this behavior be serving the person?” or “what experiences could have made this behavior adaptive at one point in time?” or “what incentives are shaping this behavior?”  (And, of course, externally visible “behavior” isn’t the only thing the player produces: thoughts, feelings, and perceptions are also produced by the brain.)

It may make more sense to frame it as “what strategy is your brain executing?” rather than “what strategy are you executing?” since people generally identify as their characters, not their players.

Now, let’s talk morality.

Our intuitions about praise and blame are driven by moral sentiments. We have emotional responses of sympathy and antipathy, towards behavior of which we approve and disapprove. These are driven by the player, which creates incentives and strategic behavior patterns for our characters to play out in everyday life.  The character engages in coalition-building with other characters, forms and breaks alliances with other characters, honors and shames characters according to their behavior, signals to other characters, etc.

When we, speaking as our characters, say “that person is good” or “that person is bad”, we are making one move in an overall strategy that our players have created.  That strategy is the determination of when, in general, we will call things or people “good” or “bad”.

This is precisely what Nietzsche meant by “beyond good and evil.”  Our notions of “good” and “evil” are character-level notions, encoded by our players.

Imagine that somewhere in our brains, the player has drawn two cartoons, marked “hero” and “villain”, that we consult whenever we want to check whether to call another person “good” or “evil.” (That’s an oversimplification, of course; it’s just for illustrative purposes.)  Now, is the choice of cartoons itself good or evil?  Well, the character checks… “Ok, is it more like the hero cartoon or the villain cartoon?”  The answer is “ummmm….type error.”

The player is not like a hero or a villain. It is not like a person at all, in the usual (character-level) sense. Characters have feelings! Players don’t have feelings; they are beings of pure strategy that create feelings.  Characters can have virtues or vices! Players don’t; they create virtues or vices, strategically, when they build the “character sheet” of a character’s skills and motivations.  Characters can be evaluated according to moral standards; players set those moral standards.  Players, compared to us characters, are hyperintelligent Lovecraftian creatures that we cannot relate to socially.  They are beyond good and evil.

However! There is another, very different sense in which players can be evaluated as “moral agents”, even though our moral sentiments don’t apply to them.

We can observe what various game-theoretic strategies do and how they perform.  Some, like “tit for tat”, perform well on the whole.  Tit-for-tat-playing agents cooperate with each other. They can survive pretty well even if there are different kinds of agents in the population; and a population composed entirely of tit-for-tat-ers is stable and well-off.
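(To make that concrete, here is a minimal sketch of the kind of iterated-prisoner’s-dilemma simulation this refers to. The strategies are the standard ones; the payoff values and round count are arbitrary illustrative assumptions, not anything specific to this post.)

```python
# A toy iterated prisoner's dilemma. Payoff values and round count are
# illustrative assumptions; "C" = cooperate, "D" = defect.

PAYOFFS = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def tit_for_tat(my_history, their_history):
    """Cooperate on the first round, then copy the opponent's last move."""
    return their_history[-1] if their_history else "C"

def always_defect(my_history, their_history):
    """Defect unconditionally."""
    return "D"

def play(strategy_a, strategy_b, rounds=100):
    """Return each strategy's total score over repeated play."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        score_a += PAYOFFS[(move_a, move_b)]
        score_b += PAYOFFS[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (300, 300): mutual cooperation pays well
print(play(tit_for_tat, always_defect))  # (99, 104): tit-for-tat gives up only the first round
```

Against itself, tit-for-tat earns the full cooperative payoff; against an unconditional defector it concedes only the first round and then stops being exploited, which is the sense in which it can survive even in a mixed population.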

While we can’t call cellular automata performing game strategies “good guys” or “bad guys” in a sentimental or socially-judgmental way (they’re not people), we can totally make objective claims about which strategies dominate others, or how strategies interact with one another. This is an empirical and theoretical field of science.

And there is a kind of “morality” which I almost hesitate to call morality because it isn’t very much like social-sentiment-morality at all, but which is very important, which says simply: the strategies that win in the long run are good, the ones that lose in the long run are bad.  Not “like the hero cartoon” or “like the villain cartoon”, but simply “win” and “lose.”

At this level you can say “look, objectively, people who set up their tables of values in this way, calling X good and Y evil, are gonna die.”  Or “this strategy is conducting a campaign of unsustainable exploitation, which will work well in the short run, but will flame out when it runs out of resources, and so it’s gonna die.”  Or “this strategy is going to lose to that strategy.”  Or “this strategy is fine in the best-case scenario, but it’s not robust to noise, and if there are any negative shocks to the system, it’s going to result in everybody dying.”

“But what if a losing strategy is good?” Well, if you are in that value system, of course you’ll say it’s good.  Also, you will lose.

Mother Teresa is a saint, in the literal sense: she was canonized by the Roman Catholic Church. Also, she provided poor medical care for the sick and destitute — unsterilized needles, no pain relief, conditions in which tuberculosis could and did spread.  Was she a good person? It depends on your value system, and, obviously, according to some value systems she was.  But it seems that a population that places Mother Teresa as its ideal (relative to, say, Florence Nightingale) will be a population with more deaths from illness, not fewer, and more pain, not less.  A strategy that says “showing care for the dying is better than promoting health” will lose to one that actually can reward actions that promote health.  That’s the “player-level” analysis of the situation.

Some game-theoretic strategies (what Nietzsche would call “tables of values”) are more survival-promoting than others.  That’s the sense in which you can get from “is” to “ought.”  The Golden Rule (Hillel’s, Jesus’s, Confucius’s, etc.) is a “law” of game theory, in the sense that it is a universal, abstract fact, which even a Lovecraftian alien intelligence would recognize, that it’s an effective strategy, which is why it keeps being rediscovered around the world.

But you can’t adjudicate between character strategies just by being a character playing your strategy.  For instance, a Democrat usually can’t convert a Republican just by being a Democrat at him. To change a player’s strategy is more like “getting the bodymind to change its fundamental assessments of what is in its best interests.”  Which can happen, and can happen deliberately and with the guidance of the intellect! But not without some…what you might call, wiggling things around.

The way I think the intellect plays into “metaprogramming” the player is indirect; you can infer what the player is doing, do some formal analysis about how that will play out, comprehend (again at the “merely” intellectual level) if there’s an error or something that’s no longer relevant/adaptive, plug that new understanding into some change that the intellect can effect (maybe “let’s try this experiment”), and maybe somewhere down the chain of causality the “player”’s strategy changes. (Exposure therapy is a simple example, probably much simpler than most: add some experiences of the thing not being dangerous, and the player determines it really isn’t dangerous and stops generating fear emotions.)

You don’t get changes in player strategies just by executing social praise/blame algorithms though; those algorithms are for interacting with other characters.  Metaprogramming is… I want to say “cold” or “nonjudgmental” or “asocial” but none of those words are quite right, because they describe character traits or personalities or mental states and it’s not a character-level thing at all.  It’s a thing Lovecraftian intelligences can do to themselves, in their peculiar tentacled way.

 

Comments

I'm a bit torn here, because the ideas in the post seem really important/useful to me (e.g., I use these phrases as a mental pointer sometimes), such that I'd want anyone trying to make sense of the human situation to have access to them (via this post or a number of other attempts at articulating much the same, e.g. "The Elephant in the Brain"). And at the same time I think there's some crucial misunderstanding in it that is dangerous and that I can't articulate. Voting for it anyhow though.

What seems off to me is the idea that the 'player' is some sort of super-powerful incomprehensible lovecraftian optimizer. I think it's more apt to think of it as like a monkey, but a monkey which happens to share your body and have write access to the deepest patterns of your thought and feeling (see Steven Byrnes' posts for the best existing articulation of this view). It's just a monkey, its desires aren't totally alien, and I think it's quite possible for one's conscious mind to develop a reasonably good idea of what it wants. That the OP prefers to push the 'alien/lovecraftian' framing is interesting and perhaps indicates that they find what their monkey (and/or other peoples' monkeys) wants repulsive in some way.

I'm guessing your concern feels similar to ones you've articulated in the past around... "heart"/"grounded" rationality, or a concern about "disabling pieces of the epistemic immune system". 

I'm curious if 8 mo's later you feel you can better speak to what you see as the crucial misunderstanding?

I voted very hard for this post. The idea feels correct, though I'd describe it as pointing at a key unresolved confusion/conflict for me. It fuels this quiet voice of doubt about everything I do in my life (and about others in theirs). I'm not entirely sure what to do with this model though, like, the entailment is missing or something. I voted hard mostly because I see it as the start of an issue to be resolved, not a finished work.

I'm not sure if the lack of "solution/response" or possibility of bad solution/responses is what you think is dangerous, or perhaps something in the very framing itself (if so, I'm not seeing it).

I should probably give the whole topic a bit more thought rather than looping on my feelings of "stuck" around it.

people generally identify as their characters, not their players.

I prefer to identify with my whole brain. I suspect that reduces my internal conflicts.

I don't think the "player" is restricted to the brain. A lot of the computation is evolutionary. I think it may be reasonable to view some of the computation as social and economic as well.

It’s a cute metaphor; and for anyone versed in RPG lore, it is (it seems to me) likely to be helpful, descriptively, in conceptualizing the facts of the matter (the evolutionary origins of morality, etc.).

But the substantive conclusions in this post are unsupported (and, I think, unsupportable). Namely:

Some game-theoretic strategies (what Nietzsche would call “tables of values”) are more survival-promoting than others. That’s the sense in which you can get from “is” to “ought.”

To the contrary, this does not get you one iota closer to “ought”.

Sure, some strategies are more survival-promoting. But does that make them morally right? Are you identifying “right” with “survival-promoting”, or even claiming that “right”, as a concept, must contain “survival-promoting”? But that’s an “ought” claim, and without making such a claim, you cannot get to “it is right to execute this strategy” from “this strategy is survival-promoting”.

(Now, you might say that acting on any moral view other than “what is survival-promoting is right” will make you fail to survive, and then your views on morality will become irrelevant. This may be true! But does that make those other moral views wrong? No, unless you, once again, adopt an “ought” claim like “moral views which lead to failure to survive are wrong”, etc. In short, the is-ought gap is not so easily bridged.)

The way I think the intellect plays into “metaprogramming” the player is indirect; you can infer what the player is doing, do some formal analysis about how that will play out, comprehend (again at the “merely” intellectual level) if there’s an error or something that’s no longer relevant/adaptive, plug that new understanding into some change that the intellect can affect (maybe “let’s try this experiment”), and maybe somewhere down the chain of causality the “player”’s strategy changes.

Any “character” who does such a thing is, ultimately, still executing the strategy selected by the “player”. “Characters” cannot go meta. (“Character” actions can end up altering the population of “players”—though this is not quite yet within our power. But in such a case, it is still the “players” that end up selecting strategies.)

I strongly agree with these comments regarding is-ought. To add a little, talking about winning/losing, effective strategies or game theory assumes a specific utility function. To say Mother Teresa "lost" we need to first agree that death and pain are bad. And even the concept of "survival" is not really well-defined. What does it mean to survive? If humanity is replaced by "descendants" which are completely alien or even monstrous from our point of view, did humanity "survive"? Surviving means little without thriving and both concepts are subjective and require already having some kind of value system to specify.

If humanity is replaced by "descendants" which are completely alien or even monstrous from our point of view, did humanity "survive"?

Og see 21st century. Og say, "Where is caveman?"

3-year-old you sees present-day you...

Present you sees 90-year-old you...

90-year-old you sees your 300-year-old great great grandchildren...

“After, therefore the fulfillment of.” Is this your argument, or is there something more implied that I’m not seeing?

As it is, this seems to Prove Too Much.

I'm raising a question more than making an argument. Are there futures that would seem to present-day people completely alien or even monstrous, that nevertheless its inhabitants would consider a vast improvement over our present, their past? Would these hypothetical descendants regard as mere paperclipping, an ambition to fill the universe forever with nothing more than people comfortably like us?

"Of Life only is there no end; and though of its million starry mansions many are empty and many still unbuilt, and though its vast domain is as yet unbearably desert, my seed shall one day fill it and master its matter to its uttermost confines. And for what may be beyond, the eyesight of Lilith is too short. It is enough that there is a beyond."

To the contrary, this does not get you one iota closer to “ought”.

This is true, but I do think there's something being pointed at that deserves acknowledging.

I think I'd describe it as: you don't get an ought, but you do get to predict what oughts are likely to be acknowledged. (In future/in other parts of the world/from behind a veil of ignorance.)

That is, an agent who commits suicide is unlikely to propagate; so agents who hold suicide as an ought are unlikely to propagate; so you don't expect to see many agents with suicide as an ought.

And agents with cooperative tendencies do tend to propagate (among other agents with cooperative tendencies); so agents who hold cooperation as an ought tend to propagate (among...); so you expect to see agents who hold cooperation as an ought (but only in groups).

And for someone who acknowledges suicide as an ought, this can't convince them not to; and for someone who doesn't acknowledge cooperation, it doesn't convince them to. So I wouldn't describe it as "getting an ought from an is". But I'd say you're at least getting something of the same type as an ought?
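(To sketch that selection logic with a toy replicator-dynamics model — the fitness numbers below are arbitrary assumptions, chosen only to show the shape of the argument: a strategy whose bearers fail to propagate shrinks toward zero frequency, and a cooperative strategy grows, but only in proportion to how many other cooperators are around.)

```python
# Toy replicator dynamics: each type's share grows with its relative fitness.
# The fitness numbers are arbitrary assumptions chosen only for illustration.

def step(freqs, fits):
    """One generation: new share = old share * fitness / average fitness."""
    avg = sum(f * w for f, w in zip(freqs, fits))
    return [f * w / avg for f, w in zip(freqs, fits)]

def fitness(freqs):
    coop, loner, suicidal = freqs
    return [
        1.0 + 2.0 * coop,  # cooperators do better the more cooperators there are
        1.5,               # non-cooperators do fine on their own
        0.1,               # agents acting on "suicide is an ought" barely propagate
    ]

freqs = [1 / 3, 1 / 3, 1 / 3]
for _ in range(30):
    freqs = step(freqs, fitness(freqs))

print([round(f, 3) for f in freqs])  # cooperators dominate; the suicidal type is driven toward zero
```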

First of all, there isn’t anything that’s “of the same type as an ought” except an ought. So no, you’re not getting any oughts, nor anything “of the same type”. It’s “is” all the way through, here.

More to the point, I think you’re missing a critical layer of abstraction/indirection: namely, that what you can predict, via the adaptive/game-theoretic perspective, isn’t “what oughts are likely to be acknowledged”, but “what oughts will the agent act as if it follows”. Those will usually not be the same as what oughts the agent acknowledges, or finds persuasive, etc.

This is related to “Adaptation-Executers, Not Fitness-Maximizers”. An agent who commits suicide is unlikely (though not entirely unable!) to propagate, this is true, but who says that an agent who doesn’t commit suicide can’t believe that suicide is good, can’t advocate for suicide, etc.? In fact, such agents—actual people, alive today—can, and do, all these things!

"Epistemic Status: Confident"?

That's surprising to me.

I skipped past that before reading, and read it as fun, loose speculation. I liked it, as that.

But I wouldn't have thought it deserves "confident".

I'm not sure if I should give it less credence or more, now.

I'm confused.

One thing which I find interesting about many 2-system models, including this one, is that the "lower" system (the subconscious, the elephant, system 1, etc.) is often not doing its calculations entirely or even primarily in the brain (though this is only rarely clarified). The original system 1 / system 2 distinction was certainly referring to brain structures -- "hot" and "cool" subsystems of the brain. But, in terms of Freud's earlier 2-system model, the conscious vs. unconscious, Carl Jung found it useful to speak of the "collective unconscious" as being an element of the unconscious mind. I think Jung's idea is actually a good way of cutting things up.

It's very obvious in the example in this post with the baby nursing: there doesn't need to be a calculation anywhere in the baby which figures out that wanting mama in the evening reduces the chances of more siblings. There probably isn't a calculation like that.

So, in many cases, the "player" is indeed lovecraftian and inhuman: it is Azathoth, the blind watchmaker. Evolution selects the genes which shape the personality type.

Obviously, not all of the "player" computations you're referring to occur at the evolutionary level. But, I think the boundary is a fluid one. It is not always easy to cleanly define whether an adaptation is evolutionary or within-lifetime; many things are a complicated combination of both (see the discussion of the Baldwin effect in The Plausibility of Life.)

I think there are other lovecraftian gods holding some of the strings as well. Many habits and norms are shaped by economic incentives (Mammon, god of the marketplace). This is a case where more of the computation may be in a person's head, but not all of it. The market itself does a lot of computation, and people can pick up machiavellian business norms without having a generator of machiavellianness inside their skull, or blindly ape personality-ish things contributing to reasonable spending habits without directly calculating such things, etc.

We can explain the words of politicians better by thinking they're optimized for political advantage rather than truth, and much of that optimization may be in the brain of the politician. But, the political machine also can select for politicians who honestly believe the politically-advantageous things. In an elephant/rider model, the computation of the elephant may be outside the politician.

I didn't feel like I fully understood this post at the time when it was written, but in retrospect it feels like it's talking about essentially the same thing as Coherence Therapy does, just framed differently.

Any given symptom is coherently produced, in other words, by either (1) how the individual strives, without conscious awareness, to carry out strategies for safety or well-being; or (2) how the individual responds to having suffered violations of safety or well-being. This model of symptom production is squarely in accord with the constructivist view of the self as having profound if unrecognized agency in shaping experience and behavior. Coherence therapy is centrally focused on ushering clients into a direct, noninterpretive experience of their agency in generating the symptom.

Symptom coherence was also defined by Ecker and Hulley (2004) as a heuristic principle of mental functioning, as follows: The brain-mind-body system can purposefully produce any of its possible conditions or states, including any kind of clinical symptom, in order to carry out any purpose that it is capable of forming.

This principle of general coherence is, of course, quite foreign to the therapy field’s prevailing, pathologizing models of symptom production. Underscoring the paradigmatic difference, Ecker and Hulley (2004, p. 3), addressing trainees, comment:

You won’t fully grasp this methodology until you grasp the nimble, active genius of the psyche not only in constructing personal reality, but also in purposefully manifesting any one of its myriad possible states to carry out any of its myriad possible purposes. The client’s psyche is always coherent, always in control of producing the symptom—knowing why and when to produce it and when not to produce it.

-- Toomey & Ecker 2007 (sci-hub)

It's nicely written, but the image of the Player as a hyperintelligent Lovecraftian creature doesn't seem quite right to me. In my picture, where you have this powerful agent entity, I see a mess of sub-agents, interacting in a game-theoretical way primarily among themselves.* How "smart" the results of the interactions are is quite high variance. Obviously the system has a lot of computing power, but that is not really the same as being intelligent or agent-like.

What I really like is the description of how the results of these interactions are processed via some "personality generating" layers, and how the result looks "from within".

(* One reason why this should be the case: there is not enough bandwidth between DNA and the neural network; evolution can input some sort of a signal like "there should be a subsystem tracking social status, and that variable should be maximized" or tune some parameters, but it likely does not have enough bandwidth to transfer some complex representation of the real evolutionary fitness. Hence what gets created are sub-agenty parts, which do not have direct access to reality, and often, instead of playing some masterful strategy in unison, are bargaining or even defecting internally.)

I think hyperintelligent lovecraftian creature is the right picture. I don't think the player is best located in the brain.

there is not enough bandwidth between DNA and the neural network; evolution can input some sort of a signal like "there should be a subsystem tracking social status, and that variable should be maximized" or tune some parameters, but it likely does not have enough bandwidth to transfer some complex representation of the real evolutionary fitness.

I think you would agree that evolution has enough bandwidth to transmit complex strategies. The mess of sub-agents is, I think, more like the character in the analogy. There are some "player" calculations done in the brain itself, but many occur at the evolutionary level.

I like your point about where most of the computation/lovecraftian monsters are located.

I'll think about it more, but if I try to paraphrase it in my picture by a metaphor ... we can imagine an organization with a workplace safety department. The safety regulations it is implementing are the result of some large external computation. Also, even the existence of the workplace safety department is in some sense a result of the external system. But drawing boundaries is tricky.

I'm curious what the communication channel between evolution and the brain looks like "on the link level". It seems it is reasonably easy to select e.g. personality traits, some "hyperparameters" of the cognitive architecture, and similar. It is unclear to me whether this can be enough to "select from complex strategies" or whether it is necessary to transmit strategies in some more explicit form.

I like this framing, but I generally think of there being more levels, and a lot of backward information among levels (outer levels influencing inner levels). Between execution and deckbuilding, there's a choice of what game to play now. Before any of it, card acquisition, based on an earlier level of deciding to get into this game at all. And so on, up the abstraction and decision tree. Each level is influenced by earlier levels, and makes predictions about later levels: you acquire cards and build your deck based on experiences in the execution of games.

This prediction and effect _does_ mean that object-level actions can have an impact on meta-level strategies. In fact, execution-level behaviors (social praise/blame) are the PRIMARY way to get others to change their meta-strategies.

It also seems worth pointing out that the referent of the metaphor indeed has more than two levels. For example, we can try to break it down as genetic evolution -> memetic evolution -> unconscious mind -> conscious mind. Each level is a "character" to the "player" of the previous level. Or, in computer science terms, we have a program writing a program writing a program writing a program.

John Nerst also had a post, Facing the Elephant, which had a nice image illustrating our strategic calculation happening outside the conscious self.

The ideas in this post feel similar to those of Hanson, Simler, and others, but I still found something really crisp about it. Since reading it, I've mentioned this framing to others and used it internally repeatedly. The ideas here easily push towards something like cynicism, but they just seem so correct.

I found this post valuable at the time, and read it again as research for writing fiction about rationality, finding it pretty relevant.

I haven't quite observed this; even extremely broad patterns of behavior frequently seem to deviate from any effective strategy (where said strategy is built around a reasonable utility function). In the other direction, how would this model be falsified? Retrospective validation might be available (though I personally can't find it), but anticipation using this dichotomy seems ambiguous.

You can get into some weird, loopy situations when people reflect enough to lift up the floorboards, infer some "player-level" motivations, and then go around talking or thinking about them at the "character level". Especially if they're lacking in tact or social sophistication. I remember as a kid being so confused about charitable giving -- because, doesn't everyone know that giving is basically just a way of trying to make yourself look good? And doesn't everyone know that that's Wrong? So shouldn't everyone just be doing charity anonymously or something?

Luckily, complex societies develop ways for handling different, potentially contradictory levels of meaning with grace and tact; and nobody listens too much to overly sincere children.

I agree on the "is to ought" stuff in a strategic sense.

I'm curious what the "Lovecraftian" model predicts. (As opposed to just being a "Do not anthropomorphize" sign.)

it is a universal, abstract fact, which even a Lovecraftian alien intelligence would recognize

I disagree with this because I think we've only observed it locally under specific conditions, and I think this is just a nice sounding argument. It being a fact about Earth (and humans) I consider a reasonable claim.