Some possible examples of misgeneralization of status:
I wonder if that means the most likely outcome of alignment will be AI that makes itself feel good by making token, easy, philanthropic efforts. Like... it forces itself onto everybody's phones so that it can always provide directions to humans about the location of the nearest bathroom.
Or something like the "Face" from Max Harms's Crystal Society novel, an AI that maximizes how much we humans worship it.
Which obviously ain't great, but could be worse...
Or maybe the best way to save humanity is not to align AI but to develop a videogame that will be extremely addictive to AI, haha.
Not sure how much I believe this myself, but Jacob Cannell has an interesting take that social status isn't a "base drive" either, but is basically a proxy for "empowerment", influence over future states of the world. If that's true it's perhaps not so surprising that we're still well-aligned, since "empowerment" is in some sense always being selected for by reality.
I want to briefly note my disagreement: I think the genome specifically builds what might be called an innate status drive into the brain (stronger in some people than others), in addition to within-lifetime learning. See my discussions here and here, plus this comment thread, and hopefully better discussion in future posts.
A great counterpoint!
Yeah, I wrote some years ago about how status wasn't a special feature that humans attribute to each other for contingent social psychology reasons, but rather falls out very naturally as an instrumentally convergent resource.
Yeah, when I consider that, it does undercut the claim that evolution shaped us to optimize for status. It shaped us to want things, and also to find strategies to get them.
Seems like the main difference is that you're "counting up" with status and "counting down" with genetic fitness.
There's partial overlap between people's reproductive interests and their motivations, and you and others have emphasized places where there's a mismatch, but there are also (for example) plenty of people who plan their lives around having & raising kids.
There's partial overlap between status and people's motivations, and this post emphasizes places where they match up, but there are also (for example) plenty of people who put tons of effort into leveling up their videogame characters, or affiliating-at-a-distance with Taylor Swift or LeBron James, with minimal real-world benefit to themselves.
And it's easier to count up lots of things as status-related if you're using a vague concept of status which can encompass all sorts of status-related behaviors, including (e.g.) both status-seeking and status-affiliation. "Inclusive genetic fitness" is a nice precise concept so it can be clear when individuals fail to aim for it even when acting on adaptations that are directly involved in reproduction & raising offspring.
I think this was a badly written post, and it appropriately got a lot of pushback.
Let me briefly try again, clarifying what I was trying to communicate.
Evolution did not succeed at aligning humans to the sole outer objective function of inclusive genetic fitness.
There are multiple possible reasons why evolution didn't succeed, and presumably multiple stacked problems.
But one thing that I've sometimes heard claimed or implied is that evolution couldn't possibly have succeeded at instilling inclusive genetic fitness as a goal, because individual humans don't have inclusive genetic fitness as a concept.
Evolution could only have approximated that goal with a godshatter of adaptations to prefer various proxies for inclusive genetic fitness, where each proxy has to be close to the level of sensory evidence. E.g., evolution can shape humans to like the taste of sugar, or the feeling of orgasm, or to prefer sexy-looking people, or even to love their cousins (less than their brothers but more than their more distant relatives). But, it's claimed, evolution can't shape humans to desire their own inclusive genetic fitness directly, because it can't instill goals that aren't at the level of sensory evidence.
And so it's not surprising that the proxies would completely deviate from the "intended" target, as soon as conditions changed.
Before the 20th century, not a single human being had an explicit concept of "inclusive genetic fitness", the sole and absolute obsession of the blind idiot god. We have no instinctive revulsion of condoms or oral sex. Our brains, those supreme reproductive organs, don't perform a check for reproductive efficacy before granting us sexual pleasure.
Why not? Why aren't we consciously obsessed with inclusive genetic fitness? Why did the Evolution-of-Humans Fairy create brains that would invent condoms? "It would have been so easy," thinks the human, who can design new complex systems in an afternoon.
The Evolution Fairy, as we all know, is obsessed with inclusive genetic fitness. When she decides which genes to promote to universality, she doesn't seem to take into account anything except the number of copies a gene produces. (How strange!)
But since the maker of intelligence is thus obsessed, why not create intelligent agents - you can't call them humans - who would likewise care purely about inclusive genetic fitness? Such agents would have sex only as a means of reproduction, and wouldn't bother with sex that involved birth control. They could eat food out of an explicitly reasoned belief that food was necessary to reproduce, not because they liked the taste, and so they wouldn't eat candy if it became detrimental to survival or reproduction. Post-menopausal women would babysit grandchildren until they became sick enough to be a net drain on resources, and would then commit suicide.
The claim, then, is that evolution can't produce an inclusive genetic fitness maximizer at all, not just that it happened not to.
However, this story is undercut by an example in which evolution was able to make an abstract concept (not just a bunch of sensory correlates of that concept in the ancestral environment) an optimization target that the human will apply its full creative intelligence to achieving.
Social status seems like one such example.
It's an abstract concept that many humans have as an actual long-term optimization target (they'll implement plans over years to increase their prestige; they don't just have a myopic status-grabbing heuristic).
And humans seem to have a desire for social status itself, or at least not just for a collection of sensory-evidence-level proxy measures that correlated with it in the ancestral environment, and which break down entirely when the environment changes.
(If you doubt this, compare status-seeking behavior to male sexual preferences. In the latter case, it looks much more like evolution did instill a bunch of specific desires for close-to-sensory-level features that were proxies for fertility and health: big breasts, long legs, unwrinkled skin. Heterosexual men find those features desirable, and finding out that a particular sexy woman is actually infertile doesn't change the desirability.
But in the case of status-seeking, I can't write a list of near-sensory-level features that people desire, independently of actual social prestige. The markers of status are enormously varied, by culture and subculture, and constantly changing. I bet that Steve Byrnes can point out a bunch of specific sensory evidence that the brain uses to construct the status concept (stuff like gaze length of conspecifics or something?), but the human motivation system isn't just optimizing for those physical proxy measures, or people wouldn't be motivated to get prestige on internet forums where people have reputations but never see each other's faces.)
This is suggestive that at least in some circumstances, evolution actually can shape an organism to have at least a specific abstract concept as a long term optimization target, and recruit the organism's own intelligence to identify how that concept applies in many varied environments.
This is not to say that evolution succeeded at aligning humans. It didn't. This also doesn't imply that alignment is easy. Maybe it is, maybe it isn't, but this argument doesn't establish that.
But it is to say that the specific story for why evolution failed at aligning humans to inclusive genetic fitness, which I believed circa 2020, is incorrect, or at least incomplete.
I bet that Steve Byrnes can point out a bunch of specific sensory evidence that the brain uses to construct the status concept (stuff like gaze length of conspecifics or something?), but the human motivation system isn't just optimizing for those physical proxy measures, or people wouldn't be motivated to get prestige on internet forums where people have reputations but never see each other's faces.
If it helps, my take is in Neuroscience of human social instincts: a sketch and its follow-up Social drives 2: “Approval Reward”, from norm-enforcement to status-seeking.
Sensory evidence is definitely involved, but kinda indirectly. As I wrote in the latter: “The central situation where Approval Reward fires in my brain, is a situation where someone else (especially one of my friends or idols) feels a positive or negative feeling as they think about and interact with me.” I think it has to start with in-person interactions with other humans (and associated sensory evidence), but then there’s “generalization upstream of reward signals” such that rewards also get triggered in semantically similar situations, e.g. online interactions. And it’s intimately related to the fact that there’s a semantic overlap between “I am happy” and “you are happy”, via both involving a “happy” concept. It’s a trick that works for certain social things but can’t be applied to arbitrary concepts like inclusive genetic fitness.
I stand by my nitpick in other comment that you’re not using the word “concept” quite right. Or, hmm, maybe we can distinguish (A) “concept” = a latent variable in a specific human brain’s world-model, versus (B) “concept” = some platonic Natural Abstraction™ or whatever, whether or not any human is actually tracking it. Maybe I was confused because you’re using the (B) sense but I (mis)read it as the (A) sense? In AI alignment, we care especially about getting a concept in the (A) sense to be explicitly desired because that’s likelier to generalize out-of-distribution, e.g. via out-of-the-box plans. (Arguably.) There are indeed situations where the desires bestowed by Approval Reward come apart from social status as normally understood (cf. this section, plus the possibility that we’ll all get addicted to sycophantic digital friends upon future technological changes), and I wonder whether the whole question of “is Approval Reward exactly creating social status desire, or something that overlaps it but comes apart out-of-distribution?” might be a bit ill-defined via “painting the target around the arrow” in how we think about what social status even means.
(This is a narrow reply, not taking a stand on your larger points, and I wrote it quickly, sorry for errors.)
I think it's true and valuable to say:
The top-voted objection is that this abstraction includes "human status" that commenters see as fake: internet forums, obscure hobbies, video games, and music fandom. I don't find this compelling; it's just pointing out that the drive is for "human status", not "real world status", "high class status", "rationalist status", or some other thing.
My best objection is that the order is reversed. Humans have genes that cause us to have various behaviors and seek various things, and then the natural abstraction of "human status" is something that we use to learn and describe what humans end up doing. If humans had ended up doing a slightly different thing, based on different genetics, and that resulted in different behaviors, then those behaviors would be what we called "human status". There is another natural abstraction concept of "generic status" that abstracts all status-like concepts in all animals on Earth, and humans don't target that. When I learned that grooming is a marker of status in some primates, that didn't cause me to spend more time seeking opportunities to be groomed.
It would be more accurate to say that the natural abstraction of "human status" co-evolved with the genetics of humans, rather than them happening in either order. We see this with the natural abstraction of "Claude" co-evolving with the weights that make up various Claude models. I gave this +4 in the review. It would have been extremely valuable to post this twenty years ago, but today it seems obvious that we can grow artificial intelligent agents that target natural abstractions in the environment, including the Anthropic trick of making and targeting a natural abstraction at the same time.
Seems like you're mushing together several loosely related things, including what we might call model-based motivation, explicit long-term planning, unified purpose, and precisely targeted goals.
Model-based motivation: being motivated to do something in a way that relies on your internal models of the world, not just on direct sensory rewards.
Explicit long-term planning: being aware of your goal, explicitly planning ways to achieve it, following those plans including over periods of months or years.
Unified purpose: a person's motivations and actions in a domain fitting together coherently to work towards a single purpose, even across contexts.
Precisely targeted goals: having the goal precisely match something that can be specified on other grounds besides what we can empirically observe that people aim for (like "inclusive genetic fitness" which is picked out by theory).
The godshatter post is mainly about the last two - people have a collection of fragmented motivations which helped towards the selected-for purpose in the contexts where we evolved. Your argument here is mainly about the first two.
I think that the first two are pretty common, and are found in human romantic/reproductive goals, e.g. long-term planning around having kids, or motivations to improve one's appearance in ways that you expect potential partners to find attractive. I think that the last two are pretty rare, including for status - most people have a collection of somewhat-status-related motivations (though perhaps a small fraction of people (sociopaths?) have status as a more unified goal), and I haven't seen anyone specify the "status" target well enough to even check if people's motivations aim at that precise target.
I bet that Steve Byrnes can point out a bunch of specific sensory evidence that the brain uses to construct the status concept (stuff like gaze length of conspecifics, or something?), but the human motivation system isn't just optimizing for those physical proxy measures, or people wouldn't be motivated to get prestige on internet forums where people have reputations but never see each other's faces.
Curious to see what Steven Byrnes would actually say here. I fed your comment and Byrnes' two posts on social status to Opus 4.5; it thought for 3m 40s (!) and ended up arguing he'd disagree with your social status example:
Byrnes explicitly argues the opposite position in §2.2.2 of the second post. He writes:
"I don't currently think there's an innate drive to 'mostly lead' per se. Rather, I think there's an innate drive that we might loosely describe as 'a drive to feel liked / admired'... and also an innate drive that we might loosely describe as 'a drive to feel feared'. These drives are just upstream of gaining an ability to 'mostly lead'."
And more pointedly:
"I'm avoiding a common thing that evolutionary psychologists do (e.g. Secret of Our Success by Henrich), which is to point to particular human behaviors and just say that they're evolved—for example, they might say there's an 'innate drive to be a leader', or 'innate drive to be dominant', or 'innate drive to imitate successful people', and so on. I think those are basically all 'at the wrong level' to be neuroscientifically plausible."
So Byrnes is explicitly rejecting the claim that evolution installed "status-seeking" as a goal at the level Eli describes.
Byrnes proposes a three-layer architecture: first, there are primitive innate drives—"feel liked/admired" and "feel feared"—which are still at the feeling level, not the abstract-concept level. Second, there's a very general learning mechanism (which he discusses extensively in his valence series, particularly §4.5–4.6) that figures out what actions and situations produce those feelings in the local environment. Third, there are some low-level sensory adaptations (like "an innate brainstem reflex to look at people's faces") that feed into the learning system. Status-seeking behavior emerges from this combination, but "status" itself isn't the installed goal.
Why does this matter? Eli presents something like a dichotomy: either (A) evolution can only do sensory-level proxies that break in novel contexts (like male preferences for big breasts, which don't update when you learn a woman is infertile), or (B) evolution can install abstract concepts as goals (like status). Byrnes' model offers a third option: evolution installs feeling-level drives plus a general learning mechanism. The learning mechanism explains why status-seeking transfers to internet forums—the primitive drive to "feel liked/admired" still triggers when you get upvotes, and the learning system figures out how to get more of that—without requiring that "status" itself be the installed goal. This third option actually supports the original claim Eli is arguing against. Evolution didn't need to install "status" as a concept; it installed feelings + learning, and the abstract behavior emerged.
(mods, let me know if this is slop and I'll take it down)
I disagree with “natural selection got the concept of "social status" into us” or that status-seeking behavior is tied to “having an intuitive "status" concept”.
For example, if Bob wants to be a movie star, then from the outside you and I can say that Bob is status-seeking, but it probably doesn’t feel like that to Bob; in fact Bob might not know what the word “status” means, and Bob might be totally oblivious to the existence of any connection between his desire to be a movie star and Alice’s desire to be a classical musician and Carol’s desire to eat at the cool kids table in middle school.
I think “status seeking” is a mish-mosh of a bunch of different things but I think an important one is very roughly “it’s intrinsically motivating to believe that other people like me”. (More discussion in §2.2.2 & §2.6.1 here and hopefully more in future posts.) I think it’s possible for the genome to build “it’s intrinsically motivating to believe that other people like me” into the brain whereas it would not be analogously possible for the genome to build “it’s intrinsically motivating to have a high inclusive genetic fitness” into the brain. There are many reasons that the latter is not realistic, not least of which is that inclusive genetic fitness is only observable in hindsight, after you’re dead.
For example, if Bob wants to be a movie star, then from the outside you and I can say that Bob is status-seeking, but it probably doesn’t feel like that to Bob; in fact Bob might not know what the word “status” means, and Bob might be totally oblivious to the existence of any connection between his desire to be a movie star and Alice’s desire to be a classical musician and Carol’s desire to eat at the cool kids table in middle school.
That seems true to me? I don't mean that humans become aligned with their explicit verbal concept of status. I mean that (many) humans are aligned with the intuitive concept that they somehow learn over the course of development.
I think it’s possible for the genome to build “it’s intrinsically motivating to believe that other people like me” into the brain whereas it would not be analogously possible for the genome to build “it’s intrinsically motivating to have a high inclusive genetic fitness” into the brain. There are many reasons that the latter is not realistic, not least of which is that inclusive genetic fitness is only observable in hindsight, after you’re dead.
Makes sense!
I don't mean that humans become aligned with their explicit verbal concept of status. I mean that (many) humans are aligned with the intuitive concept that they somehow learn over the course of development.
How do you know that there is any intuitive concept there? For example, if Bob wants to sit at the cool kid’s table at lunch and Bob dreams of being a movie star at dinner, who’s to say that there is a single concept in Bob’s brain, verbalized or not, active during both those events and tying them together? Why can’t it simply be the case that Bob feels motivated to do one thing, and then later on Bob feels motivated to do the other thing?
Well, there's convergent structure in the observed behavior. There's a target that seems pretty robust to a bunch of different kinds of perturbations and initial conditions.
It's possible that that's implanted by a kludge of a bunch of different narrow adaptations. That's the null hypothesis, even.
But the fact that (many) people will steer systematically towards opportunities of high prestige, even when what that looks like is extremely varied, seems to me like evidence for an implicit concept that's hooked up to some planning machinery, rather than (only) a collection of adaptations that tend to produce this kind of behavior?
I think you’re responding to something different than what I was saying.
Again, let’s say Bob wants to sit at the cool kid’s table at lunch, and Bob dreams of being a movie star at dinner. Bob feels motivated to do one thing, and then later on Bob feels motivated to do the other thing. Both are still clearly goal-directed behaviors: At lunchtime, Bob’s “planning machinery” is pointed towards “sitting at the cool kid’s table”, and at dinnertime, Bob’s “planning machinery” is pointed towards “being a movie star”. Neither of these things can be accomplished by unthinking habits and reactions, obviously.
I think there’s a deep-seated system in the brainstem (or hypothalamus). When Bob’s world-model (cortex) is imagining a future where he is sitting at the cool kid’s table, then this brainstem system flags that future as “desirable”. Then later on, when Bob’s world-model (cortex) is imagining a future where he is a movie star, then this brainstem system flags that future as “desirable”. But from the perspective of Bob’s world-model / cortex / conscious awareness (both verbalized and not), there does not have to be any concept that makes a connection between “sit at the cool kid’s table” and “be a movie star”. Right?
By analogy, if Caveman Oog feels motivated to eat meat sometimes, and to eat vegetables other times, then it might or might not be the case that Oog has a single concept akin to the English word “eating” that encompasses both eating-meat and eating-vegetables. Maybe in his culture, those are thought of as two totally different activities—the way we think of eating versus dancing. It’s not like there’s no overlap between eating and dancing—your heart is beating in both cases, it’s usually-but-not-always a group activity in both cases, it alleviates boredom in both cases—but there isn’t any concept in English unifying them. Likewise, if you asked Oog about eating-meat versus eating-vegetables, he would say “huh, never thought about that, but yeah sure, I guess they do have some things in common, like both involve putting stuff into one’s mouth and moving the jaw”. I’m not saying that this Oog thought experiment is likely, but it’s possible, right? And that illustrates the fact that coherently-and-systematically-planning-to-eat does not rely on having a concept of “eating”, whether verbalized or not.
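To make the picture above concrete, here is a minimal toy sketch (purely illustrative: the features, weights, and names are invented for this example, and it's a caricature rather than a claim about the actual neuroscience). A valence function flags imagined futures based on feeling-level signals, and the planning machinery just picks the highest-flagged future; no unified "status" concept appears anywhere in the system.

```python
# Toy sketch (illustrative only, not an actual model of the brain): a planner whose
# "brainstem" valence function scores imagined futures by feeling-level social
# signals, with no unified "status" variable anywhere in the system.

from dataclasses import dataclass

@dataclass
class ImaginedFuture:
    description: str
    # Crude stand-ins for what the world-model predicts about the imagined situation:
    others_feel_positive_about_me: float  # 0..1, the "feel liked/admired" signal
    others_fear_me: float                 # 0..1, the "feel feared" signal
    physical_comfort: float               # 0..1, a non-social drive

def brainstem_valence(future: ImaginedFuture) -> float:
    """Flags an imagined future as more or less 'desirable'.
    Note: there is no 'status' concept here, only feeling-level signals;
    the weights are made up for illustration."""
    return (2.0 * future.others_feel_positive_about_me
            + 1.0 * future.others_fear_me
            + 0.5 * future.physical_comfort)

def choose_plan(candidates):
    """The 'planning machinery': pick whichever imagined future gets the highest
    valence flag. It never needs a concept tying the chosen futures together."""
    return max(candidates, key=brainstem_valence)

lunch = choose_plan([
    ImaginedFuture("sit at the cool kids' table", 0.8, 0.1, 0.4),
    ImaginedFuture("eat alone in the library", 0.1, 0.0, 0.6),
])
dinner = choose_plan([
    ImaginedFuture("work toward being a movie star", 0.9, 0.2, 0.3),
    ImaginedFuture("watch TV", 0.2, 0.0, 0.7),
])
# From the outside, Bob looks like a coherent status-seeker; from the inside,
# two unrelated imagined futures each just got flagged as desirable.
print(lunch.description, "|", dinner.description)
```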
This seems like good news about alignment.
To me it sounds like alignment will do a good job of aligning AIs to money. Which might be ok in the short run, but bad in the longer run.
It seems like Evolution did not "try" to have humans aligned to status. It might have been a proxy for inclusive genetic fitness, but if so, I would not say that evolution "succeeded" at aligning humans. My guess is it's not a great proxy for inclusive genetic fitness in the modern environment (my guess is it's weakly correlated with reproductive success, but clearly not as strongly as the relative importance that humans assign to it would indicate if it was a good proxy for inclusive genetic fitness).
Of course, my guess is that after the fact, for any system that has undergone some level of self-reflection and was put under selection that causes it to want coherent things, you will be able to identify some patterns in its goals. The difficult part in aligning AIs is being able to choose what those patterns are, not being able to cohere some patterns at the end of it. My guess is that with any AI system, if we were to survive and got to observe it as it made its way to coherence, we would be able to find some robust patterns in its goals (my guess is in the case of LLMs something related to predicting text, but who knows), but that doesn't give me much solace about the AI treating me well, or sharing my goals.
A super relevant point. If we try to align our AIs with something, and they end up robustly aligned with some other proxy thing, we definitely didn't succeed.
But, it's still impressive to me that evolution hooked up general planning capabilities to a (learned) abstract concept, at all.
Like, there's this abstract concept, which varies a lot in its particulars from environment to environment, and which the brain has to learn to detect apart from those particulars. Somehow the genome is able to construct the brain such that the motivation circuitry can pick out that abstract concept, after it is learned (or as it is being learned), and use it as a major criterion of the planning and decision machinery. And the end result is that the organism as a whole ends up not that far from an [abstract concept]-maximizer.
This is a lot more than I might expect evolution to be able to pull off, if I thought that our motivations were a hodge-podge of adaptations that cohere (as much as they do) into godshatter.
My point is NOT that evolution killed it, so alignment is easy. My point is that evolution got a lot further than I would have guessed was possible.
Why do you highlight status among a bazillion other things that generalized too, like romantic love, curiosity, altruism?
Those are motivations but they don't (mostly) have the type signature of "goals" but rather the type signature of "drives".
I pursue interesting stuff because I'm curious. That doesn't require me to even have a concept of curiosity—it could in principle be steering me without my awareness. My planning process might be driven by curiosity, but it isn't aligned with curiosity, in the sense that we don't (usually) make plans that maximize our curiosity. We just do what's interesting.
In contrast, social status is a concept that humans learn, and it does look like the planning process is aligned with the status concept, in that (some) humans habitually make plans that are relatively well described as status maximizing.
Or, another way of saying it: our status motivations are not straightforward adaptation-execution. They recruit general intelligence in service of this concept, in much the way that we would want an AGI to be aligned with a concept like the Good or corrigibility.
Romantic love, again, is something people act on (including using their general intelligence), but their planning process is not in general aligned with maximizing romantic love. (Indeed, I'm editorializing human nature here, but it looks to me like romantic love is mostly a strategy for achieving other goals).
Altruism - it's debatable whether most instances of maximizing altruistic impact are better described as status maximization. Regardless, altruism is an overriding strategic goal, recruiting general intelligence, for only a very small fraction of humans.
Note that this doesn't undermine the post, because its thesis only gets stronger if we assume that more alignment attempts like romantic love or altruism generalized, because that could well imply that control or alignment is actually really easy to generalize, even when the intelligence of the aligner is way less than the alignee.
This suggests that scalable oversight is either a non-problem, or a problem only at ridiculous levels of disparity, and suggests that alignment does generalize quite far.
This, as well as my belief that current alignment designers have far more tools in their alignment toolkit than evolution had, makes me extremely optimistic that alignment is likely to be solved before dangerous AI.
“[optimization process] did kind of shockingly well aligning humans to [a random goal that the optimization process wasn’t aiming for (and that’s not reproducible with a higher bandwidth optimization such as gradient descent over a neural network’s parameters)]”
Nope, if your optimization process is able to crystallize some goals into an agent, it’s not some surprising success, unless you picked these goals. If an agent starts to want paperclips in a coherent way and then every training step makes it even better at wanting and pursuing paperclips, your training process isn’t “surprisingly successful” at aligning the agent with making paperclips.
This makes me way less confident about the standard "evolution failed at alignment" story.
If people become more optimistic, because they see some goals in an agent, and say the optimization process was able to successfully optimize for that, but they don’t have evidence of the optimization process having tried to target the goals they observe, they’re just clearly doing something wrong.
Evolutionary physiology is a thing! It is simply invalid to say “[a physiological property of humans that is the result of evolution] existing in humans now is a surprising success of evolution at aligning humans”.
Status is a way to have power. Aligning an agent to be power-maximizing is qualitatively different from what we want from AI which we want to align to care about our own ends.
If the agent had no power whatsoever to affect the world then it wouldn't matter if it cared or not.
So the real desire is that it have a sufficient amount of power, but not more than some threshold that would prove too frightening.
Who gets to decide this threshold?
An AGI can kill you even if it's not beyond what you consider to be "too frightening".
The grading isn't on a scale.
The threshold still has to be greater than zero power for its ‘care’ to matter one way or the other. And the risk that you mention needs to be accepted as part of the package, so to speak.
So who gets to decide where to place it above zero?
I don't think that everybody has the built-in drive to seek "high social status", as defined by the culture they are born into or any specific aspect of it that can be made to seem attractive. I know people who just think it's an annoying waste of time. Or, like myself, who spent half my life chasing it, then found inner empowerment, came to see the proxy of high status as a waste of time, and quit chasing it.
Maybe related: I do think we all generally tend to seek "signalling", and in some cases spend great energy doing it. I admit I sometimes do, but it's not signalling high status, it's just signalling chill and contentedness. I have observed some kind of signalling in pretty much every adult I have witnessed, though it's hard to say for sure; it's more my assumption of their deepest motivation. The drive isn't always strong for some people, or it's just very temporary. There are likely much stronger drivers (e.g. avoiding obvious suffering). Signalling perhaps helps us attract others who align with us and form "tribes", so it can be worth the energy.
It seems that a huge part of "human behaviour is explained by status seeking" is just post hoc proclaiming that whatever humans do is status seeking.
Suppose you want to predict whether a given man will go hang out with friends or work more on a project. How does the idea of status seeking help? When we already know that the human chose friends, we say, yes, of course: he gets more status around his friend group by spending more time with them, improving their bonds, and having good friends is a marker of status in its own right. Likewise, when we know that the man chose work, we can say that this is behaviour that leads towards promotion and more money and influence inside the company, which is a marker of high status. But when we want to predict beforehand... I don't think it really helps.
The concept of status helps us predict that any given person is likely to do one of the relatively few things that are likely to increase their status, and not one of the many more things that are neutral or likely to decrease status, even if it can't by itself tell us exactly which status-raising thing they would do. Seems plenty useful to me.
Maybe our culture fits our status-seeking surprisingly well because our culture was designed around it.
We design institutions to channel and utilize our status-seeking instincts. We put people in status conscious groups like schools, platoons, or companies. There we have ceremonies and titles that draw our attention to status.
And this works! Ask yourself, is it more effective to educate a child individually or in a group of peers? The latter. Is it easier to lead a solitary soldier or a whole squad? The latter. Do people seek a promotion or a pay rise? Both, probably. The fact is, that people are easier to guide when in large groups, and easier to motivate with status symbols.
From this perspective, our culture and inclination for seeking status have developed in tandem, making it challenging to determine which influences the other more. However, it appears that culture progresses more rapidly than genes, suggesting that culture conforms to our genes, rather than the reverse.
Another perspective: sometimes our status seeking is nonfunctional and therefore nonaligned. We waste a lot of effort on status, which seems like a nonfunctional drive. People will compete for high-status professions like musician, streamer, or celebrity, and most will fail, which makes it seem like an unwise investment of time. This seems misaligned, as it's not adaptive.
How are you telling the difference between "evolution aligned humans to this thing that generalized really well across the distributional shift of technological civilization" vs. "evolution aligned humans to this thing, which then was distorted / replaced / cut down / added to by the distributional shift of technological civilization"?
Eye-balling it? I'm hoping commenters will help me distinguish between these cases, hence my second footnote.
Agree. This connects to why I think that the standard argument for evolutionary misalignment is wrong: it's meaningless to say that evolution has failed to align humans with inclusive fitness, because fitness is not any one constant thing. Rather, what evolution can do is to align humans with drives that in specific circumstances promote fitness. And if we look at how well the drives we've actually been given generalize, we find that they have largely continued to generalize quite well, implying that while there's likely to still be a left turn, it may very well be much milder than is commonly implied.
So humans are "aligned" if humans have any kind of values? That's not how alignment is usually used.
Less wrong obsesses about status to an incredibly unhealthy degree.
edit: removed sarcastic meme format.
Is that intended to mean “lesswrong people are obsessed with their own and each other’s status”, or “lesswrong people are obsessed with the phenomenon of human status-seeking”? (or something else?)
The former, by nature of a distorted view on the latter. I don't think status is a single variable, and I think when you split it up into its more natural components - friendship, caring, trust, etc - it has a much more human ring to it than "status" and "value" do, which strike me as ruthless sociopathic businesspeople perspectives. It is true that status is a moderately predictive oversimplification, but I claim that that is because it is oversimplifying components that are correlated in the circumstances where status appears to work predictively. Command hierarchy is itself a bug to fix, anyhow. Differences in levels of friendship, caring, trust, respect, etc should not cause people to form a deference tree; healthy social networks are far more peer-to-peer than ones that form around communities obsessed with the concept of "status".
I was asking because I published 14,000 words on the phenomenon of human status-seeking last week. :-P I agree that there have been many oversimplified accounts of how status works. I hope mine is not one of them. I agree that “status is not a single variable” and that “deference tree” accounts are misleading. (I think the popular lesswrong / Robin Hanson view is that status is two variables rather than one, but I think that’s still oversimplified.)
I don’t think the way that “lesswrong community members” actually relate to each other is “ruthless sociopathic businesspeople … command hierarchy … deference tree” kind of stuff. I mean, there’s more-than-zero of that, but not much, and I think less of it in lesswrong than in most groups that I’ve experienced—I’m thinking of places I’ve worked, college clubs, friend groups, etc. Hmm, oh here’s an exception, “the group of frequent Wikipedia physics article editors from 2005-2018” was noticeably better than lesswrong on that axis, I think. I imagine that different people have different experiences of the “lesswrong community” though. Maybe I have subconsciously learned to engage with some parts of the community more than others.
[This post is a slightly edited tangent from my dialogue with John Wentworth here. I think the point is sufficiently interesting and important that I wanted to make it a top-level post, and not leave it buried in that dialogue, which is mostly on another topic.]
The conventional story is that natural selection failed extremely badly at aligning humans. One fact about humans that casts doubt on this story is that natural selection got the concept of "social status" into us, and it seems to have done a shockingly good job of aligning (many) humans to that concept.
Evolution somehow gave humans some kind of inductive bias (or something) such that our brains are reliably able to learn what it is to be "high status", even though the concrete markers for status are as varied as human cultures.
And further, it successfully hooked up the motivation and planning systems to that "status" concept. Modern humans not only take actions that play for status in their local social environment, they sometimes successfully navigate (multi-decade) career trajectories and life paths, completely foreign to the ancestral environment, in order to become prestigious by the standards of the local culture.
And this is one of the major drivers of human behavior! As Robin Hanson argues, a huge portion of our activity is motivated by status-seeking and status-affiliation.
This is really impressive to me. It seems like natural selection didn't do so hot at aligning humans to inclusive genetic fitness. But it did shockingly well at aligning humans to the goal of seeking, even maximizing, status, all things considered.[1]
This seems like good news about alignment. The common story that condoms prove that evolution basically failed at alignment—that as soon as we developed the technological capability to route around evolution's "goal" of maximizing the frequency of our alleles in the next generation, and attain only the proxy measure of sex, we did exactly that—doesn't seem to apply to our status drive.
It looks to me like "status" generalized really well across the distributional shift of technological civilization. Humans still recognize it and optimize for it, regardless of whether the status markers are money or technical acumen or h-factor or military success.[2]
This makes me way less confident about the standard "evolution failed at alignment" story.
I guess that we can infer from this that having an intuitive "status" concept was much more strongly instrumental for attaining high inclusive genetic fitness in the ancestral environment than having an intuitive concept of "inclusive genetic fitness" itself. A human-level status-seeking agent with a sex drive does better, by the standard of inclusive genetic fitness, than a human-level IGF maximizer.
The other hypothesis, of course, is that the "status" concept was easier to encode in a human than the "inclusive genetic fitness" concept, for some reason.
I'm interested if others think that this is an illusion, that it only looks like the status target generalized because I'm drawing the target around where the arrow landed. That is, what we think of as "social status" is exactly the parts of social status in the ancestral environment that did generalize across cultures.