This is a special post for quick takes by interstice.

The brain seems to have components that are like big neural nets: giant opaque blobs of compute optimized for some reward function. It also seems to have both long- and short-term memory systems, which mostly just store information for the neural-net-like systems to manipulate, similar to a hard drive and RAM. If near-term AGI is like this, there are two types of mesa-optimizer that can arise: optimizers arising somewhere inside the big neural net, or optimizers that arise from an algorithm carried out using the memory systems. The prefrontal cortex may be an example of the former in humans. The implementation of explicit rules to improve decision making, such as EU maximization or Bayesianism, is an example of the latter (h/t the ELK report).

It recently occurred to me that humans' apparent tendency to seek status could emerge without any optimization for status, conscious or subconscious, being built into the brain at all. Instead, it could be an emergent consequence of our tendency to preferentially attend to and imitate certain people over others. According to The Secret of Our Success, such imitation can extend down to very low-level patterns of behavior, such as what foods we enjoy eating. So you could imagine people's behavior and personalities being determined by a sort of 'attentional Darwinism': patterns of behavior that tend to get paid attention to and imitated will become common in the population, while those that do not will dwindle. The end result is that an average person's personality will look approximately like an imitation-optimizer (aka a status-seeker), just like an average organism will look approximately like a fitness optimizer. This would make humans doubly mesa-optimizers, both of status-evolution and gene-evolution. This suggests that extracting a CEV of all humanity might be hard, since many of our terminal values could be local to our particular culture's status-evolution.
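
To make the 'attentional Darwinism' picture a bit more concrete, here is a toy simulation. It is only a sketch under strong assumptions that are mine, not from any source: each behavior pattern is summarized by a single made-up number (its "imitability", i.e. how much attention it attracts), agents copy models with probability proportional to that number, and copying is slightly noisy. The function name and all parameters are hypothetical. The point is just that no agent in the model is trying to gain status, yet the population drifts toward highly imitable behavior.

```python
import random


def simulate(num_agents=200, generations=30, copy_noise=0.02, seed=0):
    """Return the population's mean imitability at each generation.

    Toy model (assumption): a behavior pattern is reduced to one number in
    (0, 1], its tendency to be attended to and imitated.
    """
    rng = random.Random(seed)
    # Start with behaviors spread uniformly over the imitability range.
    behaviors = [rng.random() for _ in range(num_agents)]
    history = [sum(behaviors) / num_agents]

    for _ in range(generations):
        # Each agent picks someone to imitate, weighted by how much attention
        # that person's behavior attracts. Nobody optimizes for status here.
        models = rng.choices(behaviors, weights=behaviors, k=num_agents)
        # Imitation is imperfect: add small noise, then clamp so every weight
        # stays strictly positive and at most 1.
        behaviors = [
            min(1.0, max(0.001, b + rng.gauss(0, copy_noise))) for b in models
        ]
        history.append(sum(behaviors) / num_agents)

    return history


history = simulate()
print(f"mean imitability: start={history[0]:.2f}, end={history[-1]:.2f}")
```

In this setup the mean imitability should rise over the generations, so the typical agent ends up looking like an imitation-optimizer even though selection acted only on which behaviors got copied. Real imitation is obviously not one-dimensional, so this illustrates the selection argument and says nothing about actual human psychology.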