[Epistemic Status: Early layers and the general framework seem firm (and substantially Dennett-inspired); later parts are more speculative, but I've found them to be useful thinking tools.]
Each of these is a directed search, or optimization process, with all but the first running in a framework created by a larger, less directed, search. This pattern, where an evolution-like search builds another evolution-like search, seems ubiquitous when studying intelligence.
Within each layer of process, there is a Generator and a Test, which create and filter ways of responding to stimuli, respectively. The generator creates a population of possible outputs to sample from randomly (though weighted by prior search processes), while the test checks the generated output against some criteria to see whether to use it (and weight it higher in the population for future generations).
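A minimal sketch of one such layer may make the Generator/Test split concrete. Everything here is an illustrative assumption rather than part of the framework itself: the function name `generate_and_test`, the additive `boost`, and the toy candidates are all hypothetical. The generator samples from a weighted population, and the test decides whether the sampled candidate gets weighted higher for future rounds.

```python
import random


def generate_and_test(population, weights, test, rounds, rng, boost=1.0):
    """Run a toy generate-and-test loop and return the updated weights.

    `weights` stands in for the lessons of prior search processes;
    `boost` is an assumed, arbitrary reinforcement amount.
    """
    weights = list(weights)
    for _ in range(rounds):
        # Generator: draw one candidate at random, biased by current weights.
        i = rng.choices(range(len(population)), weights=weights, k=1)[0]
        # Test: check the candidate against the criterion; if it passes,
        # make it more likely to be generated in future rounds.
        if test(population[i]):
            weights[i] += boost
    return weights


rng = random.Random(0)
candidates = ["a", "b", "c", "d"]          # hypothetical possible outputs
start = [1.0, 1.0, 1.0, 1.0]               # no prior lessons: uniform weights
final = generate_and_test(candidates, start, lambda c: c == "c", 200, rng)
```

In this toy run only `"c"` passes the test, so its weight grows while the others stay at their starting value, and later rounds sample it more and more often. That is the sense in which a search gets faster as lessons accumulate.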
If you take one thing away from this post, make it:
All directed evolutions were created by the fruits of earlier, usually less directed, evolutions. Aside from the lessons about how to explore picked up by prior searches, they must all operate via chance, because at the foundations chance is the only thing available to build with. As lessons are accumulated and better evolutions are built, the searches become faster and more efficient.
These processes run purely on biological evolution: they generate via genetic mutation, and test by natural selection. Without the higher layers, they are highly restricted in their ability to handle novel situations, since to perform well their ancestors must have been exposed to enough similar scenarios for selection to prune away the genetic variants unable to deal with the situation. The advantage of these processes is that they require no learning during an individual's lifetime: it's all there from the start.
The plant from the intro, which summons parasitic wasps to defend itself from caterpillars, is a result of this category of optimization: there is no reasonable way for trial and error during the organism's life to pick out which volatiles to release, so it just executes the program given to it by evolution's work.
To become more general, a Darwinian process can build a learning system, shaped by a reward. A learning system functions like an internal version of natural selection acting on behaviours, rather than genomes. This gives much greater flexibility, since now the organism can remix behaviours which have proven themselves useful during its lifetime, and work with much greater bandwidth (since genetic inheritance maxes out at far less data than total sensory input).
Here the generator is the pool of affordances (formed of both genetic predispositions and previously learned behaviour), and the test is whether the reward centre likes what happens when it does the action. The reward function is grounded in a Darwinian process, but may be updated by experiences.
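As a rough sketch of this layer, assuming a deliberately crude update rule: the affordance names, the `reward` table, and the learning `rate` below are hypothetical stand-ins, not claims about how real reward learning works. The generator is a pool of affordances with inherited preferences, and the test is a reward signal that shifts those preferences within one lifetime.

```python
import random


def learn(affordances, prior, reward, trials, rng, rate=0.5):
    """Toy within-lifetime learner: returns updated action preferences.

    `prior` plays the role of genetic predispositions; `reward` plays
    the role of the reward centre. Both are illustrative assumptions.
    """
    prefs = dict(zip(affordances, prior))
    for _ in range(trials):
        # Generator: pick an action from the affordance pool, weighted by
        # current preferences (inherited plus previously learned).
        weights = [prefs[a] for a in affordances]
        action = rng.choices(affordances, weights=weights, k=1)[0]
        # Test: does the reward centre like what happened?
        r = reward(action)
        # Keep preferences positive so every affordance stays samplable.
        prefs[action] = max(0.01, prefs[action] + rate * r)
    return prefs


rng = random.Random(1)
actions = ["eat_berry", "eat_rock", "sit"]                      # hypothetical
payoff = {"eat_berry": 1.0, "eat_rock": -1.0, "sit": 0.0}       # hypothetical
prefs = learn(actions, [1.0, 1.0, 1.0], payoff.get, 300, rng)
```

Here the rewarded behaviour is reinforced and the punished one is suppressed without any change to the "genome" (the prior), which is the flexibility this layer buys.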
Now we're getting to the exciting part: Learning not just from your own experiences, but from the combined and filtered experiences of all the systems which you can observe! (and the systems they observed, and so on)
Here, as before, the new behaviours are generated from affordances (some coming from imitated behaviour). But, crucially, they are expressed by creating a physical incarnation of the meme (e.g. a word, artefact, or ritual) in a way which allows other minds to observe and mimic it, and they are tested by their ability to spread from mind to mind in the population. The winners of memetic competition, such as those around which plants to pick passed on to the child in the intro, take up residence in much of the population.
Since these replicators are selected purely on their ability to spread, without necessarily providing value for their hosts, there is room for evolutionary arms races where genes attempt to filter out memes which are harmful to genetic fitness and consistently adopt those which are beneficial. One obvious heuristic is to preferentially copy those who are high-status or successful. Memes in turn adapt to this changing landscape of preferences and defences.
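The "copy the high-status" heuristic can be sketched as a toy simulation. This is an assumption-laden illustration: the `spread` function, the status numbers, and the meme labels are hypothetical, and the model ignores any intrinsic catchiness of the memes. Each round, every agent picks a model to imitate with probability proportional to that model's status, then adopts the model's meme.

```python
import random


def spread(memes, status, rounds, rng):
    """Toy prestige-biased copying: returns each agent's meme after `rounds`.

    `memes[i]` is the meme currently hosted by agent i; `status[i]` is an
    assumed fixed, positive status score for agent i.
    """
    memes = list(memes)
    n = len(memes)
    for _ in range(rounds):
        # Every agent chooses whom to imitate, biased toward high status.
        models = rng.choices(range(n), weights=status, k=n)
        # Agents adopt their model's current meme (all update at once).
        memes = [memes[m] for m in models]
    return memes


rng = random.Random(2)
initial = ["pick_red", "pick_blue", "pick_blue", "pick_blue", "pick_blue"]
status = [5.0, 1.0, 1.0, 1.0, 1.0]   # agent 0 is high-status (hypothetical)
after = spread(initial, status, 10, rng)
```

No new memes are created in this sketch; it only shows how a copying bias reshapes which existing memes occupy the population, regardless of whether they benefit their hosts.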
With the ability to develop quickly (thanks to running on a live learning system), a much more powerful search than any individual could support (thanks to being able to learn indirectly from a broad and deep pool of experiences), and a reflectively adapting fitness landscape (since the memes themselves are a major aspect of their selective environment), memetics can support all sorts of fun processes which the previous layers could only dream of!
Crucially, some memes are about other memes, containing representations of parts of the memetic ecosystem which can be immediately used by other cognitive processes in novel ways. Some of the features which arise on this layer are:
Together, the power of these optimization processes seems to explain a large portion of humanity's current place in the world.
We're already building new generate-and-test systems on computational hardware. The clearest current example is Generative Adversarial Networks, which have an aptly named generator network and a test implemented as the discriminator network. Many other forms of machine learning can be viewed through this lens.
It looks like at some point we will create a system which is capable of recursive self-improvement, increasing in capabilities up to the point where it is self-sustaining and has the power to kick away the ladder of optimization processes which brought it into existence. Hopefully we will design one which decides to keep us around.
It's hard to reliably speculate about what this system will look like in detail, but perhaps it will have a generator of candidate self-modifications or possible new versions, and a set of learned heuristics and reasoned proofs about which to explore further or implement, as its test function.
Mesa-optimization corresponds to the risk of accidentally creating an extra layer of generate and test which is not perfectly aligned with the base process. You can view each of the transitions between the previous classes of optimization process as examples of this.
Right now, you are:
This set of concepts has given me a framework for thinking about a large topic without being constantly confused. In that respect, it has done for me what Tegmark's mathematical universe model did for my ability to think about reality, or what Predictive Processing did for my ability to think about how minds work. Except here, the domain is optimization processes which are built by other optimization processes.
An exercise to see whether you've absorbed the model: In the comments, try to come up with an example of as many classes of process as you can, with the generator and test specified.
And happy to answer questions :)
Thanks to sudonym from the Rob Miles Discord for helpful feedback!
though in practice the goal is often closer to "be the source of a virulent meme" than anything as prosocial as those examples
An aside as to why this may be: People who are hosts of memes which are highly optimized for creating and spreading memes (such as bloggers, musicians, or politicians) could be expected to have a disproportionate impact on the population of memes, and these hosts would tend to be spreading memes connected to the goal of spreading their memes alongside any object level content. One effect of this may be that an unexpectedly high proportion of people have adopted ideas useful for trying to be meme fountains.