Follows from:

Map and Territory

Babble and Prune

Warning: I strongly recommend not using the concepts in this sequence to try to build a generalized artificial intelligence. These concepts describe how humans function as conscious entities, and humans are not known for being safe or friendly to human values. Human limitations currently provide the only check on human abuse of power, and building something with human cognitive abilities but without those limitations would be ill-advised.

Introduction

This series of articles is about applied metacognition, laying the groundwork for developing the skills to approach and effectively handle any type of problem or situation.

The concepts in this particular article may already be familiar. I’m presenting them here because we will use them in later articles to derive the different types of skills and explain how those skills work and how they fit into our toolbox.

The Map and the Territory

A previous article, No Really, Why Aren’t Rationalists Winning?, established that skills are highly compressed procedural information. In our sequence premiere, The Foundational Toolbox for Life: Introduction, we looked at them from a different angle: skills start as paradigms which filter out information. We develop our paradigms into skills by calibrating them with experience to produce useful answers to problems in a practical time frame.

Now we will look at the map/territory distinction and how we use it to define the basic building blocks of cognition itself. Once we’ve done that, we can finally move on to what we can build with those blocks.

Every skillset, from science to art to athletics to management, requires an explicit or implicit mental model of certain aspects of the world: a map. Every person has at least one map in their brain, which represents the territory that is the real world, or at least the part they deal with. The map lets a person predict the outcomes of their actions, and thereby allows them to effectively navigate the territory and change it in pursuit of their desires. Without the map, there would be no way for a person to predict which options lead to desirable outcomes.

Even primitive life forms have evolved rudimentary maps. Their instincts represent the effects of their potential responses to specific stimuli on the probability that they will survive and reproduce. The correlations encoded in these instincts are a narrow, low-resolution map of the organisms’ native environments.

However, instinct maps are updated by the processes of mutation and natural selection—in other words, chance and death. Each individual is stuck with an instinct map that either succeeds or fails fatally. Humans generally want to improve their models of reality in less lethal ways, so they use their more sophisticated neural hardware to learn about the world and update their maps on the individual and cultural levels rather than on the species genome level.

Order and Chaos

The map’s relationship with the territory creates a fundamental dichotomy that helps define every tool in our toolbox: the duality of order and chaos.

“Order” represents the degree to which the map accurately reflects the territory. This accuracy is measured by how well the map makes predictions. In short, order is what we say we “know”. When we speak of requirements and limits, what must happen or what cannot happen, we are speaking of order.

Additionally, “order” can refer to how easily knowledge and information can be compressed, or how much information we can derive from a small sample size. Patterns across time or space are called “orderly” because knowing only a fraction of the pattern can enable us to predict the rest. For all territories of a given size, the more orderly ones require fewer bits of information to describe them with a map. For example, a bilaterally symmetrical object allows you to predict what is on one side if you have already seen the other side, so you can describe it in full by showing only one side and defining the plane of symmetry. The map of a particularly orderly territory might translate to a few sample data points and a relatively simple rule.
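The link between order and compressibility can be illustrated with a small sketch. This is only an illustration, not part of the original argument: it assumes that a general-purpose compressor (here Python's zlib) is a fair stand-in for "the map", and the two sample territories are invented for the example.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Number of bytes zlib needs to describe `data` at maximum compression."""
    return len(zlib.compress(data, 9))


# An "orderly" territory: a short pattern repeated many times (10,000 bytes).
orderly = b"ABCD" * 2500

# A "chaotic" territory of the same size: bytes from the OS random source.
chaotic = os.urandom(10_000)

orderly_size = compressed_size(orderly)
chaotic_size = compressed_size(chaotic)

print("orderly territory:", len(orderly), "bytes ->", orderly_size, "bytes")
print("chaotic territory:", len(chaotic), "bytes ->", chaotic_size, "bytes")
```

The repeated pattern shrinks to a small fraction of its size because a sample plus a simple rule ("repeat this") describes all of it, while the random bytes barely shrink at all.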

By contrast, “chaos” represents the omissions and errors in the map, the degree to which the map fails to accurately represent the territory. Chaos is the “unknown”. When we speak of possibilities and uncertainties, of what may or may not happen, we are speaking of chaos.

The unknowns of chaos include both unknown unknowns (pure chaos) and known unknowns (chaos bounded by order). Pure chaos manifests as outside context problems or black swan events, like being invaded by a continent you never suspected existed. However, much of the chaos that adult humans experience is bounded by order. Although they don’t know exactly what will happen, they feel fairly certain it will fall within a range of “normal” events. The roll of a die provides a more specific example of bounded chaos, since we know every possible face value even if we don’t know which one it will be. A trusted probability distribution also imposes some certainty on unpredictable outcomes, at least with large sample sizes. Even if individual measurements may vary, we know roughly what the data on the group as a whole will look like.
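As a rough sketch of bounded chaos, the simulation below rolls a fair six-sided die many times. The sample size and the seed are arbitrary choices made for this illustration. No single roll can be predicted, but every roll stays within the known bounds, and the average of a large sample lands close to the expected value of 3.5.

```python
import random

# Seeded generator so the run is repeatable; the seed value is arbitrary.
rng = random.Random(0)

# Each individual roll is unpredictable (chaos)...
rolls = [rng.randint(1, 6) for _ in range(60_000)]

# ...but every roll falls inside the known set of faces (order).
in_bounds = all(1 <= roll <= 6 for roll in rolls)

# With a large sample, the group as a whole is predictable too.
mean_roll = sum(rolls) / len(rolls)

print("first ten rolls:", rolls[:10])
print("all rolls between 1 and 6:", in_bounds)
print("mean of all rolls:", round(mean_roll, 3))
```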

Whenever something you thought you knew turns out to be false or incomplete, that is chaos as well. The truest knowledge of the territory is limited to our scattered data points of direct experience, and we create our maps to interpolate and explain those data points as best we know how. Whenever we get a new data point that falsifies the map we were using, when we try to predict the territory and fail, it is another manifestation of chaos.

Moreover, chaos can refer to how difficult it is to compress information, or to figure out the details of a situation from limited data. A situation described as “chaotic” is difficult to predict because the information you have about it cannot be used to derive the information you want. For example, in a messy room, the knowledge of one sock's location does not allow you to locate its pair.

Although (or because) they are opposite concepts, order and chaos are more or less inextricable. The known and unknown are present to varying degrees in almost every situation you encounter, because they are essential to conscious existence as we know it. We often say that perfectly certain knowledge of the territory is impossible, but we don’t classify everything as completely unknown, either. Instead of a binary label of “known” or “unknown,” we have gradients of certainty that inform how much of our resources and safety we are willing to bet on various unknowns.

As a technical explanation of these concepts, “chaos” simply describes a relatively smooth and somewhat even distribution of probability mass across a range of hypotheses, where no hypothesis in the range is considered overwhelmingly more likely than another. “Order” describes a sharper, uneven distribution where probability mass is concentrated into a relatively small number of hypotheses. If (as usual) you have a subset of hypotheses that are overwhelmingly more likely than all others but roughly equal in probability with each other, that’s bounded uncertainty: chaos bounded by order (or chaos inherent in order, depending on which one you want to imply is dominant).
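One way to put a number on how "smooth" or "sharp" a distribution is would be Shannon entropy. That choice of measure is an assumption made for this sketch, not something the article commits to, and the three example distributions are invented. Higher entropy corresponds to chaos, lower entropy to order, and bounded uncertainty falls in between.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability hypotheses contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Chaos: probability mass spread evenly over eight hypotheses.
chaos = [1 / 8] * 8

# Order: almost all of the mass concentrated on one hypothesis.
order = [0.93] + [0.01] * 7

# Bounded uncertainty: three roughly equal front-runners, the rest negligible.
bounded = [0.32, 0.32, 0.32] + [0.008] * 5

for name, dist in [("chaos", chaos), ("bounded", bounded), ("order", order)]:
    print(name, round(entropy_bits(dist), 3), "bits")
```

The even distribution over eight hypotheses comes out at exactly 3 bits, the maximum possible for eight options, and the other two fall below it in the order described above.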

You may have gathered that order and chaos are also subjective in their application. A territory cannot be intrinsically “orderly” or “chaotic” without reference to a given map (or compression algorithm). A situation will appear more or less chaotic to you in proportion to your ignorance of it. After all, confusion (or lack thereof) is in the map.

Order and chaos are also implicitly based on what information people consider important. The die roll mentioned above is “bounded by order” because it has a finite number of defined results. However, the number of defined results is so low only because we don’t pay any attention to the location at which the die comes to rest, or the direction it faces, or the amount of time it takes to stop rolling…

The fact that these definitions of order and chaos are relative rather than objective is to be expected, because all the concepts in our toolbox are based on solving problems. Problems can only be defined in terms of a person’s desires, what sorts of obstacles stand in the way of those desires, and what the person can do to overcome those obstacles.

The next section deals with how our minds process order and chaos. Understanding how we deal with the shape of what we know and what we don’t know (and what we don’t know we don’t know) is vital for describing how our skills work.

Guessing and Checking

At the most fundamental level of mental activity that is still complex enough to be recognized as mental, we find two processes. These processes explore chaos and order, respectively, so that the mind can develop and refine its map. I call these processes “guessing and checking”. Elsewhere on this site, they are known as “babble and prune”. As far as I can tell, these phrases refer to the same pair of concepts.

Guessing is more or less free association: it links our current experiences and thoughts with any concepts that are remotely similar, and calls our attention to those concepts. To guess is to throw one’s map up against the territory in various ways (without judging the results; that is where checking comes in). Guessing is the mind wrangling chaos. It follows possibilities based on an initial idea and makes them concrete in the mind. It allows the mind to model (and therefore address) the unknown territory by giving shapes to the potential that lurks within.

Checking is the process by which we judge whether a concept is relevant to the current situation. It evaluates how we are applying the map to the territory, and the predictions we make from it, by comparing them to other observations of the territory or to memories that our guessing has summoned up. Based on this evaluation, the check accepts or rejects the accuracy (predictive utility) of the concept in the given situation. Checking is the mind wrangling order, because it decides what information gets to become and remain part of the map. It allows the mind to produce and curate knowledge by judging how well the map of the known matches the territory (or other parts of the map) in the way that guessing has applied it.

To illustrate how these cognitive processes work, we can look at what happens when each of them is shut off.

All guessing and no checking would be like an incoherent dream, or (people tell me) the effects of some recreational drugs: a parade of random impressions. It would consist of complete free association, but nothing for assessing correspondence with reality and filtering out what doesn't fit.

Inversely, all checking and no guessing would allow one to apply a single concept, but one would have no ability to update the paradigm of how to apply it. For instance, an entity with checking but no guessing might be able to classify organisms as cats or dogs, but it wouldn't be able to realize that some organisms are neither (unless it already had a label for that).

From these examples, it is clear that both guessing and checking are necessary aspects of any skill, because both are necessary to generate and calibrate our maps.

As a more technical explanation of guessing and checking, guessing iterates through locations in hypothesis space. However, hypothesis space—the space of possible maps—is theoretically infinite, with infinite dimensions, and is therefore non-ordered (that is, it doesn’t have a linear sequence). To make iteration through unlimited possibilities computationally tractable, our brains use free association. The brain keys off of immediate sensory inputs or thoughts to find possibly related concepts, then keys off of those concepts to find more distantly related concepts, and so on. Through this process, guessing takes us to the most salient-seeming locations in hypothesis space. Each location visited, correct or not, is added to a map of possibilities as a candidate for representing the territory.
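The free-association process described above can be sketched as a breadth-first walk over a graph of associations. Everything here is a toy assumption: the `ASSOCIATIONS` table, the `guess` function, and the "wet sidewalk" cue are hypothetical, and nothing suggests the brain literally stores or searches concepts this way. The point is only that keying off nearby concepts first makes an unbounded space searchable.

```python
from collections import deque

# Hypothetical association table: each concept lists the concepts it evokes.
ASSOCIATIONS = {
    "wet sidewalk": ["rain", "sprinkler"],
    "rain": ["clouds", "umbrella"],
    "sprinkler": ["lawn"],
    "clouds": [],
    "umbrella": [],
    "lawn": ["street cleaner"],
    "street cleaner": [],
}


def guess(cue, max_guesses):
    """Return up to `max_guesses` concepts, nearest associations first."""
    seen = {cue}
    queue = deque([cue])
    candidates = []
    while queue and len(candidates) < max_guesses:
        concept = queue.popleft()
        for neighbor in ASSOCIATIONS.get(concept, []):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            queue.append(neighbor)
            # Every location visited becomes a candidate, correct or not.
            candidates.append(neighbor)
            if len(candidates) == max_guesses:
                break
    return candidates


print(guess("wet sidewalk", 2))
print(guess("wet sidewalk", 10))
```

With a small budget the walk only reaches the closest associations. A more distant candidate such as "street cleaner" appears only when guessing is allowed to go further.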

As we guess each hypothesis in turn, checking will accept or reject the hypothesis with varying degrees of confidence by pumping probability mass into or out of it, redistributing the probability mass across the range of hypothesis space we are exploring. It performs this redistribution by updating our map of possibilities, revising the degrees of certainty across the board based on its evaluation of each option.

Naturally, if we examine all the major hypotheses and decide they're still equally likely, then our pumping has canceled itself out. If our checking, based on our prior probabilities, has decided they all match quite well, then we’re still undecided. If checking decides the hypotheses all match equally poorly, it may be that something improbable happened, or that our guessing didn’t go far enough to come up with a more likely hypothesis, or that our check rejected something in error because we were ignorant of a factor making it more probable.
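The "pumping" of probability mass can be sketched as a Bayesian update. This is one possible formalization, assumed for illustration; the hypotheses, the priors, and the likelihood numbers are all invented, and the `check` function is a hypothetical name. Each hypothesis is reweighted by how well it predicted an observation, and the weights are then renormalized so the total mass stays at 1.

```python
def check(prior, likelihood):
    """Reweight each hypothesis by how well it predicted the observation."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0:
        # Every hypothesis was ruled out: guessing has to go further.
        raise ValueError("no hypothesis is compatible with the observation")
    return {h: weight / total for h, weight in unnormalized.items()}


# Invented priors for why the sidewalk is wet.
prior = {"rain": 0.5, "sprinkler": 0.3, "street cleaner": 0.2}

# Invented values of P(the lawn is dry | hypothesis).
likelihood_dry_lawn = {"rain": 0.05, "sprinkler": 0.10, "street cleaner": 0.90}

posterior = check(prior, likelihood_dry_lawn)
for hypothesis, probability in posterior.items():
    print(hypothesis, round(probability, 3))

# If every hypothesis fits the observation equally well, nothing moves.
unchanged = check(prior, {h: 0.4 for h in prior})
```

In the first update, mass flows out of "rain" and "sprinkler" and into "street cleaner". In the second, the equal likelihoods cancel out and the distribution is the same as the prior, which is the undecided case described above.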

Distinct and Subliminal

There’s one other dichotomy that we need to finish laying the groundwork for the basic skills: distinct versus subliminal.

When a guessing or checking activity takes place in our brains, it can do so in two modes.

First, it can run distinctly, where there is an explicit record (a memory) of the iteration process and we are fully aware of what possibilities or implications we are considering. We refer to distinct processes as taking place in our System 2, which might also be called the “manual” system.

Second, guessing or checking can run subliminally (“below the threshold”) where there is no explicit record of the process, and we are only left aware of the end result, if even that. We frequently form beliefs and decisions based on subliminal processes without realizing we’ve done so, to our detriment or benefit. We say these processes are part of System 1, often called the “automatic” system.

The modes in which our guessing and checking processes run determine what sort of maps we use and how we update them. The maps we use define the types of skills that we employ, what aspects of situations they deal with, and their advantages and disadvantages. We’ll go into more detail on these maps and what happens when cognitive processes run in various modes in the next part of the sequence.

Conclusion

All skills involve both guessing and checking in order to update our maps. What differentiates one skill from another is how each process runs: distinctly or subliminally. These modes of guessing and checking inform what sorts of features a skill’s map contains and therefore what aspects of the territory its map represents. The dichotomy of order and chaos, describing the degree to which a map corresponds with a territory, is a core concept for distinguishing the different tools in our toolbox.

In the next article I’ll introduce the basic mindsets and explain what sorts of maps they use and how they are defined by how they guess and check.

4 comments

I'm confused, you seem to describe a very elaborate model of cognition, yet I can find no literature review, no testable predictions and no experimental results to ground this model in something observable. What is this model based on?

The first part seems to mostly be building off of Eliezer's technical explanation of a technical explanation. And it does successfully link to it. 

The second part seems to build on Babble and Prune, which was a previous sequence here, but also makes a very large number of predictions in a relatively straightforward fashion. Many of those predictions are also covered in the Babble and Prune sequence, though the basic idea of "part of human cognition works by generating hypotheses through some stochastic process, and then filtering out the bad ones" is very widespread in cognitive psychology.

As a random example, take this paper: http://gershmanlab.webfactional.com/pubs/Dasgupta17.pdf

It says: 

In general, probabilistic inference is comprised of two steps: hypothesis generation and hypothesis evaluation, with feedback between these two processes.

This assumption is widely shared in really broad swaths of cognitive science.

I confess, your comment surprised me by calling for a different epistemic standard than I figured this article required. I had to unpack and address several issues, listed below.

  1. I can make a bibliography from the links I’ve already included, if it would help.
  2. Are there any specific assertions in this article that you think call for more evidence to support them over the alternatives?
  3. This article is meant to build the foundation for explaining the concepts that we'll be working with in the next article. After that article, we'll mostly be using those concepts instead. Those will be supported by your own observations of how people learn different skills with varying degrees of difficulty.
  4. I didn't know how much of the theory I was building on would be taken as a given in this community, so I decided to just post and see what wasn't already part of the general LW paradigm. I’d like to hear from more people before I make any judgment calls.
  5. These ideas at this point in the sequence are not intended to make new predictions that would require the introduction of new evidence. They are intended to help the reader more clearly and efficiently conceptualize the information they already have. This article asserts that some ideas are conceptually distinct from each other and others aren’t, which is not an empirical issue. The technical terms I introduce in the article are a condensation and consolidation of existing ideas, so that people can more easily process and apply new information. I predict that as I continue to explain the paradigms I’ve developed, they will be consistent with each other and with empirical evidence, and that the reader will develop a more elegant perspective which will allow them to apply their knowledge more effectively. It may be that I need to make that more clear in future articles.
  6. In order to think effectively, there are many concepts we can and must learn and apply without relying on the scientific establishment to do experiments for us.

Does that all make sense? I'll work on framing future articles so that it's clear when they are making empirical predictions from evidence and when they are presenting a concept as being better than other concepts at carving reality at its joints.

This post came across to me as mostly speculative but trying to be academic, though I may well be wrong. Habryka in the other comment suggested that your claims have some grounding that I was not aware of. Additionally, I do not subscribe to the local lore of Eliezer's contrarianism and extreme Bayesianism. The metaphor of "reality joints", or "reality fluid", falls flat for me, as well. If your perspective is different, then feel free to disregard my comment; it's not like you and I can square our epistemic views in a comment thread.