Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Let’s split out two different types of optimization. The first type includes things like Google Maps’ pathfinder, a human making a plan, or Amazon’s logistical algorithms. The second type includes things like a thermostat, a heat-seeking missile, or water flowing downhill.

The distinction I want to point to here is where the things-we’re-optimizing-over live. For the first type, optimization is over things-in-a-model; a search algorithm runs on a map. For the second type, optimization is over things-in-the-world; a search algorithm runs on the territory directly. Search-in-map vs search-in-territory.

Same Algorithm, Different Inputs

A toy example: suppose we have a big pile of rocks, and we want to find the rock whose mass is closest to the mass of a reference weight (without going over).

A search-in-territory algorithm might use one of those old-school balance-scales to compare masses pairwise. We could pull each rock out of the pile one-by-one, and:

  • First, compare the rock to the reference weight. If it’s heavier, throw it away and move on to the next rock.
  • Second, compare it to the best rock found thus far, and replace the best rock with this one if it’s heavier.

At the end, we’ll have chosen the best rock.

A balance-scale. (Source)

A search-in-map algorithm might instead start by weighing the reference weight and each rock on a scale, and entering all the masses into a spreadsheet. But then, it could proceed exactly like the search-in-territory algorithm. It would run through the list of rock-masses one-by-one, and:

  • First, compare the rock-mass to the reference mass. If the rock-mass is larger, move on to the next one.
  • Second, compare the rock-mass to the best rock-mass found thus far, and replace the best rock-mass with this one if it’s larger.

At the end, there is one extra step: we have to go back out into the world and find the actual rock with the best mass. Note that this could be nontrivial, if we didn’t label or organize our rocks in the process of weighing them.

Key point of this example: these two types of optimization need not involve different algorithms. In practice they often do - things we typically think of as “search algorithms” tend to be used more for search-in-map, and things we typically think of as “controllers” tend to be used more for search-in-territory. But in principle, one can often apply the algorithms opposite their usual context.

When Should Search-In-Map Be Favored?

I see two main features of search-in-map which aren’t (typically) shared by search-in-territory:

  • The map-making process can use information before the search process “knows what to do with it”. For instance, in the rock-pile example, we could weigh all the rocks before we gain access to the reference rock, or even before we know what the task is at all.
  • The map itself can use a convenient representation or data structure - e.g. indexes, caching, sorted lists, etc.

These features can sometimes be replicated to some extent in the territory - e.g. we could weigh all the physical rocks and store them in sorted order before gaining access to the reference weight. But this only works when we have lots of control over the physical system, and can arrange it how we please. A map can pretty much always be arranged how we please.

Key point: if we can use information to build a map before we have full information about the optimization/search task, that means we can build one map and use it for many different tasks. We can weigh all the rocks, put that info in a spreadsheet, then use the spreadsheet for many different problems: finding the rock closest in weight to a reference, finding the heaviest/lightest rock, picking out rocks which together weigh some specified amount, etc. The map is a capital investment.


A more typical example: creating a streetmap is expensive. If we just want to figure out one shortest path, then directly measuring physical distances along a bunch of paths might be faster. But once the streetmap is created, it can be used repeatedly by lots of different people for lots of different pathfinding problems.

When Should Search-In-Territory Be Favored?

The key feature of search-in-territory is that it does not require making a map - and therefore requires no extra information or assumptions beyond what the algorithm itself uses.

In the rock-pile example, the search-in-territory uses only pairwise comparisons between weights - i.e. a balance scale. We never actually know the mass of any particular rock. The search-in-map, on the other hand, relies on direct measurements of mass. To perform the search-in-map with only a balance scale, we’d either need to compare all pairs of weights ahead of time (which would mean  effort), or we’d need to run out and compare physical weights in the middle of the search (at which point we’re effectively back to search-in-territory).

Key point: if we don’t rely on extra information, then we don’t rely on extra assumptions; there are no extra sources of error. In the rock-pile search-in-map example, error could come from the scale becoming uncalibrated as we proceed through the rocks, or from insufficient precision in the measured masses, or from data corruption in the spreadsheet, or from failing to connect the search result to the right physical rock at the end. We have to assume that we correctly understand each of those pieces. For the search-in-territory version, we only need to assume that the balance scale works correctly.

A more typical example: many feedback controllers work well even when our model of the system’s dynamics isn’t quite correct.

How Does This Compare To Selection Vs Control?

Abram’s Selection vs Control offers a similar distinction between two kinds of optimizers. “Selectors”, in his breakdown, directly instantiate and compare objects in the search space. There’s an explicit space of “options” or “possibilities” over which the search operates. This includes all of the search-in-map examples from earlier, but also biological evolution.

“Controllers” don’t necessarily have that explicit search-space - they tend to operate directly on the external world. “Other possibilities” for a controller would mean other counterfactual worlds, which aren’t necessarily unambiguously defined. This includes all of the search-in-territory examples from earlier.

I claim that search-in-territory vs search-in-map is usually the right distinction to think about when considering the intuitive selection vs control clusters. The main reason is that counterfactuals always live in a model. An objectively defined “search space” exists only in a model.

There are cases where certain search spaces/counterfactuals seem particularly natural. For instance, in the case of biological evolution, it seems natural to consider the space of DNA sequences. But remember that evolution does not “actually search” this space - it only samples a relatively-small chunk of the exponentially large space of sequences.

Conversely, there are cases where we think of a search-in-territory as having some search space. For instance, an optimal controller might minimize some expected “cost” under some model of the system under control, but without explicitly representing that map or optimizing over it.

Why Does This Matter?

As with the selection-vs-control distinction, it intuitively feels like search-in-map is “more powerful” than search-in-territory; it looks less like a thermostat and more like an agent. But if they both do optimization (i.e. they both steer certain parts of the universe into a smaller set of states, which is equivalent to expected utility maximization in a God's-eye model), then what’s the difference?

The discussion above suggests one possible answer: maps generalize. This connects to the (Improved) Good Regulator Theorem: a system “should” use an efficient internal map when it “doesn’t know what game it’s playing” until later. In that case, we need to keep around useful information, but we still want to compress and precompute where possible.

Then the interesting question is: how do we recognize a map embedded in the territory, or a search algorithm running on such a map?

New to LessWrong?

New Comment
14 comments, sorted by Click to highlight new comments since: Today at 10:47 AM

Note: I think that this is a better written-version of what I was discussing when I revisited selection versus control, here: https://www.lesswrong.com/posts/BEMvcaeixt3uEqyBk/what-does-optimization-mean-again-optimizing-and-goodhart (The other posts in that series seem relevant.)

I didn't think about the structure that search-in territory / model-based optimization allows, but in those posts I mention that most optimization iterates back and forth between search-in-model and search-in-territory, and that a key feature which I think you're ignoring here is cost of samples / iteration. 

Recently I've also been thinking about something that seems vaguely related, which could perhaps be called inference in the map vs inference in the territory.

Suppose you want to know how some composite system works. This might be a rigid body object made up of molecules, a medicine made out of chemicals to treat a disease that is ultimately built out of chemicals, a social organisation method designed for systems made out of people, or anything like that.

In that case there are two ways you can proceed: either think about the individual components of the system and deduce from their behavior how the system will behave, or just build the system in reality and observe how the aggregate behaves.

If you do the former, you can apply already-known theory about the components to deduce it's behavior without needing to test it in reality. Though often in practice this theory won't be known, or will be too expensive to use, or similar. So in practice one generally has to investigate it holistically. But this requires using the territory as a map to figure it out.

(When investigating it holistically there is also the possibility of just using holistic rather than reductionistic theories. Often this holistic theory will originate from one of the previous methods though, e.g. our math for rigid body dynamics comes from actual experience with rigid bodies. Though also sometimes it might come from other places, e.g. evolutionary reasoning. So my dichotomy isn't quite as clean as yours, probably.)

I'm not convinced there's an actual distinction to be made here.

Using your mass comparison example, arguably the only meaningful different between the two is where information is stored. In search-in-map it's stored in an auxiliary system; in search-in-territory it's embedded in the system. The same information is still there, though, all that's changed is the mechanism, and I'm not sure map and territory is the right way to talk about this since both are embedded/embodied in actual systems.

My guess is that search-in-map looks like a thing apart from search-in-territory because of perceived dualism. You give the example of counterfactuals being in the map rather than the territory, but the map is itself still in the territory (as I'm sure you know), so there's no clear sense in which counterfactuals and the models that enable them are not physical processes. Yes, we can apply an abstraction to temporarily ignore the physical process, which is maybe what you mean to get at, but it's still a physical process all the same.

It seems to me maybe the interesting thing is whether you can talk about a search algorithm in terms of particular kinds of abstractions rather than anything else, which if you go far enough around comes back to your position, but with more explained.

It seems to me maybe the interesting thing is whether you can talk about a search algorithm in terms of particular kinds of abstractions rather than anything else, which if you go far enough around comes back to your position, but with more explained.

+1

This is a very interesting distinction. Notably, I feel that you point better at a distinction between "search inside" and "search outside" which I waved at in my review of Abram's post. Compared with selection vs control, this split also has the advantage that there is no recursive calls of one to the other: a controller can do selection inside, but you can't do search-in-territory by doing search-in-map (if I understand you correctly).

That being said, I feel you haven't yet deconfused optimization completely because you don't give a less confused explanation of what "search" means. You point out that typically search-in-map looks more like "search/optimization algorithms" and search-in-territory looks more like "controllers", which is basically redirecting to selection vs control. Yet I think this is where a big part of the confusion lies, because both look like search while being notoriously hard to reconcile. And I don't think you can rely on let's say Alex Flint's definition of optimization, because you focus more on the internal algorithm than he does.

Key point: if we can use information to build a map before we have full information about the optimization/search task, that means we can build one map and use it for many different tasks. We can weigh all the rocks, put that info in a spreadsheet, then use the spreadsheet for many different problems: finding the rock closest in weight to a reference, finding the heaviest/lightest rock, picking out rocks which together weigh some specified amount, etc. The map is a capital investment.

One part you don't address here is the choice of what to put in the map. In your rock example, maybe the actual task will be about finding the most beautiful rock (for some formalized notion of beautiful) which is completely uncorrelated with weight. Or one of the many different questions that you can't answer if your map only contains the weights. So in a sense, search-in-map requires you to know the sort of info you'll need, and what you can safely throw away.

 

On the thermostat example, I actually have an interesting aside from Dennett. He writes that the thermostat is an intentional system, but that the difference with humans, or even with a super advanced thermostat, is that the standard thermostat has a very abstract goal. It basically have two states and try to be in one instead of the other, by doing its only action. One consequence is that you can plug the thermostat into another room, or to control the level of water in a tub or the speed of a car, and it will do so.

From this perspective, the thermostat is not so much doing search-in-territory than search-in-map with a very abstracted map that throw basically everything.

It seems to me like search in territory (SIT) and search in map (SIM) are matters of degree, not kind. So they can potentially be quantified. They also have to do with transduction from one form of information to another.

For example, with the SIT example, you’re transducing information from scale balance and rock position into and out of brain states. With the SIM example, you transducer information from your brain, into a pre-designed spreadsheet, then from scale balance and rock position into your brain, into a spreadsheet, and then back to rock position.

It doesn’t seem like there’s a hard distinction between the two from that perspective? Not sure.

I'm not sure if these are examples of the thing you're talking about or something else, but:

Consider a missile that's guided by GPS until it reaches its rough target location, then uses sensors to locate the target precisely. (Though arguably this is simply "SIM followed by SIT".)

Or consider when I do something similar myself. I use the map on my phone screen to guide me to roughly where I want to be, and then I use my eyes to guide me to exactly where I want to be. And I don't just switch from SIM to SIT; I keep checking with both, in case e.g. I miss it and go too far.

Those are nice examples/test cases!

Here's what I think is the right way to understand what's going on in the phone case. Let's say you're looking for an ice cream stand in a park.

Your brain takes input from the phone and your eyeballs. It synthesizes them, along with memories and other sense data, into a prediction about where you should walk and what you should look at in order to find the ice cream stand. Based on that mental synthesis, it sends outputs to your body, causing you to walk/read/look around.

In this conception, there's ultimately only "search in map," where the map is in your brain. "Search in territory" is just a fancy label we give to a certain subset of sense impressions that aren't focusing on what we conventionally call a "map."

I think that John is interested here in this distinction from a more practical, engineering perspective. When is it efficient for some instrumental goal to create or consult what we'd conventionally call a "map?" Here, the important thing seems to be the distinction between accumulating and storing information in a legible format, versus acquiring data anew each time.

I'm just pointing out that ultimately, there has to be some abstract synthesis of signals. The idea of transducing signals from one form into another might be more helpful for understanding this side of things. Here, the important thing is tracing the transduction of information from one processing mechanism to another.

To me, these seem importantly different, so I'm advocating that they be split apart rather than lumped together.

To perform the search-in-map with only a balance scale, we’d either need to compare all pairs of weights ahead of time (which would mean  effort), or we’d need to run out and compare physical weights in the middle of the search (at which point we’re effectively back to search-in-territory).

nit: (I think you maybe meant this but glitched while writing) in this particular example we could do better by indexing (in map) the rocks by weight order ( map-building comparisons). Then once we have the reference rock we can effectively blend our map with in-territory search for only in-territory comparisons. It's more costly overall (by a log factor) to build this map, but if we have map-building budget in advance it yields much faster solving (log instead of linear). Or if the reference rock was one of the original rocks (we just didn't know which one), as long as our index has constant-time access we can do search in-map once the appropriate reference rock is pointed out.

I think this just corroborates your claim

The map-making process can use information before the search process “knows what to do with it”.

I think this raises an interesting further question, especially when we don't know what the task will be ahead of time: how many (and what? and at what resolution?) indices should we ideally spend 'prep' time (and memory) on? (This was a professional concern of mine for several years as a software engineer haha)

Echoes of your gooder regulator theorem

A very basic yet, to my mind, novel and profound distinction. Thank you, John!

Interesting.

Technicality:

A toy example: suppose we have a big pile of rocks, and we want to find the rock whose mass is closest to the mass of a reference weight (without going over).

A search-in-territory algorithm might use one of those old-school balance-scales to compare masses pairwise. We could pull each rock out of the pile one-by-one, and:

First, compare the rock to the reference weight. If it’s heavier, throw it away and move on to the next rock.

Second, compare it to the best rock found thus far, and replace the best rock with this one if it’s heavier.

At the end, we’ll have chosen the best rock.

Wrong, we have the closest rock iff the closest rock is lighter than the weight.

We're want the rock that's closest but not higher, and that's what we get. We don't necessarily get the closest rock, but we do get the best one, which is what John said.

Ups, missed that. Thanks.

It doesn't change the point being made, but:

To perform the search-in-map with only a balance scale, we’d either need to compare all pairs of weights ahead of time (which would mean O(n^2) effort)

So long as "is heavier than" is a transitive relationship (so that finding A>B and B>C lets you know that A>C without having to actually weigh them against each other), you would only need O(n log n) pairwise comparisons to put your rocks into sorted order.