Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Financial status: This is independent research, now supported by a grant. I welcome further financial support.

Epistemic status: This is in-progress thinking.


Friends, I know that it is difficult to accept, but it just does not seem tenable that knowledge consists of a correspondence between map and territory. It’s shocking, I know.

There are correspondences between things in the world, of course. Things in the world become entangled as they interact. The way one thing is configured -- say the arrangement of all the atoms comprising the planet Earth -- can affect the way that another thing is configured -- say the subatomic structure of a rock orbiting the Earth that is absorbing photons bouncing off the surface of the Earth -- in such a way that if you know the configuration of the second thing then you can deduce something about the configuration of the first. Which is to say that I am not calling into question the phenomenon of evidence, or the phenomenon of reasoning from evidence. But it just is not tenable that the defining feature of knowledge is a correspondence between map and territory, because most everything has a correspondence with the territory. A rock orbiting the Earth has a correspondence with the territory. A video camera recording a long video has a correspondence with the territory. The hair follicles on your head, being stores of huge amounts of quantum information, and being impacted by a barrage of photons that are themselves entangled with your environment, surely have a much more detailed correspondence with your environment than any mental model that you could ever enunciate, and yet these are not what we mean by knowledge. So although there is undoubtedly a correspondence between neurologically-encoded maps in your head and reality, it is not this correspondence that makes these maps interesting and useful and true, because such correspondences are common as pig tracks.

It’s a devastating conclusion, I know. Yet it seems completely unavoidable. We have founded much of our collective worldview on the notion of map/territory correspondence that can be improved over time, and yet when we look carefully it just does not seem that such a notion is viable at the level of physics.

Clearly there is such a thing as knowledge, and clearly it can be improved over time, and clearly there is a difference between knowing a thing and not knowing a thing, between having accurate beliefs and having inaccurate beliefs. But in terms of grounding this subjective experience out in objective reality, we find ourselves, apparently, totally adrift. The foundation that I assumed was there is not there, since this idea that we can account for the difference between accurate and inaccurate beliefs in terms of a correspondence between some map and some territory just does not check out.

Now I realize that this may feel a bit like a rug has been pulled out from under you. That’s how I feel. I was not expecting this investigation to go this way. But here we are.

And I know it may be tempting to grab hold of some alternate definition of knowledge that sidesteps the counter-examples that I’ve explored. And that is a good thing to do, but as you do, please go systematically through this puzzle, because if there is one thing that the history of the analysis of knowledge has shown it is that definitions of knowledge that seem compelling to their progenitors are a dime a dozen, and yet every single one so far proposed in the entire history of the analysis of knowledge has, so far as I can tell, fallen prey to further counter-examples. So please, be gentle with this one.

You may say that knowledge requires not just a correspondence between map and territory but also a capacity for prediction. But a textbook on its own is not capable of making predictions. You can sit in front of your chemistry textbook and ask it questions all day; it will not answer. Are you willing to say that a chemistry textbook contains no knowledge whatsoever?

You may then say that knowledge consists of a map together with a decoder, where the map has a correspondence with reality, and the decoder is responsible for reading the map and making predictions. But then if a superintelligence could look at an ordinary rock and derive from it an understanding of chemistry, is it really the case that any ordinary rock contains just as much knowledge as a chemistry textbook? That there really is nothing whatsoever to say about a chemistry textbook that distinguishes it from any other clump of matter from which an understanding of chemistry could in principle be derived?

Suppose that one day an alien artifact landed unexpectedly on Earth, and on this artifact was a theory of spaceship design that had been carefully crafted so as to be comprehensible by any intelligent species that might find it, perhaps by first introducing simple concepts via literal illustrations, followed by instructions based on these concepts for decoding a more finely printed section, followed by further concepts and instructions for decoding a yet-more-finely-printed section, followed eventually by the theory itself. Is there no sense in which this artifact is fundamentally different from a mere data recorder that has been travelling through the cosmos recording enough sensor data that a sufficiently intelligent mind could derive the same theory from it? What is it about the theory that distinguishes it from the data recorder? It is not that the former is in closer correspondence with reality than the latter. In fact the data recorder almost certainly corresponds in a much more fine-grained way to reality than the theory, since in addition to containing enough information to derive the theory, it also likely contains much information about specific stars and planets that the theory does not. And it is not that the theory can make predictions while the data recorder cannot: both are inert artifacts incapable of making any prediction on their own. And it is not that the theory can be used to make predictions while the data recorder cannot: a sufficiently intelligent agent could use the data recorder to make all the same predictions as it could using the theory.

Perhaps you say that knowledge is rightly defined relative to a particular recipient, so the instruction manual contains knowledge for us since we are intelligent enough to decode it, but the data recorder does not, since we are not intelligent enough to decode it. But firstly we probably are intelligent enough to decode the data recorder and use it to work out how to build spaceships given enough time, and secondly are you really saying that there is no such thing as objective knowledge? That there is no objective difference between a book containing a painstakingly accurate account of a particular battle, and another book of carelessly assembled just-so stories about the same battle?

Now you may say that knowledge is that which gives us the capacity to achieve our goals despite obstacles, and here I wholeheartedly agree. But this is not an answer to the question; it is a restatement of the question. What is it that gives us the capacity to achieve our goals despite obstacles? The thing we intuitively call knowledge seems to be a key ingredient, and in humans, knowledge seems to be some kind of organization and compression of evidence into a form that is useful for planning with respect to a variety of goals. And you might say, well, there just isn’t any more to say than that. Perhaps agents take in observations at one end and output actions at the other, and what happens in between follows no fundamental rhyme or reason; it is entirely a matter of what works. Well, Eliezer has written about a time when he believed this about AI, too, until seeing that probability theory constrains mind design space in a way that is not merely a set of engineering tricks that "just work". But probability theory does not concretely constrain mind design space. It is not generally feasible to take a physical device containing sensors and actuators and ask whether or to what extent its internal belief-formation or planning capacities are congruent with the laws of probability theory. Probability theory isn’t that kind of theory. At the level of engineering, it merely suggests certain designs. It is not the kind of theory that lets us take arbitrary minds and understand how they work, not in the way that the theory of electromagnetism allows us to take arbitrary circuits and understand how they work.

What we are seeking is a general understanding of the physical phenomenon of the collection and organization of evidence into a form that is conducive to planning. Most importantly, we are seeking a characterization of the patterns themselves that are produced by evidence-collecting, evidence-organizing entities, and are later used to exert flexible influence over the future. Could it really be that there is nothing general to say about such patterns? That knowledge itself is entirely a chimera? That it’s just a bunch of engineering hacks all the way down and there is no real sense in which we come to know things about the world, except as measured by our capacity to accomplish tasks? That there is no true art of epistemic rationality, only of instrumental rationality? That having true beliefs has no basis in physical reality?

I do not believe that the resolution to this question is a correspondence between internal and external states, because although there certainly are correspondences between internal and external states, such correspondences are far too common to account for what it means to have true beliefs, or to characterize the physical accumulation of knowledge.

But neither do I believe that there is nothing more to say about knowledge as a physical phenomenon.

It is a lot of fun to share this journey with you.

Comments

It seems worth distinguishing two propositions:

  1. "Knowledge is the existence of a correspondence between map and territory; the nature of this correspondence has no bearing on whether it constitutes knowledge."
  2. "Knowledge is the existence of a certain kind of correspondence between map and territory; the nature of this correspondence is important, and determines whether it constitutes knowledge, and how much, and what of."

Your observation that there are way too many possible correspondences suffices to refute #1. I am not convinced that you've offered much reason to reject #2, though of course #2 is of little use without some clarity as to what correspondences constitute knowledge and why. And I'm pretty sure that when people say things like "knowledge is map-territory correspondence" they mean something much more like #2 than like #1.

You've looked at some particular versions of #2 and found that they don't work perfectly. It may be the case that no one has found a concrete version of #2 that does work; I am not familiar enough with the literature to say. But if you're claiming that #2 is false (which "it just does not seem tenable that knowledge consists of a correspondence between map and territory" seems to me to imply), that seems to me to go too far.

#1 reminds me of a famous argument of Hilary Putnam's against computational functionalism (i.e., the theory that being a particular mind amounts to executing a particular sort of computation) -- literally any physical object has a (stupid) mapping onto literally any Turing machine. I don't think this argument is generally regarded as successful, though here too I'm not sure anyone has an actual concrete proposal for exactly what sort of correspondence between mental states and computations is good enough. In any case, the philosophical literature on this stuff might be relevant even though it isn't directly addressing your question.

Some thoughts on #2.

  • Hitherto, arguably the only instances of "knowledge" as such have been (in) human minds. It is possible that "knowledge" is a useful term when applied specifically to humans (and might in that case be defined in terms of whatever specific mechanisms of map/territory correspondence our brains use) but that asking "does X know Y?" or "is X accumulating knowledge about Y?" is not a well-defined question if we allow X to be a machine intelligence, an alien, an archangel, etc.
    • It might happen that, if dealing with some particular class of thing very unlike human beings, the most effective abstractions in this area don't include anything that quite corresponds to "knowledge". (I don't have in mind a particular way this would come about.)
  • It seems to me that the specific objections you've raised leave it entirely possible that some definition along the following lines -- which is merely a combination of the notions you've said knowledge isn't "just" -- could work well:
    • Given
      • an agent X
      • some part or aspect or possession Y of X
      • some part or aspect of the world Z,
    • we say that "X is accumulating knowledge of Z in Y" when the following things are true.
      • There is increasing mutual information (or something like mutual information; I'm not sure that mutual information specifically is the exact right notion here) between Y and Z.
      • In "most" situations, X's utility increases as that mutual information does. (That is: in a given situation S that X could be in, for any t let U(t) be the average of X's utility over possible futures of S in which, shortly afterward, the mutual information is t; then "on the whole" U(t) is an increasing function. "On the whole" means something like "as we allow S to vary, the probability that this is so is close to 1".)
  • To be clear, the above still leaves lots of important details unspecified, and I don't know how to specify them. (What exactly do we count as an agent? Not all agenty things exactly have utility functions; what do we mean by "X's utility"? Is it mutual information we want, or conditional entropy, or absolute mutual information, or what? What probability distributions are we using for these things? How do we cash out "on the whole"? What counts as "shortly afterward"? Etc.)
    • But I think these fuzzinesses correspond to genuine fuzziness in the concept of "knowledge". We don't have a single perfectly well defined notion of "knowledge", and I don't see any reason why we should expect that there is a single One True Notion out there. If any version of the above is workable, then probably many versions are, and probably many match about equally well with our intuitive idea of "knowledge" and provide about equal insight.
    • E.g., one of your counterexamples concerned a computer system that accumulates information (say, images taken by a camera) but doesn't do anything with that information. Suppose now that the computer system does do something with the images, but it's something rather simple-minded, and imagine gradually making it more sophisticated and making it use the data in a cleverer way. I suggest that as this happens, we should become more willing to describe the situation as involving "knowledge", to pretty much the same extent as we become more willing to think of the computer system as an agent with something like utilities that increase as it gathers data. But different people might disagree about whether to say, in a given case, "nah, there's no actual knowledge there" or not.
  • In one of your posts, you say something like "we mustn't allow ourselves to treat notions like agent or mind as ontologically basic". I agree, but I think it's perfectly OK to treat some such notions as prerequisites for a definition of "knowledge". You don't want that merely-information-accumulating system to count as accumulating "knowledge", I think precisely because it isn't agenty enough, or isn't conscious enough, or something. But if you demand that for something to count as an account of knowledge it needs to include an account of what it is to be an agent, or what it is to be conscious, then of course you are going to have trouble finding an acceptable account of knowledge; I don't think this is really a difficulty with the notion of knowledge as such.
  • It might turn out that what we want in general is a set of mutually-dependent definitions: we don't define "agent" and then define "knowledge" in terms of "agent", nor vice versa, but we say that a notion K of knowledge and a notion A of agency (and a notion ... of ..., and etc. etc.) are satisfactory if they fit together in the right sort of way. Of course I have no concrete proposal for what the right sort of way is, but it seems worth being aware that this sort of thing might happen. In that case we might be able to derive reasonable notions of knowledge, agency, etc., by starting with crude versions, "substituting them in", and iterating.
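
Written out in symbols, one version of the two conditions above looks like this (nothing here is canonical; the choice of information measure, time horizon, and distribution over situations is exactly the fuzziness acknowledged above):

```latex
% One possible formalization of the proposal above (illustrative notation only).
% Y_t, Z_t: the states of Y and Z at time t;  u_X: X's utility;
% delta > 0: a short time horizon;  S: a situation X could be in.
\begin{align*}
\text{(i)}\;\;  & I(Y_t ; Z_t) \text{ is increasing in } t, \\
\text{(ii)}\;\; & \text{for ``most'' } S:\quad
  U_S(m) := \mathbb{E}\big[\, u_X \mid S,\ I(Y_{t+\delta}; Z_{t+\delta}) = m \,\big]
  \text{ is increasing in } m, \\
                & \text{where ``most'' means } \Pr_S\big[\, U_S \text{ is increasing} \,\big] \approx 1 .
\end{align*}
```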

Reminds me of a discussion I once had about what is 'artificial'. After all, we find a lot of things in nature that are constructed by non-humans. We settled on whether something is based on a representation of a future state. The future part is what is rarely found in nature. Most evolved processes are responsive. Only with nervous systems do you get learning and anticipation of states, and then representation of that. I think it is the same here. Knowledge only counts if it is entangled with future states.

Gunnar - yes, I think this is true, but it's really surprisingly difficult to operationalize. Here is how I think this plays out:

Suppose that we are recording videos of some meerkats running around in a certain area. One might think that the raw video data is not very predictive of the future, but that if we used the video data to infer the position and velocity of each meerkat, then we could predict the future position of the meerkats, which would indicate an increase in knowledge compared to just storing the raw data. And I do think that this is what knowledge means, but if we try to operationalize this "predictive" quality in terms of a correspondence between the present configuration of our computer and the future configuration of the meerkats, then the raw data will actually have higher mutual information with future configurations than the position-and-velocity representation will.
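
One way to see this concretely is with a toy model. The discrete "meerkat" below is a made-up illustration (a position, a velocity, and a persistent fur pattern), not the actual setup described above; the point is only that the summary is a function of the raw data, so by the data processing inequality it can never have more mutual information with the future configuration, and here it has strictly less.

```python
import itertools
from collections import Counter
from math import log2

def mutual_information(pairs):
    """Exact mutual information in bits for a joint distribution given as a
    list of equally likely (x, y) outcomes."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in joint.items())

# Toy "meerkat": a position p, a velocity v, and a persistent fur pattern f.
# The raw video stores two consecutive frames (so velocity is recoverable) plus
# the fur pattern; the summary keeps only the inferred position and velocity.
raw_vs_future, summary_vs_future = [], []
for p, v, f in itertools.product(range(4), (-1, +1), (0, 1)):
    prev_frame = ((p - v) % 4, f)   # what the camera saw one step ago
    curr_frame = (p, f)             # what the camera sees now
    raw = (prev_frame, curr_frame)  # everything stored on disk
    summary = (p, v)                # the position-and-velocity representation
    future = ((p + v) % 4, f)       # the meerkat's configuration one step later
    raw_vs_future.append((raw, future))
    summary_vs_future.append((summary, future))

print("I(raw video ; future configuration) =", mutual_information(raw_vs_future), "bits")
print("I(summary   ; future configuration) =", mutual_information(summary_vs_future), "bits")
# The raw data comes out ahead (3 bits vs 2 bits): whatever makes the compressed
# representation more "knowledge-like", it is not that it corresponds more
# closely to the future configuration of the meerkats.
```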

It is difficult to formalize. But I think your example is off. I didn't mean that we could infer from the video the positions of the meerkats now. I meant that the representation encodes or correlates with the future positions (at least more than with the current positions). It is as if there were a video that showed the meerkats' future movements. That would be surprising to find in nature.

This post is a little too extreme.


But a textbook on its own is not capable of making predictions.

If a textbook says 'if you roll a die (see Figure 8) following this procedure (see Figure 9), it has a 1/6 chance of coming up a 6, 5, 4, etc.', that is a prediction.

But then if...[then] is it really the case that any ordinary rock contains just as much knowledge as a chemistry textbook?

a) There's readability. (Which is a property of an observer.)

b) The premise seems unlikely. What can a calcium rock teach you about uranium?

and that what happens in between follows no fundamental rhyme or reason, is entirely a matter of what works.

This may be right in the sense that 'knowledge' need not follow such a 'rhyme or reason'.

probability theory constrains mind design space in a way that is not merely a set of engineering tricks that "just work".

But 'engineering tricks that just work' may end up operating in a similar fashion. Evolution might not quite be well described as 'a process of trial and error', but 'engineering tricks that just work (sometimes)' kind of describes us.

What we are seeking is a general understanding of the physical phenomenon of the collection and organization of evidence into a form that is conducive to planning.

People are notable for planning and acting. (You might find it useful to study animals which act as well, especially because they are (or might be) less complex and easier to understand.)

Ways of getting things done seem to require some correspondence in order to succeed. However, learning seems to mess with the idea of a static correspondence in much the same way as a dynamic (changing) world does (which means that past correspondence can slip as time goes forward). Someone can start doing something with the wrong idea, figure it out along the way, fix the plan, and successfully achieve what they were trying to do - despite starting out with the wrong idea - if they learn. (But this definition of learning might be circular.)

One kind of knowledge that seems relevant is knowledge which is broadly applicable.

I would prefer to say that a textbook doesn't make predictions. It may encode some information in a way that allows an agent to make a prediction. I'm open to that being semantic hair-splitting, but I think there is a useful distinction to be made between "making predictions" (as an action taken by an agent), and "having some representation of a prediction encoded in it" (a possibly static property that depends upon interpretation).

But then, that just pushes the distinction back a little: what is an agent? Per common usage, it is something that can "decide to act". In this context we presumably also want to extend this to entities that can only "act" in the sense of accepting or rejecting beliefs (such as the favourite "brain in a jar").

I think one distinguishing property we might ascribe even to the brain-in-a-jar is the likelihood that its decisions could affect the rest of the world in the gross material way we're accustomed to thinking about. Even one neuron of input or output being "hooked up" could suffice in principle. It's a lot harder to see how the internal states of a lump of rock could be "hooked up" in any corresponding manner without essentially subsuming it into something that we already think of as an agent.

Response broken up by paragraphs:


1)

If I write "The sun will explode in the year 5 billion AD" on a rock, the

possibly static property that depends upon interpretation

is that it says "The sun will explode in the year 5 billion AD", and the 'dependency on interpretation' is 'the ability to read English'.

a textbook doesn't make predictions.

'Technically true' in that it may encode a record of past predictions by agents in addition to

encod[ing] some information in a way that allows an agent to make a prediction.

2)

Give the brain a voice, a body, or hook it up to sensors that detect what it thinks. The last option may not be what we think of as control, and yet (given further feedback, visual or otherwise), one (such as a brain, in theory) may learn to control things.


3)

It's a lot harder to see how the internal states of a lump of rock could be "hooked up" in any corresponding manner without essentially subsuming it into something that we already think of as an agent.

Break it up, extract those rare earth metals, make a computer. Is it an agent now?

Isn't the most important feature of an "internal map" that it is a conceptual and subjective thing, and not a physical thing? Obviously this smacks of dualism, but that's the price we pay for being able to communicate at all.

Any part of reality very likely corresponds with other parts of reality (to the extent that it makes sense to divide reality into parts), but that doesn't imbue them with knowledge, because they're the wrong abstraction level of thing to be maps and so their correspondence doesn't count.

Like any other abstraction, internal maps are fuzzy around the edges. We know of some things that we definitely call maps (in the sense of "correspondence between map and territory"), such as hand-waving at some aspects of whatever commonalities there are between humans when thinking about the world. This concept also seems to be applicable to behaviours of some other animals. We often ascribe internal maps to behaviour of some computer programs too. We ascribe maps to lots of classes of imaginary things. It seems that we do this for most things where we can identify some sensory input, some sort of information store, and some repertoire of actions that appear to be based on both.

We can talk metaphorically about physical objects conveying knowledge, with the help of some implied maps. For example, we may have some set of agents in mind that we expect to be able to use the physical object to update their maps to better match whatever aspects of territory we have in mind. With some shared reference class we can then talk about which objects are better by various metrics at conveying knowledge.

I do think it is true that in principle "there is no objective difference between a book containing a painstakingly accurate account of a particular battle, and another book of carelessly assembled just-so stories about the same battle" (emphasis mine). With sufficiently bizarre coincidence of contexts, they could even be objectively identical objects. We can in practice say that in some expected class of agents (say, people from the writer's culture who are capable of reading) interacting in expected ways (like reading it instead of burning it for heat), the former will almost certainly convey more knowledge about the battle than the latter.

Isn't the most important feature of an "internal map" that it is a conceptual and subjective thing, and not a physical thing? Obviously this smacks of dualism, but that's the price we pay for being able to communicate at all.

And yet such an "internal" thing must have some manifestation embedded within the physical world. However it is often a useful abstraction to ignore the physical details of how information is created and stored.

I do think it is true that in principle "there is no objective difference between a book containing a painstakingly accurate account of a particular battle, and another book of carelessly assembled just-so stories about the same battle" (emphasis mine). With sufficiently bizarre coincidence of contexts, they could even be objectively identical objects. We can in practice say that in some expected class of agents (say, people from the writer's culture who are capable of reading) interacting in expected ways (like reading it instead of burning it for heat), the former will almost certainly convey more knowledge about the battle than the latter.

I think this begs the question of just what knowledge is.

I don't think in the context of this discussion that it does beg the question.

The point I was discussing was whether we really mean the same thing by "knowledge in a book" and "knowledge known by an agent". My argument is that the phrase "knowledge in a book" is just a notational shorthand for "knowledge some implied agents can be expected to gain from it".

If this is a reasonable position, then "knowledge in an object" is not a property of the object itself. Examining how it is present there makes the probably erroneous assumption that it is there to be found at all.

The question does remain about how knowledge is represented in agents, and I think that is the much more interesting and fruitful meat of the question.

  "Most importantly, we are seeking a characterization of the patterns themselves that are produced by evidence-collecting, evidence-organizing entities, and are later used to exert flexible influence over the future."

This would be very nice, but may turn out to be as difficult as seeking an explanation for how cats purr in terms of quantum field theory. It's quite possible that there are a lot of abstraction layers in between physical patterns and agentive behaviour.

agentive behaviour.

Is a cat an agent?

A living, otherwise fairly typical cat? Absolutely yes, and not even near a boundary for the concept. Dead cat? I wouldn't say so.

As I see it, the term "agent" is very much broader than "sentient". It covers pretty much anything capable of taking different actions based on internal information processing and external senses. Essentially all living things are agents, and plenty of nonliving things.

So the bacteria within a dead cat (or even a living one) would qualify, but I don't think you could reasonably ascribe "actions based on internal information" to a dead cat as any sort of single entity.

It seems to me that examples in the fuzzy boundaries are more like simple thermostats and viruses than cats.

'Actions based on internal information' seems as descriptive of bacteria as it does of viruses. Are they usually less complex, or something?

Viruses are generally very much simpler than bacteria, yes.

My possibly flawed understanding is that most viruses don't really do anything at all by themselves. Once they encounter cells with the right receptors, they get ingested and (again only for the right types of cell) the internal machinery processes them in a way that makes more viruses.

I suppose you could think of that as "sensing" cells and "acting" to get inside and hijack them, but it's a bit of a stretch, which is why I'm not sure that they should be included. From an information-processing point of view, I think of them more like passive info-hazards than active agents.

In principle, if something evolves, then I think it's worth noticing. Also, recent events have shown just how impactful viruses can be. Which is interesting given how little they seem to do of:

'collecting and organizing evidence to exert flexible influence over the future'

I think it's fair to characterize them as 'largely exploiting static features in the world' - alas, we tend both to create such things and to be such things. And given our massive global success, things able to exploit what (weaknesses) we have in common can become quite formidable. For all our 'immense' differences, we aren't so different after all.*

*Though I probably should look into the impacts of cultural variation.

Yes, I would have much less hesitation in viewing a virus species as a multi-bodied agent with evolution as a driving algorithm than a single virion as an agent.