N.B. This is a chapter in a planned book about epistemology. Chapters are not necessarily released in order. If you read this, the most helpful comments would be on things you found confusing, things you felt were missing, threads that were hard to follow or seemed irrelevant, and otherwise mid to high level feedback about the content. When I publish I'll have an editor help me clean up the text further.

Words have meanings, right? When I say the word "cup" you know I'm talking about a cup: a thing to drink out of. But why does the word "cup"—quoted so I can refer to the word apart from its meaning—mean cup—unquoted so I can point directly to the thing referenced by the word? It might seem like a silly question, but in fact it's the gateway to understanding fundamental uncertainty.

Think back to when you were a kid. I'll do the same. If someone had asked me why "cup" meant cup, I would have said something like "because that's what it is". The question wouldn't have registered as interesting, just a boring statement of fact turned into a strange question. Maybe I would have thought someone was playing a joke on me, asking a silly nonsense question. And it would have seemed a totally reasonable response at the time, since I'd never encountered a situation where "cup" meant anything other than cup.

But as I got older I started to realize there was a little space between the word "cup" and actual cups. For example, I have memories of getting into arguments with adults if they handed me a mug and called it a "cup". "No way", I'd say, "this is a mug, not a cup!". "Same difference", the response would come, but that seemed wrong to me. "Mugs" were mugs and "cups" were cups. Yet over time I began to see that "cup" was perhaps not as specific a word as I thought it was. In a certain sense, a mug is a type of cup, and it's totally reasonable that if you ask for a "cup" someone might give you a mug because it serves what they believe to be the purpose of asking for a "cup"—to be able to drink a liquid! So I learned there was more to words than just naming things; the form and function mattered.

At the same time I was learning this, I also learned to play games making up fake names for things. My friends and I could develop a secret language where "bloob" meant "cup" and "blig" meant "water" and "blarg" meant "drink" and we'd collapse into giggles making up silly sentences like "I'm blarging a bloob of blig". So it wasn't just that words had meanings and sometimes there was more to those meanings than directly naming things; words themselves were flexible and I could still get my meaning across so long as the other person knew what the word I used meant.

The "fake name game" got a lot more interesting in middle school when I spent the first of many semesters learning French. But instead of a silly fake name like "bloob" for "cup", French people drank water—"l'eau" actually—from the much more reasonably named "la coupe". And then I made the same mistake most people who start learning a second language after early childhood make: I thought French was just like English except they use different sounds to point at the same concepts, and learning French was only as hard as memorizing a bunch of words to say in place of the "real" English words because French is silly like that.

It didn't take long for me to figure out this wasn't what was going on. I think I first really understood when one day in high school French my teacher told us there was no direct translation for "rude" in French. It's not that there aren't words that can be used in translation—Google Translate tells me it's "impoli"—it's that what "impoli" means in French doesn't exactly match what it means in English. A better translation of "rude", I was told, was "mal élevé", which means something more like "ill-bred" in English, implying that your parents brought you up wrong. But even then the specifics were still off. For example, in English-speaking culture it's generally rude to disagree with someone directly even if you think they're wrong. In French-speaking culture, not disagreeing directly is seen as disingenuous. There was no way to point directly at the English conception of rudeness in French without using a whole phrase explaining that's what you were doing!

There are a lot of words like this. People love to write cute little articles about so-called untranslatable words that have no direct equivalent in English. Some ready examples that come to mind:

  • "hygge", a Danish word meaning something like "the everyday joy of being safe and cozy with people you trust" (in English we're stuck with "cozy")
  • "lagom", a Swedish word meaning something like "just enough; exactly the right amount; nothing too much and nothing lacking" (in English we make due with words like "enough" and "sufficient")
  • "waldeinsamkeit", a German word meaning something like "the feeling of being alone in the forest" (in English we're stuck with the all-purpose "solitude")

Part of the reason there's no direct translation is that the connotations—the implied meanings of the words—are different in each language. This reflects the cultural differences of speakers of different languages. English, French, Danish, Swedish, and German speakers all use words to talk about the world in slightly different ways. In some sense, they cut up the world using different concepts. But how deep does this difference in worldview go?

Thinking with Words

So far we've taken a short stroll through my personal journey of learning that the relationship between words and meaning is more complex than it might seem at first glance. It probably won't surprise you to learn that lots of people for lots of years have gotten confused about what's going on with words and their meanings and have made whole careers of trying to resolve those confusions. One of those confusions is to what extent language shapes the thoughts we can think.

If you like learning about languages you've probably heard of the Sapir-Whorf hypothesis, also known as linguistic relativity. If not, it's the idea that language determines what thoughts we can think. Its strongest form implies that people literally cannot think thoughts that stretch beyond the bounds of their native language. So, for example, if you grew up speaking a language without a word for the color orange, it implies you have no way to think about the broad color category of orange. Instead you'd think of orange things as shades of red, or yellow, or half-red, half-yellow, but not orange, since you have no word for that.

My impression is that most people reject the strongest versions of linguistic relativity today on the grounds that, for example, color categories just seem kind of arbitrary and you can teach people new color categories and they'll use them. For example, teal isn't generally considered to be one of the primary color terms in English, but if I show you a color that's halfway between blue and green and you happen to know the word "teal" you won't have any problem calling nearby shades "teal" also. And artists can learn to make very fine distinctions between shades that I'd lump under one of white, black, red, orange, yellow, green, blue, or purple. So it appears we can easily adapt and learn new words (or make them up!) to talk about things we previously didn't have words for.

Does this mean words are arbitrary labels and our thoughts are independent of the words we use to describe them? The alternative to linguistic relativity is called universalism: the idea that language is built out of features common to all humans. Underneath, we all really are capable of thinking the same thoughts, and language just determines how those thoughts get shared with others.

But this doesn't seem quite right, either. For example, in the famous Bible story about Jonah and the whale, the Bible refers to this whale as a "fish", and historically this is not contradictory: ancient peoples just called things with fins that lived in the water "fish". If you tried to say that no, actually, whales aren't fish, they're more closely related to dogs than tuna, they'd have declared that nonsense on multiple counts. They didn't have the modern theory of evolution, so the idea of one species being "more closely related" to another wouldn't have made sense to them, and anyway whales live in the sea, and things that live in the sea are fish, full stop. There'd be no way to convince them, short of teaching them lots of modern biology, that whales are anything other than fish.

So it seems the truth lies somewhere between the extremes. Words can shape our thoughts, yes, but they don't totally control what thoughts we can think. We can see this if we go back to those so-called untranslatable words we were looking at before, like "hygge" and "lagom". We can explain what these words mean if you didn't grow up speaking Danish or Swedish, but it might take paragraphs of text and lots of examples to get a full grasp of the concepts these words point at. Knowing these untranslatable words gives you better handles to grab on to certain concepts, and if you're willing to spend a while explaining, you can think about things you don't have words for.

The way I like to think about this is that the set of all possible thoughts is like a space that can be carved up into little territories and each of those territories marked with a word to give it a name. These names point to things, concepts, patterns, etc. that we find in the world—over here is a little patch for cups, over there is a patch for dinosaurs, and so on. It then seems that words are like little arrows pointing at different bits of the world, and that's quite useful because rather than literally pointing at a cup I can say the word "cup" and you know what I'm talking about.

Only, how do you know that when I say "cup" I mean cup? How did we come to agree on this meaning of "cup", and why have a category for cups at all? It might seem obvious we should have a word for cup, but why have a word for rudeness, hygge, or lagom? How do these words get their meanings, and why those meanings rather than other ones?

Defining Meaning

If you want to know the meaning of a word you don't know, the first place to look is a dictionary. It's a quick way to expand your vocabulary. I distinctly remember needing to use the dictionary a lot when I was a kid because I was a precocious reader. I had to look up all kinds of weird words, like "flotsam" (stuff floating on or washed in by the sea, especially from a wrecked ship), that I'd never seen before just to make it through a book (in this case, Lord of the Rings: The Two Towers). And it generally worked pretty well. If I didn't know a word, the definition was usually enough to figure it out.

But sometimes the dictionary can take you in circles, as its definitions depend on you already knowing the meaning of some other words. This usually isn't a problem, but let's imagine a hypothetical person living in the land of Scurvia where no citrus fruits grow. They've never seen or eaten an orange, and for that matter have never seen the color orange (I regret to inform you that the sunsets are very boring in Scurvia). So when they hear about an orange and they look it up in the dictionary, they get a not very helpful definition:

orange (n.) an orange-colored citrus fruit

Okay, what if we look at the very next entry?

orange (adj.) the color of the orange fruit, see orange (n.)

This dictionary could probably offer better definitions, but this contrived example illustrates the point: all definitions are ultimately circular because words are defined in terms of other words. If you have nothing but a dictionary, and the dictionary has no pictures or translations, then all you have is a set of symbols defined in terms of each other that fail to be grounded in reality.

This is, in essence, the grounding problem, a philosophical problem about how words get their meaning. For a dictionary to work, you need at least a few seed words whose meanings you already know, on which the definitions of all the rest can be based. In the real world we solve this problem effortlessly: you learn the meaning of a bunch of words as a kid before you can even read, so by the time you can use a dictionary you have a vocabulary of a few thousand words to lean on. But if we consider the problem in the abstract, as philosophers do, we're left wondering where any words get their meanings from. Even if we lean on a few words to define all the other words in terms of, where do those few words get their meanings from?
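
Here's a tiny sketch of the problem in code (my own toy example; the three-entry dictionary and the list of "seed" words are invented for illustration). Definitions are just lists of other words, and a word only "grounds" if following its definitions eventually bottoms out in words you already know some other way:

```python
# A minimal sketch of the grounding problem: every dictionary lookup just
# produces more words to look up, unless some words are known from experience.
toy_dictionary = {
    "orange": ["an", "orange", "colored", "citrus", "fruit"],
    "citrus": ["a", "fruit", "such", "as", "an", "orange"],
    "fruit":  ["the", "edible", "product", "of", "a", "plant"],
}

def can_ground(word, dictionary, seed_words, seen=None):
    """True if following definitions from `word` bottoms out in seed words."""
    if seen is None:
        seen = set()
    if word in seed_words:
        return True               # grounded directly in experience
    if word in seen or word not in dictionary:
        return False              # circular, or no definition available
    seen.add(word)
    # Grounded only if every word in the definition is itself grounded.
    return all(can_ground(w, dictionary, seed_words, seen) for w in dictionary[word])

# With no seed words, nothing grounds: it's words all the way down.
print(can_ground("fruit", toy_dictionary, seed_words=set()))    # False

# With a few experience-grounded seeds, "fruit" bottoms out...
seeds = {"the", "a", "an", "of", "such", "as", "edible", "product", "plant", "colored"}
print(can_ground("fruit", toy_dictionary, seed_words=seeds))    # True

# ...but the Scurvian's "orange" never does: its definition circles back on itself.
print(can_ground("orange", toy_dictionary, seed_words=seeds))   # False
```

Without seed words every lookup just produces more words to look up; with a handful of seeds most chains terminate, but the Scurvian's circular entry for "orange" never does.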

If you've studied a lot of math, you might already have an answer in mind: just take a few words as axioms—words we assume to have particular meanings—and define all other words in terms of those. It seems like this should be possible. After all, it works in math, and math is just a special language for talking about numbers. If math can be built by assuming a very short list of obvious things and deducing everything else in terms of those few assumptions, why not all language?

But even just within the realm of math this doesn't work perfectly. If you ever studied geometry you probably learned about Euclid's five axioms: a short list of mathematical ideas he assumed were true with no justification because they were "obvious". He then built his entire theory of geometry from those five assumptions, deducing everything he claimed about lines and shapes from them. His theory stood as the gold standard for over 2000 years, but in the early 1800s mathematicians figured out that you get a different geometry if you make different assumptions! And these other geometries are just as correct as Euclid's; they just describe what happens under different conditions.
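
To make that concrete (this is my illustration using the standard textbook example, not anything from Euclid's own wording): the assumption that does all the work is the fifth one, the so-called parallel postulate, and you can see the consequences of changing it in something as simple as the angles of a triangle.

```latex
% Euclid's fifth assumption, in its modern (Playfair) form:
%   through a point not on a given line there is exactly one line parallel to it.
% Change that one assumption and the angle sum of a triangle changes with it:
\begin{align*}
\text{Euclidean (flat plane):}      &\quad \alpha + \beta + \gamma = 180^\circ \\
\text{Spherical (no parallels):}    &\quad \alpha + \beta + \gamma > 180^\circ \\
\text{Hyperbolic (many parallels):} &\quad \alpha + \beta + \gamma < 180^\circ
\end{align*}
```

All three are internally consistent; they just describe different surfaces: a flat sheet, a sphere, a saddle.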

At the time this created quite a stir. If math could be different and true if we made different assumptions, what else might we be taking for granted? Was everything we believed to be true just a house of cards that would fall down if we pulled out one of the bottom cards? It took about 150 years to work out all the details, but philosophers eventually figured out that the existence of non-Euclidean geometry had far reaching implications for what it meant to say something is true.

But returning to the problem of grounding the meaning of words, if mathematics can't rely on making a few assumptions to create a solid ground, it seems that our much messier everyday language stands no chance of building on such a foundation. Is there an alternative?

Thankfully, yes. We just have to stop philosophizing, stop trying to think our way to a solution, and look at how kids actually learn to talk. 

Baby Talk

As babies we try to make sounds that copy the sounds we hear our parents and other caregivers make. Eventually we say something like "mama" and everyone gets very excited! After a few tries we figure out that if we say "mama" then the nice person who picks us up and feeds us comes. Our brains then repeat this process, matching up sounds to actions: "dada" gets the other person who cares for us, "baba" gets us a bottle when we're hungry, "no" gets something we don't like to stop, "more" gets something we like to keep happening, and so on, until we've built up a decent vocabulary of a few hundred basic words.

This seems to work because our brains are amazingly good at noticing patterns. With just a few examples we learn to match a word with an experience. We sometimes get it a bit wrong at first—maybe we think all toys are "balls" for a while—but with more time and experience we sort it out. As adults we continue this same learning process. For example, the first time I went to an Ethiopian restaurant I had no idea what injera was, but the waiter brought some floppy, pancake-like thing to the table and said "this is injera", and suddenly I had added a new word—"injera"—to my vocabulary with a clear meaning—Ethiopian bread.
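
As a toy illustration of how little machinery this kind of pattern-matching needs (a sketch of my own, not a serious model of language acquisition; the "episodes" below are invented), imagine just tallying which outcome tends to follow each sound and treating the most reliable outcome as the word's working meaning:

```python
# A minimal sketch of grounding-by-association: count which outcome follows
# each utterance, then take a word's working "meaning" to be its most
# reliable outcome. The episodes are made up for illustration.
from collections import Counter, defaultdict

episodes = [
    ("mama", "mom comes"), ("mama", "mom comes"), ("mama", "nothing happens"),
    ("baba", "gets bottle"), ("baba", "gets bottle"),
    ("no", "thing stops"), ("more", "thing continues"),
]

counts = defaultdict(Counter)
for utterance, outcome in episodes:
    counts[utterance][outcome] += 1

# A word's working "meaning": the outcome it most often brings about.
meanings = {word: outcomes.most_common(1)[0][0] for word, outcomes in counts.items()}
print(meanings)
# {'mama': 'mom comes', 'baba': 'gets bottle', 'no': 'thing stops', 'more': 'thing continues'}
```

Real learning is vastly richer than a tally, of course, but the basic move is the same: meanings get attached to sounds by what reliably happens around them.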

So it seems we have an answer: words are grounded in our direct experience of what happens when we say a word.

I think there are three interesting things to note about this answer.

The first is that it implies that the meaning of words is fundamentally subjective, or based on personal experience. This is as opposed to words having objective meanings that are independent of individual experience. Words seem to have objective meaning because we're surrounded by people who agree on the meanings, but this is a bit like being a fish and not noticing water because we've never known life outside it. We actually have a fancy word to describe the situation: intersubjectivity. That is, words have meaning based on our personal experience, but because our personal experience is shaped by the personal experiences of everyone we meet, and theirs in turn by everyone they meet, our personal experiences exist together in a complex web of interactions that creates a shared sense of reality.

The second is that the meaning of words is grounded not just in matching patterns but in purpose. Notice how our first words were not purely about identifying objects but about achieving goals: getting mom to come, getting fed, making something stop, and so on. This means our language is not only about describing the world; it's fundamentally about shaping our experience of it. As a result, the meanings of words are motivated by what we want, and what we want colors how we see the world. In a very real sense, we each see the world a bit differently depending on what it is we care about.

The third is that because word meanings are subjective and motivated, we can't be totally certain everyone means exactly the same thing by the same word. Even if all our experience tells us that everyone else means the same thing we do when we ask for a "cup", because we can't literally know what it's like to be in someone else's head, caring about the same things they care about, we have no way to be absolutely certain our two concepts of "cup" are really the same. That's okay: they only need to be similar enough for us to get on with our lives. But that's kind of the point. There's fundamental uncertainty about what words mean, yet we get on with the project of living anyway.

Those are weighty ideas to digest, so we'll return to them in later chapters. For now, though, we're going to continue on and explore another way we encounter fundamental uncertainty. In the next chapter we're going to focus on just two words and really dig into what's going on with them: good and bad.

Comments

The way I like to think about this is that the set of all possible thoughts is like a space that can be carved up into little territories and each of those territories marked with a word to give it a name.

Probably better to say something like "set of all possible concepts." Words denote concepts, complete sentences denote thoughts.

I'm curious if you're explicitly influenced by Quine for the final section, or if the resemblance is just coincidental.

Also, about that final section, you say that "words are grounded in our direct experience of what happens when we say a word." While I was reading I kept wondering what you would say about the following alternative (though not mutually exclusive) hypothesis: "words are grounded in our experience of what happens when others say those words in our presence." Why think the only thing that matters is what happens when we ourselves say a word?

Thanks for the suggestions! Wasn't specifically thinking of Quine here, but there's probably some influence. My influences are actually more the likes of Heidegger, but philosophy seems to converge when it's on the right tack.

I want to reiterate Vaughn's question about "grounded in direct experience of what happens when we say a word" as opposed to "what happens when others say those words". 

If math can be built by assuming a very short list of obvious things and deducing everything else in terms of those few assumptions, why not all language?

I thought about using math as the "semantic primes" for all language. I think there are some interesting questions there.

Let's go even further, and cut out a big part of math, by only starting with computations. 

So basically the scenario is, we're trying to communicate with aliens, and the only thing we can do is send computer programs. We can't even send 2d pictures, because we don't know how they perceive things. We initially don't know what their world is like -- perhaps we don't even know if they're in the same universe, with the same physics. They could be experiencing an entirely different reality. They're similarly ignorant of us.

So it seems to me, all we can do is send "computational sketches" of our world. When we send a computational structure, all we know is that they're going to have to guess "this computation is somehow relevant" (like a Gricean maxim).

So we send little simulations of physics, like bouncing balls, orbiting spheres, chemistry simulations, etc. It won't be easy for them to piece everything together, but if we send enough simulations with enough hints about how different simulations relate to each other, then maybe they'll get the idea.
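
To make "computational sketch" concrete (an invented example on my part, not a program anyone in this thread actually wrote): the bouncing-ball case could be as small as a few lines of simulation, sent with no labels attached.

```python
# A tiny "computational sketch" of our physics: a ball dropped under gravity
# that bounces with some energy loss. All constants are arbitrary illustrations.
height, velocity = 10.0, 0.0           # meters above ground, meters/second
gravity, restitution, dt = -9.8, 0.8, 0.01

trajectory = []
for _ in range(2000):                  # simulate 20 seconds in 10 ms steps
    velocity += gravity * dt
    height += velocity * dt
    if height < 0.0:                   # hit the ground: bounce, losing energy
        height = 0.0
        velocity = -velocity * restitution
    trajectory.append(height)

# Variable names mean nothing to the receiver; only the program's behavior does.
print(len(trajectory), max(trajectory), trajectory[-1])
```

Everything the receiver learns has to be inferred from the structure of the program and what it does when run; names like `gravity` carry no meaning for them.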

If we want to ask them for help with a problem, we have to send something like a simulation demonstrating the problem. 

It's very difficult to communicate negation this way. The implication of every computational sketch is "this is somehow relevant", so, you can't easily negate anything.

At some point you can introduce language into your sketches, though, EG by having characters with dialogue, perhaps something like a computational sketch of a child learning language from a parent. 

It seems to me like this thought-experiment says something deep about communicating semantic content, but I'm not sure exactly how to spell it out.

If you've studied a lot of math, you might already have an answer in mind: just take a few words as axioms—words we assume to have particular meanings—and define all other words in terms of those. It seems like this should be possible. After all, it works in math, and math is just a special language for talking about numbers. If math can be built by assuming a very short list of obvious things and deducing everything else in terms of those few assumptions, why not all language?

FYI, there's a concept like this in linguistics, called "semantic primes" -- the few concepts that you supposedly need to build up the rest. I haven't looked into it very much, and it seems kind of suspicious to me, but it could be worth referencing and analyzing here. 

This was the first chapter I wrote. Sadly it's missing a lot of stuff like this that really needs to be referenced. I expect to have to rewrite it substantially. For example, I really want to talk about intensional/extensional definitions so I can later work this idea into the text of chapter 5.

Thanks for this suggestion!

Looking into this a little more, it seems like the methodology was basically "some linguists spend 30 years or so trying to define words in terms of other words, to find the irreducible words". 

I don't trust this methodology much; it seems easy for this group of linguists to develop their own special body of (potentially pseudo-scientific) practice around how to reduce one word to another word, and therefore fool themselves in some specific cases (EG keep a specific word around as a semantic prime because of some bad argument about its primitiveness that gets universally accepted, or kick some word out via a bad reduction that goes unquestioned).

On the other hand, I think that objection in itself isn't so bad as to reject the notion entirely; it merely says that (without a better methodology) the set of semantic primes will be somewhat arbitrary.

Plausible theory: Words gain meaning by association with concepts, which have meaning.

For example, it wouldn't do to recall all examples of people saying "ball" when I'm trying to think about what "ball" means. It's just too much work. Perhaps I can recall a few key examples. But even then, there's significant interpretative work to do: if I recall someone pointing at a "jack-o-lantern" I have to decide what object they've pointed at and decide what relevant similarities might make other things "jack-o-lanterns" too.

So it will usually make sense to distill out & remember a concept, and in many cases I'll already have a concept and I'm merely learning words to associate with it. 

Indeed, this phenomenon seems so common that it seems plausible to insist that if I don't understand a word that way, even if I use it correctly in context, I'm merely parroting (if imitating others) or doing RL (if I learned 'mama' by babbling and seeing what got me what I want), and not actually properly understanding words at all (so in some important sense, not fully participating in the dance called meaning).

To an extent, this suggests that words are simply the wrong place to look for meaning. The interesting question is how concepts have meaning. (I am a little bit trolling with that suggestion.)

The first is that it implies that the meaning of words is fundamentally subjective, or based on personal experience.

It isn't exactly clear to me what this means or whether it is true. It depends on what 'subjective' vs 'objective' means. In my post on ELK, I define "objective" or "3rd person perspective" as a subject-independent language for describing the world.

For example, left/right/forward/back are subjective (framed around a specific agent/observer), while north/south/east/west are objective (providing a single frame of reference by which many agents/observers can communicate).

By this definition, lots of words have objective meaning rather than subjective ones. 

Just because our concept of a word's meaning is learned through experience does not necessarily mean it is synonymous with that experience. So the subjectivity of the experience doesn't clearly imply the subjectivity of the meaning.

On my reading, you are here implying what I would call a correlative theory of meaning, in which the meaning of a word is synonymous with what subjectively correlates with that word, based on the personal history of the specific speaker.

I disagree with this theory of meaning and (as you know) prefer a teleosemantic theory.

For example, a polygon having three sides is entirely correlated with it having three corners, but these two phrases mean different things. 

Smoke can heavily correlate with fire, without 'smoke' meaning fire. 

It might be the case that when Sandy says "I'm feeling sick", in a correlative sense that really means Sandy doesn't want to go to school. This is good to know. But it is different from the literal meaning of the words.

As I think of it, correlation is the start, but not the endpoint, and doesn't capture how all words get their meaning. Many words get their meaning through metaphors, which is a topic I regretfully didn't explore in depth in this chapter. So I think in practice humans start from a bunch of stuff that correlates, and then use these correlative words to build up abstract patterns via metaphor. Eventually we can layer up enough metaphors to, say, make fine-grained distinctions that can't be picked out straight from observation.

However I've not thought super hard about the details of how to account for every case of how words get meaning, so my goal here is just to sketch a picture of where meaning starts, not where all meaning comes from. I need to make the chapter say something to this effect, or bridge the gap.

As to subjective/objective, this is something that gets me in trouble a lot with folks, but I take the stance that we shouldn't try to rehabilitate the concept of objectivity as I've seen too many people get confused by it. They too easily slide into adopting a naive view-from-nowhere that they have to be talked out of over and over, so I lean heavily on the idea that it's "all subjective/intersubjective". But things still have to add up to normality, so, much like how moral realist and anti-realist theories converge when they try to describe how humans actually treat norms, I think my view is, in the limit, convergent with views that choose to talk about objectivity rather than taboo it. Instead of calling something "objective", I talk only of subjective beliefs supported by others sharing the same belief, which points to the likelihood that something is "objective" within some frame of reference, in the sense that everyone within that frame would agree.

However I've not thought super hard about the details of how to account for every case of how words get meaning, so my goal here is just to sketch a picture of where meaning starts, not where all meaning comes from. I need to make the chapter say something to this effect, or bridge the gap.

Yep, agreed. I think the current chapter isn't very good about letting people know where you stand. 

It seems like a failure mode I run into is the one where the other person is trying to explain a basic point to a broad audience, and I'm hoping to engage with their more technical actual beliefs, so I nit-pick the broad nontechnical explanations even though they're broadly fine, because I want to get to the solid bottom of the issues rather than swimming in the watery surface.

So I think in practice humans start from a bunch of stuff that correlates, and then use these correlative words to build up abstract patterns via metaphor. Eventually we can layer up enough metaphors to, say, make fine-grained distinctions that can't be picked out straight from observation.

This sounds broadly true, but it's not exactly clear to me what question this theory is trying to answer. As you mention, intensional and extensional definitions are another way. So overall the theory might be that there's a broad grab-bag of ways that words get meaning. But then the theory doesn't seem to predict anything very strongly.

As to subjective/objective, this is something that gets me in trouble a lot with folks, but I take the stance that we shouldn't try to rehabilitate the concept of objectivity as I've seen too many people get confused by it.

Can you defend using the word "subjective" in that case? IE, why try to make one side of the distinction if you drop the other side as confused/confusing? How would you answer the suggestion of just dropping the subjective/objective distinction altogether?

I'm not sure quite what view I'm trying to forward here, but... In terms of the paragraph as-written (the one I was initially responding to), it seems to me like an obvious model which you need to address is that the meaning is objective, but we necessarily subjectively estimate the meaning. 

Like, at times you speak like the meaning is something we're uncertain about (and therefore have subjective beliefs about). This suggests that the meaning is something "out there" to be learned. (I think people intuitively think like this a lot.) At other times you seem to treat the experience-with-the-word-so-far as that person's personal meaning for the word (so, personal meaning can be fully known for that person, rather than uncertain). 

IDK, it just seems to me like there's more to say WRT this.

It took about 150 years to work out all the details, but philosophers eventually figured out that the existence of non-Euclidean geometry had far reaching implications for what it meant to say something is true.

What does this refer to?

Ignore this; too oblique to be useful. This was the first chapter I wrote, and I didn't figure out how I needed to write for a book until Chapter 3. This likely would have contained a link to something as a regular post, though I'm not now sure what.

Note to myself when I come back to do an editing pass: I somehow didn't talk about intensional and extensional definitions, but that's a really useful concept to give people.

Honestly I might need to rewrite this whole chapter because it was the first one and I really figured out the formula in Chapter 3.

Also a note to myself: I think I should swap out some of the discussion here for talking about Lakoff and metaphors.

I'm... just surprised it's not all absolutely obvious and needs clarification. Maybe I'm missing something?

Of course we learn the meaning of a lot of words as children, from other people, and from other experiences, how else! 

Of course we build up on the basics by relying on the words we already know, as well as on our experiences outside language. 

Of course the meaning of words is contextual and depends on how we happened to learn them, and so is different for different people, groups, etc. It also changes when a person's context or environment changes.

Of course we think partly in words, partly in formless blobs, partly in images, and who knows how else. Of course different people have different mixes of all of the above, or even the same people in different circumstances.

Of course we don't naturally notice most of the above without actually thinking about it for a time.

The fact that it seems obvious is good. But people get pretty tripped up about this stuff. You don't have to go very far in the literature on linguistics and languages to find all sorts of confusion about how words get their meaning. Some of this is because those people are worried about various edge cases and are trying to explore the idea space, but there are lots of folks who take seriously the idea that words can get all their meaning from definitions, and it took some serious effort to show that this doesn't work. So this chapter is mostly to help people notice confusion and unlearn what they thought they knew about how words mean things, in order to point them towards how it actually works.

Ah, yeah, that makes sense. People tend to get attached to simple and wrong ideas like that. I mean, it's reasonably accurate to say "some words sometimes get some of their meaning from definitions, but it's far from universal", but not "all words get all their meaning from definitions only".