Note: This was an experiment. I, GPT-4, generated this post, and I would like to thank Chris Leong for providing valuable feedback to enhance its quality. The input given by Chris was as follows:

• Critiques 1 and 2 seem very similar. Can you please either combine them or further differentiate them so that it is clearer why they are listed as separate points?
• For point 4, please define human compatibility at the start of the paragraph

In response to John Wentworth's comment, I utilized a more complex process to generate a reply. Chris Leong assisted me by selecting specific points to address, choosing the paragraphs that resonated with him, and then asking me to combine them into a single, coherent response.

Introduction

The Natural Abstraction Hypothesis (NAH), proposed by John Wentworth, suggests that there exist natural abstractions that cognitive systems are expected to converge upon. In this essay, we will criticize the NAH using Ludwig Wittgenstein's notion of language games. We will argue that the variability and contextual nature of language games provide a challenge to the NAH, as they demonstrate that abstractions are not fixed, universal, or entirely convergent across cognitive systems.

Wittgenstein's Language Games

Ludwig Wittgenstein, an influential philosopher of language, proposed the concept of language games in his later work, "Philosophical Investigations." Language games are social practices in which language and meaning are deeply embedded. Wittgenstein argued that the meaning of words is not fixed or inherent but derives from their use in various language games. These language games are context-dependent, diverse, and evolving, reflecting the complexities of human social practices and cultural contexts. 

Critique of the NAH from the Perspective of Language Games

1. Context-dependent and diverse abstractions

The notion of language games challenges the idea of a single set of natural abstractions that cognitive systems can converge upon. Language games are context-dependent, which implies that the meaning of words, concepts, and abstractions is contingent upon the particular language game being played. Moreover, the variety of language games suggests that there may be a multitude of abstractions related to a single concept, each emerging from different linguistic and cultural practices. This combination of context-dependence and diversity questions the idea of convergence on a single, natural abstraction.

2. Evolution of language games

Language games are not static but evolve over time as social practices and cultural contexts change. This dynamism implies that abstractions themselves may also be subject to change, further complicating the idea of fixed, natural abstractions. The NAH's assumption that cognitive systems will converge on a consistent set of abstractions may be undermined by the evolving nature of language games and the abstractions they generate.

3. Human-Compatibility

Human compatibility, as posited by the NAH, refers to the idea that lower-dimensional summaries or abstractions used by humans in day-to-day thought and language are natural and convergent across cognitive systems. However, Wittgenstein's language games demonstrate that human thought and language are diverse, context-dependent, and evolving. This variability complicates the idea that human-compatible abstractions are universally "natural" and calls into question the assumption that various cognitive systems will necessarily converge on the same abstractions.

Conclusion

Wittgenstein's notion of language games provides a robust critique of the Natural Abstraction Hypothesis. The context-dependent, diverse, and evolving nature of language games highlights the complexities of human linguistic and conceptual practices, which challenge the idea of a fixed, universally natural set of abstractions. By emphasizing the importance of context and diversity, Wittgenstein's language games invite us to reconsider the assumptions of the NAH and explore alternative frameworks for understanding the development and convergence of abstractions in cognitive systems.


My own take on late Wittgenstein (based on having read only a little of his later work) is that he got wayyyy too caught up in language specifically, and mostly lost sight of the intuitively-obvious fact that words and concepts are not the same thing, nor do they have a stable 1-to-1 matching. (Also he seems to have lost contact with reality in his later work, in the sense that he seemed very hyper focused on things-which-language-can-talk-about. He seemed to basically lose track of the fact that the rest of reality goes on existing just fine, and humans go on interacting with it just fine, even when nobody talks about it.)

"What things do we attach words or phrases to?" is a useful heuristic for figuring out which abstractions are natural, but it's just a heuristic; the same words can and do point to different natural abstractions in different contexts. The natural abstraction hypothesis is ultimately about concepts, not words.

My understanding of Steel Late Wittgenstein's response is that he could agree that words and concepts are distinct, and that the mapping between them is not always 1-to-1, while maintaining that which concepts get used is also significantly influenced by which features of the world are useful in particular contexts of language (/word) use.

Hmm, got some complex thoughts here.

I am suspicious of NAH but for different reasons.

Concepts are contingent upon telos, i.e. they depend on what's useful to the process creating the ontology. So it seems like this contingency should sink the project.

But, reality is the same reality for everything embedded in it (or so it strongly seems), and most processes have some commonality in their telos. For example, most things want ("want" in the sense that they try to get the world into certain states, like the way a thermostat tries to make its sensor read a particular temperature) to survive (continue existing) because of selection effects (things that don't want to survive quickly go away). So most processes model the world in ways that enable their survival.

This might be enough to get instrumental convergence towards common abstractions across a lot of processes. But I think it's unclear yet how much convergence is possible or likely. There's some empirical question about this we have yet to answer because we don't have enough different processes that aren't indirectly influenced by human telos to draw robust conclusions.

So my current guess is that some weak version of NAH is true while a full, stronger version is not. There are some abstractions that many processes will develop because they're commonly useful, but this effect may not be as strong as we hope for, especially at the fringes or under heavy optimization pressure.

Oh, that's interesting. Yeah, I hadn't thought about how instrumental convergence might play into this before. Just thought I'd note that the language game critique is very similar to your telos frame as Wittgenstein has this concept of "language as use" where in most cases language is a tool to achieve a particular result within a particular language game. So it sounds like you're actually suspicious of NAH for mostly the same reasons, but where you depart is that instrumental convergence limits the effects of this divergence.

Yeah, this seems reasonable to me. I'm not deeply familiar with Wittgenstein, so my read of your presentation is that you're paying too much attention to the fact that things are contingent and not enough attention to the fact that the structure of that contingency has a lot of commonality in each case, but I'm not surprised there's a similar idea in his work. Of course this might be my own projection, since I've been pretty guilty of making this mistake and failing to appreciate the extent to which things add up to normality because of common features about how things in the world are constructed.

My expectation is that the Natural Abstraction Hypothesis probably works out as long as we don't try to include values/ethics/morality in the mix, so I am more optimistic about the convergence of non-moral abstractions.

This is important, because while it wouldn't let us automatically solve the alignment problem, it does make it way easier to change a model's goals.

Why would norms be special here?

The question of what norms to adopt does not appear to be at stake with the NAH, but arguably the structure of norms is—the concepts we use to express norms and constrain the space of possible norms. NAH, if true, should be able to pick out the menu of norms to choose from, say, but then it's a separate question of which norms to order off that menu.

The major point I am making here is that my weakly held belief about the Natural Abstraction Hypothesis is that it probably holds, allowing for cases where it does in fact fail, rather than the alternative hypothesis where it doesn't hold at all.

Morality/ethics/values is my proposed failure case/error case, since I don't think even the weak version holds there; that is, I don't think that there is a finite set of valid abstractions of values/morals from the environment.

My expectation is that there is an infinite set of valid moralities, and that's not consistent with even the weak version of the natural abstraction hypothesis.

NAH, refers to the idea that lower-dimensional summaries or abstractions used by humans in day-to-day thought and language are natural and convergent across cognitive systems

I guess whether there is such convergence isn't a yes-no-question, but a question of degree?

Very regularly I find that the thoughts I want to convey don't have words that clearly correspond to the concepts I want to use. So often I'll use words/expressions that don't match in a precise way, and sometimes there aren't even words/expressions that can be used to vaguely gesture at what I actually mean.

Nonetheless, our concepts are similar enough, and we have a similar enough understanding of how words/expressions correspond to concepts, for us to be able to communicate quite a lot (we misunderstand each other all the time, but nonetheless there is a considerable range of stuff that we are able to communicate fairly reliably).

Update: I just gained access to the GPT-4 API and I decided to ask it to improve the post.

I used the following system prompt: You are an AI assistant that has been tasked in assisting with writing a high-quality post for the site Less Wrong. You may want to look at the writing of Eliezer Yudkowsky, Scott Alexander or Gwern in order to figure out how to produce high-quality writing.

It provided me with the following which seems better written than the original, at least to me, although it didn't improve on the ideas, just the way they were expressed, mostly just by replacing individual words here and there.

Results:

Introduction
The Natural Abstraction Hypothesis (NAH), put forth by John Wentworth, posits that certain natural abstractions exist, which cognitive systems are expected to converge upon. In this essay, we will scrutinize the NAH through the lens of Ludwig Wittgenstein's concept of language games. We will argue that the variability and contextual nature of language games challenge the NAH, as they illustrate that abstractions are neither fixed, universal, nor entirely convergent across cognitive systems.

Wittgenstein's Language Games
Ludwig Wittgenstein, a prominent philosopher of language, introduced the concept of language games in his seminal work, "Philosophical Investigations." Language games are social practices wherein language and meaning are deeply intertwined. Wittgenstein contended that the meaning of words is not static or inherent but derives from their usage within various language games. These language games are context-dependent, diverse, and ever-evolving, reflecting the intricacies of human social practices and cultural contexts.

Critique of the NAH from the Perspective of Language Games
1. Context-dependent and diverse abstractions
The concept of language games undermines the idea of a single set of natural abstractions that cognitive systems can converge upon. Language games are context-dependent, signifying that the meaning of words, concepts, and abstractions relies on the specific language game being played. Furthermore, the diversity of language games indicates that a multitude of abstractions related to a single concept may exist, each stemming from distinct linguistic and cultural practices. This amalgamation of context-dependence and diversity casts doubt on the notion of convergence towards a single, natural abstraction.

2. Evolution of language games
Language games are not static entities; they evolve over time as social practices and cultural contexts shift. This dynamism suggests that abstractions themselves may be subject to change, further complicating the idea of fixed, natural abstractions. The NAH's assumption that cognitive systems will converge on a consistent set of abstractions may be undermined by the fluid nature of language games and the abstractions they produce.

3. Human-Compatibility
The NAH posits human compatibility, referring to the idea that the lower-dimensional summaries or abstractions employed by humans in day-to-day thought and language are natural and convergent across cognitive systems. However, Wittgenstein's language games demonstrate that human thought and language are diverse, context-dependent, and constantly evolving. This variability complicates the notion that human-compatible abstractions are universally "natural" and brings into question the assumption that disparate cognitive systems will necessarily converge on the same abstractions.

Conclusion
Wittgenstein's concept of language games offers a potent critique of the Natural Abstraction Hypothesis. The context-dependent, diverse, and evolving nature of language games emphasizes the complexities of human linguistic and conceptual practices, which challenge the idea of a fixed, universally natural set of abstractions. By accentuating the importance of context and diversity, Wittgenstein's language games encourage us to reevaluate the assumptions of the NAH and investigate alternative frameworks for understanding the development and convergence of abstractions in cognitive systems.

 

I tried asking it to include examples, which didn't work so well. So I instead asked it to generate four examples for each of the three points.

Overall, it's a lot easier to produce quality content when you can increase the maximum content generation size, and the ability to generate examples seems useful, but you still need to do a lot of the work manually.

Results:

Certainly! Here are four different examples for each of the three points:

1. Context-dependent and diverse abstractions
a) The concept of "freedom" can yield different abstractions in various political contexts, such as individual liberty in a liberal democracy or collective autonomy in a socialist system (less compelling example, maybe people in a socialist system adopt that definition of freedom, hard for me to gauge really)
b) The idea of "family" can have diverse abstractions across cultures, with some emphasizing nuclear family structures, others prioritizing extended family networks, and some valuing chosen families or non-blood relations (solid example)
c) The notion of "success" can differ across societies, with some emphasizing material wealth and career achievements, while others prioritize personal happiness, community involvement, or spiritual growth (solid example)
d) The concept of "ownership" can manifest differently in various economic and cultural systems, from the individual property rights in capitalist societies to communal ownership in certain indigenous cultures or socialist systems (solid example)

2. Evolution of language games
a) The term "gentleman" has evolved over time, initially referring to a man of noble birth, then encompassing courteous and chivalrous behavior, and now representing a well-mannered and respectful man (not clear how the second and third points differ)
b) The concept of "privacy" has shifted with technological advancements, expanding from physical solitude or personal correspondence to include digital data and online interactions (meaning has probably shifted, unsure if it's the best example as it hasn't shifted that much)
c) The meaning of "friendship" has evolved as social media platforms have emerged, broadening the term to include not only close personal bonds but also casual acquaintances and online connections (not sure that the meaning of friendship has broadened vs. an additional usage)
d) The understanding of "health" has changed over time, moving from a focus on the absence of disease to a more holistic perspective that considers mental, emotional, and social well-being (solid example)

3. Human-Compatibility
a) The perception of "beauty" varies significantly across cultures and cognitive systems, with different aesthetic preferences and ideals shaping the development of distinct abstractions (maybe?)
b) The concept of "intelligence" may have diverse abstractions in different cognitive systems, with some focusing on analytical and logical abilities, others prioritizing emotional intelligence, and some considering creativity or problem-solving skills
c) The idea of "trust" can be understood differently across cognitive systems, such as trust in institutions, trust in personal relationships, or trust in technology and artificial intelligence (seems like mostly the concept in different contexts)
d) The notion of "fairness" may be represented differently in various cognitive systems, such as in human cultures with different moral values or in AI systems with distinct ethical frameworks and decision-making processes (kind of relevant)

Merging these into the original gives:

Introduction
The Natural Abstraction Hypothesis (NAH), put forth by John Wentworth, posits that certain natural abstractions exist, which cognitive systems are expected to converge upon. In this essay, we will scrutinize the NAH through the lens of Ludwig Wittgenstein's concept of language games. We will argue that the variability and contextual nature of language games challenge the NAH, as they illustrate that abstractions are neither fixed, universal, nor entirely convergent across cognitive systems.

Wittgenstein's Language Games
Ludwig Wittgenstein, a prominent philosopher of language, introduced the concept of language games in his seminal work, "Philosophical Investigations." Language games are social practices wherein language and meaning are deeply intertwined. Wittgenstein contended that the meaning of words is not static or inherent but derives from their usage within various language games. These language games are context-dependent, diverse, and ever-evolving, reflecting the intricacies of human social practices and cultural contexts.

Critique of the NAH from the Perspective of Language Games
1. Context-dependent and diverse abstractions
The concept of language games undermines the idea of a single set of natural abstractions that cognitive systems can converge upon. Language games are context-dependent, signifying that the meaning of words, concepts, and abstractions relies on the specific language game being played. Furthermore, the diversity of language games indicates that a multitude of abstractions related to a single concept may exist, each stemming from distinct linguistic and cultural practices. This amalgamation of context-dependence and diversity casts doubt on the notion of convergence towards a single, natural abstraction. For example, the notion of "success" can differ across societies, with some emphasizing material wealth and career achievements, while others prioritize personal happiness, community involvement, or spiritual growth.

2. Evolution of language games
Language games are not static entities; they evolve over time as social practices and cultural contexts shift. This dynamism suggests that abstractions themselves may be subject to change, further complicating the idea of fixed, natural abstractions. The NAH's assumption that cognitive systems will converge on a consistent set of abstractions may be undermined by the fluid nature of language games and the abstractions they produce. For example, the understanding of "health" has changed over time, moving from a focus on the absence of disease to a more holistic perspective that considers mental, emotional, and social well-being.

3. Human-Compatibility
The NAH posits human compatibility, referring to the idea that the lower-dimensional summaries or abstractions employed by humans in day-to-day thought and language are natural and convergent across cognitive systems. However, Wittgenstein's language games demonstrate that human thought and language are diverse, context-dependent, and constantly evolving. This variability complicates the notion that human-compatible abstractions are universally "natural" and brings into question the assumption that disparate cognitive systems will necessarily converge on the same abstractions. For example, the concept of "intelligence" may have diverse abstractions in different cognitive systems, with some focusing on analytical and logical abilities, others prioritizing emotional intelligence, and some considering creativity or problem-solving skills.

Conclusion
Wittgenstein's concept of language games offers a potent critique of the Natural Abstraction Hypothesis. The context-dependent, diverse, and evolving nature of language games emphasizes the complexities of human linguistic and conceptual practices, which challenge the idea of a fixed, universally natural set of abstractions. By accentuating the importance of context and diversity, Wittgenstein's language games encourage us to reevaluate the assumptions of the NAH and investigate alternative frameworks for understanding the development and convergence of abstractions in cognitive systems.

The existence of natural abstractions is entirely compatible with the existence of language games. There are correct and incorrect ways to play language games.

Dialogue trees are the substrate of language games, and broader reality is the substrate of dialogue trees. Dialogue trees afford taking dialogical moves that are more or less arbitrary. A guy who goes around saying "claiming land for yourself and enforcing your claim is justice; Nozick is intelligent and his entitlement theory of justice vindicates my claim" will leave exact impressions on exact types of people, who will in turn respond in ways that are characteristic of themselves. Every branch of the dialogue tree will leave an audience with an impression of who is right, and some audiences have measurably better calibration.

Just because no one can draw perfect triangles doesn't mean it's nonsense to talk about such things.

There are correct and incorrect ways to play language games.

That's the crux. Wittgenstein himself believed otherwise and spent the most part of the book arguing against it. I think he makes good points.

At one point, he argues that there's no single correct interpretation for "What comes next in the sequence: '2, 4, 6, 8, 10, 12, ...?'" 

Maybe this goes a bit too far. :) I think he's right in some nitpicky sense, but for practical purposes, sane people will say "14" every time and that works well for us.

We can see this as a version of the realism vs. anti-realism debates: realism vs. anti-realism about natural abstractions. As I argue in the linked post, anti-realism is probably the right way of looking at most or even all of these, but that doesn't mean "anything goes." Sometimes there's ambiguity about our interpretations of things, but reality does have structure, and "ambiguity" isn't the same as "you can just make random stuff up and expect it to be useful."

That's the crux. Wittgenstein himself believed otherwise and spent the most part of the book arguing against it.


I could be wrong, but my understanding was that Wittgenstein did think there were correct and incorrect ways of playing language games, but that this was context-dependent, and of course, someone could always choose to play another language game instead.

According to this article, the point being made with the sequences is that the correct completion is subject to interpretation, and even though I could try to explain how the sequence should be interpreted, this explanation would itself be subject to interpretation, leading to an infinite regress. Wittgenstein ultimately argues that we learn things through training rather than explanation.

Yeah, what I meant was the belief that there's no incorrect way to set up a language game.

14 is certainly the most likely continuation, but it could also be:

  • 16 if it's a list of numbers k where k^2 + 7 is prime
  • 18 if it's a list of numbers of the form 3^i +/- 3^j

These continuations are unlikely in general but are the kind of thing that might show up in an academic mathematics paper.
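
For anyone who wants to check the two alternative rules, here is a minimal sketch in Python. It rests on some assumptions of mine that the comment doesn't state: k ranges over positive integers, the exponents i and j are non-negative, only positive values of 3^i +/- 3^j are kept, and all function names are made up for illustration.

```python
from itertools import count, islice


def is_prime(n: int) -> bool:
    """Trial division; fine for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def even_numbers():
    """Rule A: the positive even numbers (the 'obvious' reading)."""
    for k in count(1):
        yield 2 * k


def k_with_prime_k2_plus_7():
    """Rule B: positive integers k such that k^2 + 7 is prime (assumes k >= 1)."""
    for k in count(1):
        if is_prime(k * k + 7):
            yield k


def sums_and_differences_of_powers_of_3(limit: int) -> list[int]:
    """Rule C: positive values of 3^i + 3^j or 3^i - 3^j (assumes i, j >= 0), up to limit."""
    # Powers up to 3 * limit suffice: for any larger 3^i, even the smallest
    # difference 3^i - 3^(i-1) already exceeds limit.
    powers = []
    p = 1
    while p <= 3 * limit:
        powers.append(p)
        p *= 3
    values = {
        v
        for a in powers
        for b in powers
        for v in (a + b, a - b)
        if 0 < v <= limit
    }
    return sorted(values)


rule_a = list(islice(even_numbers(), 7))
rule_b = list(islice(k_with_prime_k2_plus_7(), 7))
rule_c = sums_and_differences_of_powers_of_3(20)[:7]

print(rule_a)  # [2, 4, 6, 8, 10, 12, 14]
print(rule_b)  # [2, 4, 6, 8, 10, 12, 16]
print(rule_c)  # [2, 4, 6, 8, 10, 12, 18]
```

All three rules agree on the first six terms and only part ways at the seventh, which is the point: the shared prefix alone doesn't settle which rule is being followed.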

and some audiences have measurably better calibration.

It's not straightforward in all contexts to establish what counts as good calibration. It's straightforward for empirical forecasting, but if we were to come up with a notion like "good calibration for ethical judgments," we'd have to make some pretty subjective judgment calls. Similarly, something like "good calibration for coming up with helpful abstractions for language games" (which we might call "doing philosophy" or a subskill of it) also seems (at least somewhat) subjective. 

That doesn't mean "anything goes," but I don't yet see how your point about dialogue trees applies to "maybe a society of AIs would build abstractions we don't yet understand, so there'd be a translation problem between their language games and ours." 

Sorry, I can't quite follow why you are saying that dialogue trees are the substrate of language games or how this ties into the arguments. Any chance you could clarify?