Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

[Metadata: crossposted from https://tsvibt.blogspot.com/2022/11/are-there-cognitive-realms.html. First completed November 16, 2022. This essay is more like research notes than exposition, so context may be missing, the use of terms may change across essays, and the text may be revised later; only the versions at tsvibt.blogspot.com are definitely up to date.]

Are there unbounded modes of thinking that are systemically, radically distinct from each other in relevant ways?

Note: since I don't know whether "cognitive realms" exist, this essay isn't based on clear examples and is especially speculative.

Realms

Systemically, radically distinct unbounded modes of thinking

The question is, are there different kinds--writ large--of thinking?

To the extent that there are, interpreting the mental content of another mind, especially one with different origins than one's own, may be more fraught than one would assume based on experience with minds that have similar origins to one's own mind.

Are there unbounded modes of thinking that are systemically, radically distinct from each other?

"Unbounded" means that there aren't bounds on how far the thinking can go, how much it can understand, what domains it can become effective in, what goals it can achieve if they are possible.

"Systemically" ("system" = "together-standing-things") means that the question is about all the elements that participate in the thinking, as they covary / coadapt / combine / interoperate / provide context for each other.

"Radical" (Wiktionary) does not mean "extreme". It comes from the same etymon as "radish" and "radix" and means "of the root" or "to the root"; compare "eradicate" = "out-root" = "pull out all the way to the root", and more distantly through PIE *wréh₂ds the Germanic "wort" and "root". Here it means that the question isn't about some mental content in the foreground against a fixed background; the question asks about the background too, the whole system of thinking to its root, to its ongoing source and to what will shape it as it expands into new domains.

Terms

Such a mode of thinking could be called a "realm". A cognitive realm is an overarching, underlying, systemic, total, architectural thoughtform that's worth discussing separately from other thoughtforms. A realm is supposed to be objective, a single metaphorical place where multiple different minds or agents could find themselves.

Other words:

  • systemic thoughtform
  • system of thought, system of thinking
  • cognitive style
  • state of mind
  • cluster / region in mindspace
  • mode of being
  • species of thinking

Realm vs. domain

A domain is a type of task, or a type of environment. A realm, on the other hand, is a systemic type of thinking; it's about the mind, not the task.

For the idea of a domain see Yudkowsky's definition of intelligence as efficient cross-domain optimization power. Compare also domain-specific programming languages, and the domain of discourse of a logical system.

It might be more suitable for a mind to dwell in different realms depending on what domain it's operating in, and this may be a many-to-many mapping. Compare:

The mapping from computational subsystems to cognitive talents is many-to-many, and the mapping from cognitive talents plus acquired expertise to domain competencies is also many-to-many, [...].

From "Levels of Organization in General Intelligence", Yudkowsky (2007).

Domains are about the things being dealt with; it's a Cartesian concept (though it allows for abstraction and reflection, e.g. Pearlian causality is a domain and reprogramming oneself is a domain). Realms are about the thing doing the dealing-with.

Realm vs. micro-realm

A micro-realm is a realm except that it's not unbounded. It's similar to a cognitive faculty, and similar to a very abstract domain, but includes them both; it's "the whole mental area" of dealing with an abstract domain, which includes the (abstract) subject matter of the domain as well as cognitive faculties and systematic ways of thinking about that domain. For example, doing math could be called a micro-realm: it involves subject matter, and many stereotyped mental operations, and many stereotyped and interrelated ways of a mind reprogramming itself in accordance with what's suitable for doing math.

Like the notion of "realm", I'm not sure whether "micro-realm" carves much of anything at its joints. If it does, thinking then consists of shuttling questions and tasks between micro-realms, operating in the micro-realms, and then shuttling answers and performances between micro-realms, metaphorically a little like the Nelson-Oppen method.
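The shuttling picture above can be made concrete with a toy sketch. This is not the real Nelson-Oppen procedure (which exchanges equalities between decision procedures for first-order theories); it just illustrates the metaphor: two "micro-realms", each with its own inference rules over a shared vocabulary of facts, repeatedly pass derived facts back and forth until neither learns anything new. All the rule names and atoms here are made up for illustration.

```python
# Toy sketch (illustrative, not the real Nelson-Oppen method): two "micro-realms"
# exchange derived facts over a shared fact set until a fixpoint is reached.

def parity_realm(facts: set) -> set:
    """Knows only parity rules."""
    out = set(facts)
    if "x_even" in out:
        out.add("x_plus_2_even")        # evenness is preserved by adding 2
    if "x_plus_2_even" in out:
        out.add("x_even")
    return out

def order_realm(facts: set) -> set:
    """Knows only ordering and equality rules."""
    out = set(facts)
    if {"x_le_y", "y_le_x"} <= out:
        out.add("x_eq_y")               # antisymmetry of <=
    if "x_eq_y" in out and "x_even" in out:
        out.add("y_even")               # substitute equals for equals
    return out

def shuttle(facts: set, realms) -> set:
    """Run each realm on the shared fact set until nothing new is derived."""
    while True:
        new = set(facts)
        for realm in realms:
            new |= realm(new)
        if new == facts:
            return facts
        facts = new

combined = shuttle({"x_le_y", "y_le_x", "x_plus_2_even"},
                   [parity_realm, order_realm])
# Neither realm alone derives "y_even"; the combination does.
```

The point of the sketch is that the combined conclusion requires a fact derived in one realm ("x_even") to be shuttled into the other realm's context before its rules can fire.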

Possible examples

  • Different ontologies. Two minds speaking different languages, using different conceptual systems, could be in different realms.
  • Evolution. Since evolution (of a species) to a large extent fails to share knowledge within itself, insofar as it thinks, it thinks less in a language than we do.
  • Processes with fixed outer loops. (Compare "Known-algorithm non-self-improving agent".) Evolution is arguably an example: evolution plays a single note of optimization--bifurcating through speciation, resounding off of different environments, waxing and waning as selection pressure waxes and wanes--but still a single note, increasing or decreasing allele frequencies based on inclusive genetic fitness (though mate choice shades into a richer sort of optimization). (This is arguable because there's more structure to the genome, which could be viewed as related in some ways to ontologies as humans use them.) MuZero is arguably another example: it's just a search, it doesn't think in another way.
  • Different epistemologies. The rules governing what belief-like things two minds hold could be so different that the minds systematically hold different beliefs, or don't even both hold beliefs in the same sense. For example, a mind could act as though it believes whatever some authority tells it is the case. Those belief-like attitudes can only be construed as Bayesian under weird prior beliefs, such as near-perfect confidence that the authority is correct and has been understood correctly. But the attitudes are still belief-like in that the mind might locally act instrumentally the same way a Bayesian agent would act if it happened to believe the sentence the authority spoke.
  • Non-epistemic thinking. An agent might rearrange itself to be suitable for different tasks in a way that's not easy to understand as following rules that produce accurate beliefs. Again, evolution may be an example: although segments of the genome can sometimes be taken to correspond to something (e.g. a niche or element of the environment), they don't seem to constitute propositions (besides a monotone "this code-fragment is useful in this context"), and it's not obvious to me that you'd want to say that an agent has beliefs constituted by something other than propositions. It might be wrong to call this "thinking", but it's at least rearrangement towards suitability, and in the case of evolution can be very strong, strong enough to matter. Of course, the laws of information theory still apply; the point is that this sort of mind or agent may not be well-interpretable as having beliefs in the sense of propositions, which is a main meaning of the everyday word "belief".
  • Different axiologies. Maybe minds can have basically different ways to ultimately judge which actions to take. E.g. virtue ethics, deontology, consequentialism; CDT, EDT, TDT, UDT, FDT. Also different would be agents that: are corrigible to another agent; have a fixed goal in a fixed ontology that they refer all their creativity back to serving; are an expanding coalition of knowledge-bearers; are a coalition held together by group-enforced anarchy; copy goals from other agents; copy behaviors from other agents or play stereotyped roles; or derive goals by imputing agency to their past behavior.
  • Non-axiological thinking. I don't know if this makes sense or is possible, but maybe one could have a mind that "doesn't take meaningful actions" and "just understands stuff".
  • Radically partial thinking. Sometimes people think in a way that isn't easy to understand except as playing a role in a larger system. E.g., in deep discourse, thinking through something newly together: taking just one of the participants alone would put them in a different state of mind that in some cases wouldn't recover the insights from the discourse. E.g., teammates in a game with high-bandwidth communication.
  • Non-ontic thinking. Is it possible to think without using language, without using concepts the way we use concepts? See Foucault's aphasiac, from the preface of "The Order of Things":

It appears that certain aphasiacs, when shown various differently coloured skeins of wool on a table top, are consistently unable to arrange them into any coherent pattern; as though that simple rectangle were unable to serve in their case as a homogeneous and neutral space in which things could be placed so as to display at the same time the continuous order of their identities or differences as well as the semantic field of their denomination. Within this simple space in which things are normally arranged and given names, the aphasiac will create a multiplicity of tiny, fragmented regions in which nameless resemblances agglutinate things into unconnected islets; in one corner, they will place the lightest-coloured skeins, in another the red ones, somewhere else those that are softest in texture, in yet another place the longest, or those that have a tinge of purple or those that have been wound up into a ball. But no sooner have they been adumbrated than all these groupings dissolve again, for the field of identity that sustains them, however limited it may be, is still too wide not to be unstable; and so the sick mind continues to infinity, creating groups then dispersing them again, heaping up diverse similarities, destroying those that seem clearest, splitting up things that are identical, superimposing different criteria, frenziedly beginning all over again, becoming more and more disturbed, and teetering finally on the brink of anxiety.
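The "different epistemologies" bullet above can be illustrated with a toy calculation: an authority-follower's belief-like states coincide with those of a Bayesian agent only in the limit of a near-dogmatic prior that the authority is correct. The functions and numbers below are hypothetical, chosen purely for illustration.

```python
# Toy sketch: an authority-follower vs. a Bayesian who assigns probability
# `trust` to the authority being correct. All parameters are illustrative.

def bayes_update(prior: float, lik_if_true: float, lik_if_false: float) -> float:
    """Posterior P(H | evidence) from prior P(H) and the two likelihoods."""
    num = prior * lik_if_true
    return num / (num + (1 - prior) * lik_if_false)

def authority_follower(authority_says: bool) -> float:
    """Non-Bayesian rule: simply believe whatever the authority asserts."""
    return 1.0 if authority_says else 0.0

def bayesian_with_trust(prior_h: float, trust: float, authority_says: bool) -> float:
    """Bayesian who thinks the authority is right with probability `trust`;
    an assertion of H is evidence with likelihood ratio trust : (1 - trust)."""
    if authority_says:
        return bayes_update(prior_h, trust, 1 - trust)
    return bayes_update(prior_h, 1 - trust, trust)

p_follower = authority_follower(True)               # 1.0
p_dogmatic = bayesian_with_trust(0.5, 0.999, True)  # = 0.999
p_moderate = bayesian_with_trust(0.5, 0.7, True)    # = 0.7
```

Only under the weird near-dogmatic prior does the Bayesian's posterior approximate the follower's rule; a moderate trust level yields systematically different belief-like states, even though both minds might act the same way on the authority's next instruction.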

Implications

If there are different realms, then minds in different realms might be more or less safe, alignable, or interpretable. Interpretability might depend on which realm the interpreter is in.

Do realms exist?

What does it mean for realms to exist?

If it were useful to think about minds in terms of realms, there'd be the same problems as with thinking in terms of languages or species, e.g. the existence of dialect continua or analogously ring species. So we could ask different questions about the existence of cognitive realms, e.g.:

  • How different can minds be? How differently can they think? How thorough / radical / to-the-core can these differences be?
  • How uninterpretable can one mind be to another?
  • Are there contexts in which minds have to radically (to the root) and/or systemically specialize to be competitive?
  • At higher capability levels, are minds convergent? Any mind can simulate (perhaps a weaker version of) any other mind, and maybe any strategy can be stolen, but these stolen strategies could have a parasystemic relationship to the mind.
  • If a mind is capable of a pivotal act, does that imply something about the realm it's in?
  • Are there well-separated large-scale clusters in mindspace? Are there basins of attraction around the clusters?
  • Are these stereotyped ways of thinking mutually exclusive? In what senses can there be one mind or agent that dwells in multiple realms?
  • Are there alignment strategies, interpretation strategies, or safety properties that would apply in some realms but not others?

Reasons realms might exist

  • Bottlenecks. There are sometimes bottlenecks in minds. E.g. in some senses of understanding the world, you can't do better than having a probability distribution over possible worlds, so there's a bottleneck of probability mass. E.g. you have to decide how to allocate computational resources, so there's a bottleneck of what you think about. E.g. at any given time you have finitely many actuators, so you can only take so many actions. E.g. if you're going to be legible to other agents for purposes of coordination, you have to pick some decision process that's legible and that determines your behavior. E.g. if you change a mental element that's related to many other elements, then you either affect the context of those other elements, or you create a Doppelgänger of the element (like choosing between pleiotropy and paralogy). Bottlenecks might create narrowings: you can't be too spread out in mindspace, which maybe implies you have to be in one realm or another.
  • Autogenous specialization. One reason species speciate is that they specialize for different niches. Minds are for expanding one's external niche, but minds also create internal niches for their own elements, in the same way that a genome pool provides a niche for each genetic locus. Which allele is best at a locus depends on the other genes it will be expressed along with (as well as the "external" environment), and which element is most suited in some mental context depends on the other elements in that context (as well as the "external" task). There are different systems of interoperability, and these systems induce themselves. E.g., programming in a way that follows the discipline that functions don't mutate arguments, will set up a situation where a function that does mutate an argument is confusing and breaks things--such a function is ill-suited to the niche induced by the rest of the code. Autogenous (= self-generated, self-originating) specialization that's also autocausal (self-inducing) can run away with itself, leading to a lot of specialization distance--very distinct species or minds.
  • Activator-inhibitor discretization. Activator-inhibitor systems can create discrete regions out of a roughly uniform and continuous substrate. As a familiar example, waves form this way: the wind pushing on the raised part of the wave causes the wave to become more raised, which causes the wind to catch on the wave more forcefully (self-activation), while the lowered region catches even less wind; at the same time, the raised part flattens out because the trough lets the raised water fall into it (inhibition). See Wikipedia on Turing patterns. [Figure: simulated Turing patterns, modified from https://pmontalb.github.io/TuringPatterns/.] Autocausal autogenous specialization (activator) combined with bottlenecks (inhibitor) might create analogous patterns in mindspace. As an analogy / example: Semitic languages such as Hebrew use a triliteral system, where most roots have three letters and then words--nouns, inflected verbs, adjectives--are formed from roots by patterns. This is an unusual system. Why does it exist? A possible explanation is a self-reinforcing loop between lexicon and morphology: the (often transfix) patterns of inflection are regularized to assume triliteral roots because there are many triliteral roots, and when roots are coined they're coined to be triliteral because the patterns of inflection assume triliteral roots.
  • Local convergence within a cluster. There may be basins of attraction around clusters in mindspace, because the autogenous niche created by the presence of the cluster-defining features might be enough to induce an instrumental demand for specific mental elements, so any mind in that cluster will have a demand for those elements. As an analogy: a species can specialize to be water-dwelling and fast-moving; if it does, then it will (convergently, instrumentally) evolve smooth skin, whatever its phylogenetic origin.
  • Non-convexity of synergy. Minds maybe can't just be combined and superseded by mixing together all their elements. As an analogy, Magic: The Gathering is a game where you make decks of spells and then fight other decks. Many cards have effects that combine with the effects of specific other cards to be more powerful than the sum of the two cards on their own. Whole decks are built to be very powerful by incorporating many such synergies. But if you took two very good decks with very different styles and made a deck that's just half of one and half of the other, it would probably be a lot worse.
  • Non-integrability. For the reasons described here, it might not always be feasible to integrate different kinds of cognition into a mind or agent that's well-described as unified.
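The activator-inhibitor mechanism in the list above can be sketched numerically: a 1D field with short-range activation and long-range inhibition (a "Mexican hat" interaction kernel) turns a near-uniform substrate into discrete saturated regions. All parameters are illustrative choices, not anything from the post.

```python
import numpy as np

# Toy sketch of activator-inhibitor discretization: short-range activation
# plus long-range inhibition separates a near-uniform 1D field into
# discrete high and low regions. Parameters are illustrative.

rng = np.random.default_rng(0)
n = 200
u = 0.5 + 0.01 * rng.standard_normal(n)    # near-uniform field with tiny noise

x = np.arange(-15, 16)
activator = np.exp(-x**2 / 4.0)            # narrow positive kernel
inhibitor = 0.5 * np.exp(-x**2 / 36.0)     # wide negative kernel
kernel = activator - inhibitor             # "Mexican hat" lateral interaction

for _ in range(200):
    # each cell is pushed up by nearby above-average cells
    # and pushed down by more distant ones
    influence = np.convolve(u - u.mean(), kernel, mode="same")
    u = np.clip(u + 0.1 * influence, 0.0, 1.0)

# The initially uniform field ends up as alternating saturated regions:
# discrete clusters emerged from a continuous substrate.
```

The design point is that nothing in the initial conditions picks out where the regions fall; the tiny noise is amplified at a preferred spatial scale set by the two kernel widths, which is the loose analogy to clusters forming in mindspace.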

Reasons realms might not exist

  • Absoluteness of structure. It could be that many or all structures are absolute: in and for any mind, the more they're usefully understood, the more they take the same form. (Or in other words, the cosmos might consist of Things.)
  • Absoluteness of constraints. Some constraints on the structure of mind are absolute. E.g. Bayesian or information-theoretic limits on empirical knowledge, and computational complexity constraints on problem-solving, apply to any mind of any kind. Since all minds have to grapple with these constraints, all minds will have some pressures in common affecting their shape.
  • Undoing bottlenecks with abstraction. Some bottlenecks listed above can be undone by thinking abstractly. For example, there may not be a conservation law of attention or interest, because a mind can e.g. think about all "nice" geometric shapes at once using the abstraction "smooth compact Riemannian 2-manifold".
  • Integrability. Maybe anything that's worth understanding can be understood by a unified mind after all. IDK. In that case, multiple synergies that don't naively mix can be combined.
  • Strategy-stealing. To the extent that it's possible for a mind to perform almost as well as any other mind on any given task, cognitive realms can't be meaningful in terms of performance. See "The strategy-stealing assumption", and compare this article on the tractability of goals not depending too much on their supergoals.
Comments
  • Non-epistemic thinking. An agent might rearrange itself to be suitable for different tasks in a way that's not easy to understand as following rules that produce accurate beliefs. Again, evolution may be an example: although segments of the genome can sometimes be taken to correspond to something (e.g. a niche or element of the environment), they don't seem to constitute propositions (besides a monotone "this code-fragment is useful in this context"), and it's not obvious to me that you'd want to say that an agent has beliefs constituted by something other than propositions. It might be wrong to call this "thinking", but it's at least rearrangement towards suitability, and in the case of evolution can be very strong, strong enough to matter. Of course, the laws of information theory still apply; the point is that this sort of mind or agent may not be well-interpretable as having beliefs in the sense of propositions, which is a main meaning of the everyday word "belief".

I think a good example of this is minds that optimize for competitiveness in decision theory. For example, negotiation and persuasion.

the classical understanding of negotiation often recommends "rationally irrational" tactics in which an agent handicaps its own capabilities in order to extract concessions from a counterparty: for example, in the deadly game of chicken, if I visibly throw away my steering wheel, oncoming cars are forced to swerve for me in order to avoid a crash, but if the oncoming drivers have already blindfolded themselves, they wouldn't be able to see me throw away my steering wheel, and I am forced to swerve for them.

Also, skill at self-preservation could have been continuously optimized/selected for at all stages of the evolution of intelligence, including early stages. This includes the neolithic period, where language existed but not written language, and there was extremely limited awareness of how to succeed at thinking or even of what thinking is.

It seems plausible that the reason [murphyjitsu] works for many people (where simply asking “what could go wrong?” fails) is that, in our evolutionary history, there was a strong selection pressure in favor of individuals with a robust excuse-generating mechanism. When you’re standing in front of the chief, and he’s looming over you with a stone axe and demanding that you explain yourself, you’re much more likely to survive if your brain is good at constructing a believable narrative in which it’s not your fault.

It wouldn't be surprising if non-epistemic thinking was already substantially evolved and accessible/retrievable in humans, in which case research into distant cognitive realms is substantially possible with resources that are currently available.

For example, negotiation and persuasion

Oh yeah, that's (potentially) a great example. At least in the human regime, it does seem like you can get sets of people relating to each other so that they're very deeply into conflict frames. I wonder if that can extend to arbitrarily capable / intelligent agents.