Metaphilosophy I: Philosophy as Extracting Implicit Patterns from S1 into S2

interstice

Hello LW. As I've mentioned, I'm starting a blog about philosophy and physics. Here's the first post proper, about "meta-philosophy", i.e. what even is philosophy, and how could we make a program that does philosophy? I think people here might find section 1.3 most interesting. Without further ado:

Intro

1.0 Since I'm planning to write some "philosophy" of a sort, I'm going to explain what I mean by that term and how and why such endeavors are justified. As we'll see, it's inevitable that the explanation is circular to an extent.

1.0.0 I'm not really going to try to explain or justify my account here in much detail; hopefully there's just enough to make the rest of the document intelligible.
1.0.1 There are two ways you could approach this, internal or external. We could explore what it's like to do philosophy from a first-person perspective, or try to describe from the outside an algorithm or system that can do philosophy. Here I'll do both, internal first, then an external toy model.

1.1 Philosophy as development of highest-level concepts

1.1.1 To start with, here's a sketchy account of epistemology. We have a collection of concepts, interpreted in a very broad sense. We have scientific theories of phenomena, procedural knowledge of how to do things, verbal knowledge of everyday things such as directions, subverbal knowledge of how to open a door, etc.

1.1.2 Concepts are justified by how good they are at predicting things and how useful they are for getting stuff done. Much of this is done via reference to other concepts.

1.1.3 Some concepts have a "hierarchical" relationship to others. For example, high-level concepts such as the cell theory of biology might be used to interpret and plan particular experiments. That theory might in turn be justified in terms of a high-level theory of how "the scientific method" works, plus particular experimental results.

1.1.4 We can climb these hierarchies until we arrive at very general concepts which cannot themselves be justified in terms of higher-level concepts (although they might be circularly justified in terms of each other), e.g. theories of ethics, epistemology, ontology, etc. The development of these highest-level concepts is what I call philosophy. ^[1]

1.2 Anti-foundationalism

1.2.1 How is it possible to develop the highest-level concepts, since by definition there are no higher-level concepts which tell us the "right" way to update them?

1.2.2 The answer can be found in noting that most of the process of concept updating is already not done in accordance with a high-level concept. At a low level, for example, sensory concepts are unconsciously learned from experience. At a higher level, abstract concepts can appear in an intuitive "flash", or can be assimilated from culture by rote.

1.2.3 To use a graph theory metaphor, the overall structure of the set of beliefs is not globally tree-like; evidential support can "flow" from both high-level and low-level concepts to either.

1.2.3.1 Continuing the graph metaphor, high-level concepts might correspond to those with a large outdegree, governing many subconcepts. In the vicinity of such a concept, the graph might look locally tree-like.
1.2.3.2 Given the circularity, are there any constraints on what the overall set of concepts is like? Data from the external world provides one constraint, and a tendency towards simplicity in the overall "graph" structure another.
1.2.3.3 The picture here is similar to a logical inductor updating based on incoming data, with the difference that I want to allow new "traders" to be introduced.
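
To make the graph metaphor in 1.2.3 concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of the account above: the concept names, the edges, and the reading of an edge A -> B as "A lends evidential support to B". It only shows that a concept can have the largest outdegree while the graph as a whole still contains cycles, so it is not a tree.

```python
# Toy belief graph. An edge A -> B means "A lends evidential support to B".
# All concept names and edges are hypothetical, chosen only for illustration.
support = {
    "scientific method": ["cell theory", "germ theory", "experiment design"],
    "cell theory": ["microscope observation", "experiment design"],
    "germ theory": ["microscope observation"],
    "experiment design": ["microscope observation"],
    # Support also flows back up from low-level observations.
    "microscope observation": ["cell theory", "scientific method"],
}


def outdegree(graph):
    """Number of concepts each concept directly supports."""
    return {node: len(targets) for node, targets in graph.items()}


def has_cycle(graph):
    """Depth-first search for a directed cycle (i.e. circular support)."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True  # reached a node that is still on the current path
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(graph))


degrees = outdegree(support)
most_general = max(degrees, key=degrees.get)
print(most_general, has_cycle(support))
```

In this toy graph the highest-outdegree node plays the role of a "high-level concept", yet it is itself supported by an observation further down, which is the non-tree-like structure described in 1.2.3.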

1.2.4 New high-level concepts can be created if they successfully compress a large number of lower-level ones. The "hierarchical" structure of the seemingly topmost concepts can be changed.

1.2.5 Philosophy, then, consists of the enterprise of intentionally updating these highest-level concepts.

1.2.5.1 How can this be done systematically? The process of concept updating can be influenced in many ways. Attention can be brought to bear on particularly confusing concepts. The implications of a set of concepts can be thought through, perhaps refuting or updating them. New information can be learned which ultimately influences a concept (more on this later).
1.2.5.2 For example, the Socratic dialogues draw attention to implicitly held concepts of e.g. justice and draw out their implications.

1.2.6 The basic moves used in philosophical reasoning are not different from those of "ordinary" reasoning. The difference is their scope: the highest-level concepts.

1.2.6.1 All the contents of the mind ultimately feed back and influence its highest-level concepts.

1.3* An external model

1.3.0 This section can be skipped if you're just interested in the rest of the blog, but people interested in metaphilosophy for its own sake might find it interesting.

1.3.1 Here we're trying to develop an "external" account of philosophy. That is, we want to describe some sort of algorithm or system that "does philosophy" from the outside.

1.3.2 The starting point will be a system that can be described as performing quick, intuitive "System 1" and slow, deliberative "System 2" reasoning.

1.3.2.1 "System 1 reasoning" is the output of a large, learned, relatively shallow computational graph which is learned statistically from data. Let's call the learned model M.
1.3.2.2 "System 2 reasoning" is the result of M iterating over its own past outputs, using either a scratchpad or internal recurrence (or both).
1.3.2.3 Autoregressive transformers are examples. The human brain is probably an example.
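
The S1/S2 split in 1.3.2 can be sketched in a few lines of Python. This is a toy under stated assumptions, not a claim about how transformers or brains work: the function `M` here is a hand-written stand-in for a learned model (it performs one Collatz step, an arbitrary choice of a cheap fixed-depth computation), and `system2` is a hypothetical name for the loop that feeds M its own past outputs through a scratchpad.

```python
def M(state):
    """Stand-in for the learned S1 model: one cheap, fixed-depth step.

    The Collatz step is an arbitrary illustrative choice, not from the post.
    """
    return state // 2 if state % 2 == 0 else 3 * state + 1


def system2(model, start, max_steps=1000):
    """S2 as iteration: repeatedly apply the model to its own last output.

    The scratchpad records every intermediate output, so later steps (or an
    outside observer) can inspect the whole chain of reasoning.
    """
    scratchpad = [start]
    while scratchpad[-1] != 1 and len(scratchpad) <= max_steps:
        scratchpad.append(model(scratchpad[-1]))
    return scratchpad


trace = system2(M, 6)
print(trace)  # [6, 3, 10, 5, 16, 8, 4, 2, 1]
```

The point of the sketch is only that a single call to M does a bounded amount of work, while the loop around it can run for a number of steps that depends on the input.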

1.3.3 Before describing "philosophy" per se, let's describe something that this system can do in general: S2 reasoning can discover patterns in the S1 learned model that are too computationally complex for M's learning algorithm to discover on its own. We can call this "pattern extraction".

1.3.3.1 This requires S2 to be able to query the S1 on various inputs, compare the results, and notice patterns in them. Fortunately, this is pretty easy in the above setup.
1.3.3.2 For example, a shape detector within M might learn to detect a "circle" with an ad hoc sum of threshold functions, while S2 reasoning could recognize (by trying out examples) that points which intuitively match the "circle" concept are always those within a given distance of a point.
1.3.3.3 As a result of recurrence, S2 reasoning is able to find short programs explaining some data, while S1 learning can only find short circuits.
1.3.3.4 Of course, this assumes there is some environmental regularity of the sort that S2 can recognize in the data that M is trained on.
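
The circle example in 1.3.3.2 can be played out as a small Python sketch. The setup is assumed for illustration: the S1 detector is hand-built as a sum of 64 half-plane threshold units (standing in for whatever M would actually learn), the centre and radius values are arbitrary, and the names `s1_is_circle`, `s2_extract_rule`, and `short_rule` are hypothetical. S2 never reads the detector's internals; it only queries it on probe points and summarises the positives as "within some distance of some point".

```python
import math

# Hidden from S2; arbitrary illustrative values.
TRUE_CENTER, TRUE_RADIUS, K = (0.3, -0.2), 0.98, 64


def s1_circle_score(p):
    """Ad hoc S1 detector: a sum of K threshold units, one per direction."""
    dx, dy = p[0] - TRUE_CENTER[0], p[1] - TRUE_CENTER[1]
    fired = 0
    for k in range(K):
        angle = 2 * math.pi * k / K
        # Each unit fires if the point is on the inner side of one half-plane.
        if dx * math.cos(angle) + dy * math.sin(angle) <= TRUE_RADIUS:
            fired += 1
    return fired


def s1_is_circle(p):
    """The intuitive verdict: 'circle' only if every unit fires."""
    return s1_circle_score(p) == K


def s2_extract_rule(query, probes):
    """S2: query S1 on many probes, then compress the positives into
    a centre (their mean) and a radius (their largest distance from it)."""
    positives = [p for p in probes if query(p)]
    cx = sum(p[0] for p in positives) / len(positives)
    cy = sum(p[1] for p in positives) / len(positives)
    radius = max(math.hypot(p[0] - cx, p[1] - cy) for p in positives)
    return (cx, cy), radius


# Probe points on a regular grid covering [-2, 2] x [-2, 2].
probes = [(-2 + 0.05 * i, -2 + 0.05 * j) for i in range(81) for j in range(81)]
centre, radius = s2_extract_rule(s1_is_circle, probes)


def short_rule(p):
    """The extracted pattern: a point is 'circle' if it is within `radius` of `centre`."""
    return math.hypot(p[0] - centre[0], p[1] - centre[1]) <= radius


agreement = sum(short_rule(p) == s1_is_circle(p) for p in probes) / len(probes)
print(centre, radius, agreement)
```

The extracted rule has three parameters, while the detector it summarises has 64 units; that gap is a small-scale version of the "short program versus short circuit" contrast in 1.3.3.3. The recovered radius depends on the probe grid, so it only approximates the hidden one.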

1.3.4 We will describe philosophy as a particular instance of this "extraction". First we need one more assumption: We assume that M contains a "mesa-optimizer". That is, M contains learned patterns which act as an agent, attempting to learn and act in the world and (perhaps) interact with other agents.

1.3.4.1 The agent personas learned by LLMs are examples; humans are arguably another.

1.3.5 Now "philosophy", for the meta-learned agent in M, is just the above "extraction of patterns in S1", applied to patterns relevant to the functioning of the mesa-optimizer.

1.3.5.1 For example, concepts of epistemology might be extracted from recalling particular instances in which "learning" was useful. Decision-theoretic concepts could be extracted from instances where decisions between different choices needed to be made.
1.3.5.2 In a social learning context, a culture might provide stories illustrating good behavior, which are internalized. Concepts of e.g. "honor" could be derived from these stories, and these could be further abstracted.

1.3.6 In summary, we have the following sequence: there are certain general facts about what the world is like and how best to act and learn in it; these facts impinge upon the training of a statistical model M; and some of the facts can be recovered by "System 2" reflection on some of the patterns stored in M. This is a model of "philosophy", described externally.

[1] These ascending levels of abstraction can be seen in the well-known game of "clicking the top link of a Wikipedia page until you get to philosophy".
