In this post I lay out a model of beliefs and communication that identifies two types of things we might think of as ‘beliefs,’ how they are communicated between people, how they are communicated within people, and what this might imply about intellectual progress in some important fields. As background, Terence Tao has a blog post describing three stages of mathematics: pre-rigorous, rigorous, and post-rigorous. It’s only about two pages; the rest of this post will assume you’ve read it. Ben Pace has a blog post describing how to discuss the models generating an output, rather than the output itself, which is also short and is related, but has important distinctions from the model outlined here.
[Note: the concept for this post comes from a talk given by Anna Salamon, and I sometimes instruct for CFAR, but the presentation in this post should be taken to only represent my views.]
If a man will begin with certainties, he shall end with doubts, but if he will be content to begin with doubts he shall end in certainties. -- Francis Bacon
Probably the dominant model of conversations among philosophers today is Robert Stalnaker’s. (Here’s an introduction.) A conversation has a defined set of interlocutors, and some shared context, and speech acts add statements to the context, typically by asserting a new fact.
I’m not an expert in contemporary philosophy, and so from here on out this is my extension of this view that I’ll refer to as ‘formal.’ Perhaps this extension is entirely precedented, or perhaps it’s controversial. My view focuses on situations where logical omniscience is not assumed, and thus simply pointing out the conclusion that arises from combining facts can count as such an assertion. Proper speech considers this and takes inferential distance into account; my speech acts should be derivable from our shared context or an unsurprising jump from it. Both new logical facts and environmental facts count as adding information to the shared context. That I am currently wearing brown socks while writing this part of the post is not something you could derive from our shared context, but is nevertheless ‘unsurprising.’
It’s easy to see how a mathematical proof might fit into this framework. We begin with some axioms and suppositions, and then we compute conclusions that follow from those premises, and eventually we end up at the theorem that was to be proved.
If I make a speech act that’s too far of a stretch--either because it disagrees with something in the context (or your personal experience), or is just not easily derivable from the common context--then the audience should ‘beep’ and I should back up and justify the speech act. A step in the proof that doesn’t obviously follow means I need to expand the proof to make it clear how I got from A to B, or how a pair of statements that appear contradictory is in fact not contradictory. (“Ah, by X I meant the restricted subset X’, such that this counterexample is excluded; my mistake.”)
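To make the ‘beep’ dynamic concrete, here is a minimal sketch in Python. Everything in it is my own illustrative assumption rather than Stalnaker’s formalism: the class name Conversation, the representation of the shared context as a set of strings plus (premises, conclusion) rules, the toy ‘not ’ prefix for negation, and the choice to treat ‘easily derivable’ as ‘derivable in one rule application.’

```python
def negate(statement):
    """Toy negation (an assumption of this sketch): toggle a leading 'not '."""
    return statement[4:] if statement.startswith("not ") else "not " + statement


class Conversation:
    """Hypothetical model of a shared context; not Stalnaker's own formalism."""

    def __init__(self, facts, rules):
        # Propositions every interlocutor already accepts.
        self.context = set(facts)
        # Inference rules everyone accepts, as (premises, conclusion) pairs.
        self.rules = [(frozenset(premises), conclusion) for premises, conclusion in rules]

    def one_step(self):
        """Everything reachable from the context with a single rule application."""
        return {concl for prem, concl in self.rules if prem <= self.context}

    def assert_(self, statement, unsurprising=False):
        """Try to add a statement to the context; return 'accepted' or 'beep'."""
        # Contradicting the context always triggers a beep.
        if negate(statement) in self.context:
            return "beep"
        # Accept restatements, short inferential steps, and unsurprising
        # environmental facts; anything else is too far of a stretch.
        if statement in self.context or statement in self.one_step() or unsurprising:
            self.context.add(statement)
            return "accepted"
        return "beep"


talk = Conversation(facts={"A"}, rules=[(["A"], "B"), (["B"], "C")])
results = [
    talk.assert_("C"),      # two steps away from the context: too far
    talk.assert_("B"),      # one step away: fine
    talk.assert_("C"),      # now only one step away
    talk.assert_("not A"),  # contradicts the context
    talk.assert_("I am wearing brown socks", unsurprising=True),
]
print(results)
```

The first attempt to assert C beeps because the speaker skipped B; after backing up and asserting B, the same statement goes through. That is the sense in which expanding a proof step repairs the conversation.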
This style of conversation seems to be minimizing surprise on the low level; from moment to moment, actions are being taken in a way that views justification and validation by independent sources as core constraints. What is this good for? Interestingly, the careful avoidance of surprises on the low level permits surprises on the high level, as a conclusion reached by airtight logic can be as trustworthy as the premises of that logic, regardless of how bizarre the conclusion seems. A plan fleshed out with enough detail that it can be independently reconstructed by many different people is a plan that can scale to a large organization. The body of scientific knowledge is communicated mostly this way; Nullius in verba requires this sort of careful communication because it bans the leaps one might otherwise make.
One way to model communication is a function that takes objects of a certain type and tries to recreate them in another place. A telephone takes sound waves and attempts to recreate them elsewhere, whereas an instant messenger takes text strings and attempts to recreate them elsewhere. So conjugate to the communication methodology is ‘the thing that can be communicated by this methodology’; I’m going to define ‘public positions’ as the sort of beliefs that are amenable to communication through ‘formal communication’ (this style where you construct conclusions out of a chain of simple additions to the pre-existing context). The ‘public’ bit emphasizes that they’re optimized for justification or presentation; many things I believe don’t count as public positions because I can’t reach them through this sort of formal communication. For example, I find the smell of oranges highly unpleasant; I can communicate that fact about my preferences through formal communication but can’t communicate the preference itself through formal communication. The ‘positions’ bit emphasizes that they are defensible and legible; you can ‘know where I stand’ on a particular topic.
I’m going to call a different sort of belief one’s ‘private guts.’ By ‘guts,’ I’m pointing towards the historical causes of a belief (like the particular bit of my biochemistry that causes oranges to smell distasteful), or to the sense of a ‘gut feeling.’ By ‘private,’ I’m pointing towards the fact that this is often opaque or not shaped like something that’s communicable, rather than something deliberately hidden. If you’re familiar with Gendlin’s Focusing, ‘felt senses’ are an example of private guts.
What are private guts good for? As far as I can tell, lizards probably don’t have public positions, but they probably do have private guts. That suggests those guts are good for predicting things about the world and achieving desirable world states, as well as being one of the channels by which the desirability of world states is communicated inside a mind. It seems related to many sorts of ‘embodied knowledge’, like how to walk, which is not understood from first principles or in an abstract way, or habits, like adjective order in English. A neural network that ‘knows’ how to classify images of cats, but doesn’t know how it knows (or is ‘uninterpretable’), seems like an example of this. “Why is this image a cat?” -> “Well, because when you do lots of multiplication and addition and nonlinear transforms on pixel intensities, it ends up having a higher cat-number than dog-number.” This seems similar to gut senses that are difficult to articulate; “why do you think the election will go this way instead of that way?” -> “Well, because when you do lots of multiplication and addition and nonlinear transforms on environmental facts, it ends up having a higher A-number than B-number.” Private guts also seem to capture a category of amorphous visions; a startup can rarely write a formal proof that their project will succeed (generally, if they could, the company would already exist). The postrigorous mathematician’s hunch falls into this category, which I’ll elaborate on later.
There are now two sorts of interesting communication to talk about: the process that coheres public positions and private guts within a single individual, and the process that communicates private guts across individuals.
COHERENCE, FOCUSING, AND SCIENCE
Much of CFAR’s focus, and that of the rationality project in general, has involved taking people who are extremely sophisticated at formal communication and developing their public positions, and getting them to notice and listen to their private guts. An example, originally from Julia Galef, is the ‘agenty duck.’ Imagine a duck whose head points in one direction (“I want to get a PhD!”) and whose feet are pointed in another (mysteriously, this duck never wants to work on their dissertation). Many responses to this sort of intrapersonal conflict seem maladaptive; much better for the duck to have head and feet pointed in the same direction, regardless of which direction that is. An individual running a coherence process that integrates the knowledge of the ‘head’ and ‘feet’, or the public positions and the private guts, will end up more knowledgeable and functional than an individual that ignores one to focus on the other.
Discovering the right coherence process is an ongoing project, and even if I knew it as a public position it would be too long for this post. So I will merely leave some pointers and move on. First, the private guts seem highly trainable by experience, especially through carefully graduated exposure. Second, Focusing and related techniques (like Internal Double Crux) seem quite effective at searching through the space of articulable / understandable sentences or concepts in order to find those that resonate with the private guts, drawing forth articulation from the inarticulate.
It’s also worth emphasizing the way in which science depends on such a coherence process. The ‘scientific method’ can be viewed in this fashion: hypotheses can be wildly constructed through any method, because hypotheses are simply proposals rather than truth-statements; only hypotheses that survive the filter of contact with reality through experimentation graduate to full facts, at which point their origin is irrelevant, be it induction, a lucky guess, or the unconscious mind processing something in a dream.
Similarly for mathematicians, according to Tao. The transition from pre-rigorous mathematics to rigorous mathematics corresponds to being able to see formal communication and public positions as types, and learning to trust them over persuasion and opinions. The transition from rigorous mathematics to post-rigorous mathematics corresponds to having trained one’s private guts such that they line up with the underlying mathematical reality well enough that they generate fruitful hypotheses.
Consider automatic theorem provers. One variety begins with a set of axioms, including the negated conclusion, and then gradually expands outwards, seeking to find a contradiction (and thus prove that the desired conclusion follows from the other axioms). Every step of the way proceeds according to the formal communication style, and every proposition in the proof state can be justified through tracing the history of combinations of propositions that led from the initial axioms to that proposition. But the process is unguided, reliant on the swiftness of computer logic to handle the massive explosion of propositions, almost all of which will be irrelevant to the final proof. The human mathematician instead has some amorphous sense of what the proof will look like, sketching a leaky argument that is not correct in the details, but which is correctable. Something interesting is going on in the process that generates correctable arguments, perhaps even more interesting than what’s going on in the processes that trivially generate correct arguments by generating all possible arguments and then filtering.
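As a rough illustration of the unguided variety, here is a minimal sketch of propositional resolution refutation in Python. The function names (negate, resolve, refute), the ‘~’ prefix for negation, and the restriction to a single-literal conclusion are assumptions made for this sketch; real provers are far more sophisticated. The parents dictionary records which pair of clauses produced each derived clause, which is the ‘tracing the history’ property described above.

```python
from itertools import combinations


def negate(literal):
    """Return the complement of a literal such as 'p' or '~p'."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def resolve(c1, c2):
    """Yield every resolvent of two clauses (frozensets of literals)."""
    for lit in c1:
        if negate(lit) in c2:
            yield frozenset((c1 - {lit}) | (c2 - {negate(lit)}))


def refute(axioms, conclusion):
    """Blindly saturate axioms plus the negated conclusion.

    Returns (proved, parents), where parents maps each derived clause to
    the pair of clauses it was resolved from.
    """
    clauses = {frozenset(c) for c in axioms} | {frozenset([negate(conclusion)])}
    parents = {}
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for resolvent in resolve(c1, c2):
                if resolvent not in clauses:
                    parents.setdefault(resolvent, (c1, c2))
                if not resolvent:
                    # Empty clause: a contradiction, so the conclusion follows.
                    return True, parents
                new.add(resolvent)
        if new <= clauses:
            # Nothing new can be derived and no contradiction was found.
            return False, parents
        clauses |= new


# p, p -> q, q -> r (written as clauses); does r follow?
proved, parents = refute([["p"], ["~p", "q"], ["~q", "r"]], "r")
print(proved, len(parents))
```

Even in this three-axiom example the search derives clauses that play no role in the final contradiction, and the number of such irrelevant clauses grows quickly with the size of the problem. Nothing in the loop corresponds to a sense of where the proof is headed.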
STARTUPS, DOUBLE CRUX, AND CIRCLING
Somehow, people are sometimes able to link up their private guts with each other. This is considerably more fraught than linking up public positions; positions are of a type that is optimized for verifiability and reconstruction, whereas internal experiences, in general, are not. Even if we’re eating the same cake, how would we even check that our internal experience of eating the cake is similar? What about something simpler, like seeing the same color?
While the abstraction of formal conversation is fairly simple, it’s still obvious that there are many skills related to correct argumentation. Similarly, there seems to be a whole family of skills related to syncing up private guts, and rather than teaching those skills, this section will again be a pointer to where those skills could be learned or trained. Learning how to reproduce music is related to learning how to participate in jam sessions, but the latter is a much closer fit to this sort of communication.
The experience of startups is that small teams are best, primarily because of the costs of coordinative communication. Startups are often chasing an amorphous, rapidly changing target; a team that’s able to quickly orient in the same direction and move together, or trust in the guts of each other rather than requiring elaborate proofs, will often perform better.
While Double Crux can generate a crisp tree of logical deductions from factual disagreements, it often instead exposes conflicting intuitions or interpretations. While formal communication involves a speaker optimizing over speech acts to jointly minimize surprise and maximize progress towards their goal, Double Crux instead involves both parties in the optimization, and often causes them to seek surprises. A crux is something that would change my mind, and I expose my cruxes in case you disagree with them, seeking to change my mind as quickly as possible.
Cruxes also respect the historical causes of beliefs; when I say “my crux for X is Y,” I am not saying that Y should cause you to believe X, only that not-Y would cause me to believe not-X. This weaker filter means many more statements are permissible, and my specific epistemic state can be addressed, rather than playing a minimax game by which all possible interlocutors would be pinned down by the truth. In Stalnakerian language, rather than needing to emit statements that are understandable and justifiable by the common context, I only need the weaker restriction that those statements are understandable in the common context and justifiable in my private context.
Circling is also beyond the scope of this post, except as a pointer. It seems relevant as a potential avenue for deliberate practice in understanding and connecting to the subjective experience of others in a way that perhaps facilitates this sort of conversation.
As mentioned, this post is seeking to set out a typology, and perhaps crystallize some concepts. But why think these concepts are useful?
Primarily, because this seems to be related to the way in which rationalists differ from other communities with similar interests, or from their prior selves before becoming rationalists, in a way that resembles the difference between postrigorous mathematicians and rigorous mathematicians. Secondarily, because many contemporary issues of great practical importance require correctly guessing matters that are not settled. Financial examples are easy (“If I buy bitcoin now, will it be worth more when I sell it by more than a normal rate of economic return?”), but longevity interventions have a similar problem (“knowing whether or not this works for humans will take a human lifetime to figure out, but by that point it might be too late for me. Should I do it now?”), and it seems nearly impossible to reason correctly about existential risks without reliance on private guts (and thus on methods to tune and communicate those guts).