LW-style rationality is about winning. Instrumentally, most rationalists think they can win, in part, by seeking truth. I frequently run into comments here where folks take truth as an effectively "sacred" concern in the sense that truth matters to them above all else. In this post, I hope to convince you that, because of the problem of the criterion, seeking truth too hard and holding it sacred work against learning the truth because doing so makes you insufficiently skilled at reasoning about non-truth-seeking agents.

Let's settle a couple things first.

What is truth? Rather than get into philosophical debates about this one, let's use a reasonable working definition that by truth we mean "accurate predictions about our experiences". This sort of truth is nice because it makes minimal metaphysical claims (e.g. no need to suppose something like an external reality required by a correspondence theory of truth) and it's compatible with Bayesianism (you strive to believe things that match your observations). Also, it seems to be the thing we actually care about when we seek truth: we want to know things that tell us what we'll find as we experience the world.

What's the problem of the criterion? Really, you still don't know? First time here? In short, the problem of the criterion is the problem that to know something means you know how to know it, but knowing how to know something is knowing something, so it creates an infinite loop (we call this loop epistemic circularity). The only way to break the loop is to ground knowledge (information you believe predicts your experiences, i.e. stuff you think is true) in something other than more knowledge. That something is purpose.

So, what's the problem with prioritizing truth? At first glance, nothing. Predicting your experiences accurately is quite useful for getting more of the kinds of experiences you want, which is to say, winning. The problems arise when you over-optimize for truth.

The trouble is that not all humans, let alone all agent-like things in the universe, are rational or truth seeking (they have other purposes that ground their thinking). This means that you're going to need some skill at reasoning about non-truth-seeking agents. But the human brain is kinda bad at thinking about minds not like our own. A highly effective way to overcome this is to build cognitive empathy for others by learning to think like them (not just to model them from the outside, but to run a simulated thought process as if you were them). But this requires an ability to prioritize something other than truth because the agent being simulated doesn't and because our brains can't actually firewall off these simulations cleanly from "our" "real" thoughts (cf. worries about dark arts, which we'll discuss shortly). Thus in order to accurately model non-truth-seeking agents, we need some ability to care about stuff other than seeking truth.

The classic failure mode of failing to accurately model non-truth-seeking agents, which I think many of us are familiar with, is the overly scrupulous, socially awkward rationalist or nerd who is very good at deliberate thinking and can reckon all sorts of plans that should work in theory to get them what they want, but which fall apart when they have to interact with other humans. Consequences: difficulties in dating and relationships, trouble convincing others about the biggest problems in the world, being incentivized to only closely associate with other overly scrupulous and socially awkward rationalists, etc.

This trap where seeking truth locks one out from the truth when attempting to model non-truth-seeking agents is pernicious because the only way to address it is to ease up on doing the thing you're trying to do: seek truth.

Please don't round off that last sentence to "give up truth seeking"! That's definitely not what I'm saying! What I'm saying is that trying too tightly to optimize for the truth Goodharts yourself on truth. Why? Two reasons. First, there's a gap where the objective of truth can't be better optimized for past some point because the problem of the criterion creates hard limits on truth seeking given the ungrounded foundation of our knowledge and beliefs. Second, you can't model all agents if you simulate them using your truth-seeking mind. So the only option (lacking future tech that would let us change how our brains work, anyway) is to ease off a bit and let in other concerns than truth.

But isn't this a dark art to be avoided? Maybe? The reality is that you are not yourself actually a truth-seeking-agent, no matter how much you want it to be so. Humans are not designed to optimize for truth, but are very good at deceiving themselves into thinking they are (or deceiving themselves of any number of things). We instead care about lots of things, like eating food, breathing, physical safety, and, yes, truth. But we can't optimize for any one of those things to the total exclusion of the others, because once we're doing our current best at winning our desires we can only trade off along the optimization curve. Past some point, trying to get more truth will only make each of us worse off overall, not better, even if we did each succeed in getting more truth, which I don't think we can get anyway due to Goodhart effects. It's not a dark art to accept we're human rather than idealized Bayesian agents with hyperpriors for the truth; it's just facing the world as we find it.

And to bring this around to my favorite topic, this is why the problem of the criterion matters: that you are not a perfectly truth concerned agent means you are grounding your knowledge in things other than truth, and the sooner you figure that out the sooner you can get on with more winning.

Thanks to Justis for useful feedback on an earlier draft of this essay via the LW feedback service.

As an arealist, I certainly can't disagree with your definition of truth, since it matches mine. In fact, I stated on occasion that tabooing true, say, by replacing with "accurate" where possible, is a very useful exercise.

The problem of the criterion dissolves once you accept that you are an embedded agent with a low-fidelity model of the universe you are embedded in, including self. There is no circularity. Knowing how to know something is an occasionally useful step, but not essential for extracting predictions from the model of the universe, which is the agent's only action, sort of by definition. Truth is also an occasionally useful concept, but accuracy of predictions is what makes all the difference, including being able to model such parts of the world as other agents, with different world models. Knowledge is a bad term for accuracy of the model of the world, or as you said, "accurate predictions about our experiences". Accepting your place in the world as one of the multitude of embedded agents, with various internal models, who also try to (out)model you is probably one of the steps toward a more accurate model.

I agree with the existence of the failure mode and the need to model others in order to win, and also in order to be a kind person who increases the hedons in the world.

But isn't it the case that if readers notice they're good at "deliberate thinking and can reckon all sorts of plans that should work in theory to get them what they want, but which fall apart when they have to interact with other humans", they could add a <deliberately think about how to model other people> as part of their "truth" search and thereby reach your desired end point without using the tool you are advocating for?

In theory, yes. In practice this tends to be impractical because of the amount of effort required to think through how other people think in a deliberate way that accurately models them. Most people who succeed in modeling others well seem to do it by having implicit models that are able to model them quickly.

I think the point is that people are complex systems that are too complex to model well if you try to do it in a deliberate, system-2 sort of way. Even if you eventually succeed in modeling them, you'll likely get your answer about what to do way too late to be useful. The limitations of our brains force us to do something else (heck, the limitations of physics seem to force this, since idealized Solomonoff inductors run into similar computability problems, cf. AIXI).

This balance can be radically off kilter if an agent only has access to deliberate modelling. "Read books to understand how to deal with humans passingly" is a strategy seen in the wild for those that don't instinctively build strong implicit models.

What is truth? Rather than get into philosophical debates about this one, let’s use a reasonable working definition that by truth we mean “accurate predictions about our experiences”.

I would have thought that definition was less impacted by the PotC than most. You can check directly that a predictive theory is predicting, so you don't need a priori correctness.

This means that you’re going to need some skill at reasoning about non-truth-seeking agents. But the human brain is kinda bad at thinking about minds not like our own. A highly effective way to overcome this is to build cognitive empathy for others by learning to think like them (not just to model them from the outside, but to run a simulated thought process as if you were them). But this requires an ability to prioritize something other than truth because the agent being simulated doesn’t and because our brains can’t actually firewall off these simulations cleanly from “our” “real” thoughts (cf. worries about dark arts, which we’ll discuss shortly).

If you're capable of firewalling off your model of non-truth-seekers, I don't see a further problem. You don't have to stop prioritising truth in order to model non-truth-seekers because prioritising truth requires modelling non-truth-seekers as non-truth-seekers.

This post was persuasively and efficiently articulated, so thank you. A handful of initial reactions:

  1. You seem to have anticipated this response. The definition you begin with—truth as "accurate predictions about our experiences"—is fairly narrow. One could respond that what you identify here are the effects of truth (presumably? but maybe not necessarily), while truth is whatever knowledge enables us to make these predictions. In any case, it doesn't seem self-evident that truth is necessarily concerned with making predictions, and I wonder how much of the argument hinges upon this strict premise. How would it alter if etc.

  2. Relatedly, you say that when we seek truth, "we want to know things that tell us what we’ll find as we experience the world." Rather than primarily aiming to predict in advance what we'll find, might we instead aim to know the things that enable us to understand whatever we actually do find, regardless of whether we expected it (or whether it is as we predicted it would be)? Maybe this knowledge amounts to the same thing in the end. I don't know.

  3. You refer to the thing outside of truth that grounds the quest for it as purpose. Would belief or faith be an acceptable substitute here?

  4. It would seem that [desire for] knowledge of truth already encompasses or takes into account the existence of non-truth-seeking agents and the knowledge requisite to accurately modeling them.

  5. Given your statement in the antepenultimate paragraph—"the reality is that you are not yourself actually a truth-seeking-agent, no matter how much you want it to be so"—this piece ultimately appears to be a reflection on self-knowledge. By encouraging the rigidly truth-obsessed dork to more accurately model non-truth-seeking agents, you are in fact encouraging him to more accurately model himself. So again, the desire for truth (as self-knowledge, or the truth about oneself) still guides the endeavor. (This was the best paragraph in the piece, I think.)

You seem to have anticipated this response. The definition you begin with—truth as "accurate predictions about our experiences"—is fairly narrow. One could respond that what you identify here are the effects of truth (presumably? but maybe not necessarily), while truth is whatever knowledge enables us to make these predictions. In any case, it doesn't seem self-evident that truth is necessarily concerned with making predictions, and I wonder how much of the argument hinges upon this strict premise. How would it alter if etc.

Not much. You could choose some other kind of truth definition if you like. My goal was to use a deflationary definition of truth in order to avoid stumbling into philosophical minefields, and because I'm not committed to metaphysical realism myself, so I'd be dishonest if I used such a definition.

Relatedly, you say that when we seek truth, "we want to know things that tell us what we’ll find as we experience the world." Rather than primarily aiming to predict in advance what we'll find, might we instead aim to know the things that enable us to understand whatever we actually do find, regardless of whether we expected it (or whether it is as we predicted it would be)? Maybe this knowledge amounts to the same thing in the end. I don't know.

I'd say that amounts to the same thing. There are some links in the post, about Bayesianism and the predictive processing model of the brain, that are relevant to the case for this.

You refer to the thing outside of truth that grounds the quest for it as purpose. Would belief or faith be an acceptable substitute here?

Maybe. "Purpose" is here a stand-in term for a whole category of things like what Heidegger called Sorge. I wrote a post about this topic, although it's not necessarily exhaustive. I could see certain notions of belief and faith fitting in here.

It would seem that [desire for] knowledge of truth already encompasses or takes into account the existence of non-truth-seeking agents and the knowledge requisite to accurately modeling them.

As I think I addressed a couple of points up: yes, but humans as actually implemented are formed such that this is insufficient.

Given your statement in the antepenultimate paragraph—"the reality is that you are not yourself actually a truth-seeking-agent, no matter how much you want it to be so"—this piece ultimately appears to be a reflection on self-knowledge. By encouraging the rigidly truth-obsessed dork to more accurately model non-truth-seeking agents, you are in fact encouraging him to more accurately model himself. So again, the desire for truth (as self-knowledge, or the truth about oneself) still guides the endeavor. (This was the best paragraph in the piece, I think.)

Seeking truth starts at home, so to speak. :-)

Many thanks for the reply. I realize it may be a tad absurd to question the centrality of prediction, especially given the context of these posts, so I appreciate the judicious responses + links.

Many of the discussions on this site deal in subjects / frameworks of which I am largely ignorant, so I am here to learn (and have fun). Looking forward to reading your thoughts in the future : )