TL;DR: the problem of how to make decisions using multiple (potentially incompatible) worldviews (which I'll call the problem of meta-rationality) comes up in a range of contexts, such as epistemic deference. Applying a policy-oriented approach to meta-rationality, and evaluating worldviews by the quality of their advice, dissolves several undesirable consequences of the standard "epistemic" approach to deference.
Meta-rationality as the limiting case of separate worldviews
When thinking about the world, we’d ideally like to be able to integrate all our beliefs into a single coherent worldview, with clearly-demarcated uncertainties, and use that to make decisions. Unfortunately, in complex domains, this can be very difficult. Updating our beliefs about the world often looks less like filling in blank parts of our map, and more like finding a new worldview which reframes many of the things we previously believed. Uncertainty often looks less like a probability distribution over a given variable, and more like a clash between different worldviews which interpret the same observations in different ways.
Under “worldviews” I include things like ideologies, scientific paradigms, moral theories, perspectives of individual people, and sets of heuristics. The key criterion is that each worldview has “opinions” about the world which can be determined without reference to any other worldview. Although of course different worldviews can have overlapping beliefs, in general their opinions can be incompatible with those of other worldviews - for example:
- Someone might have severe uncertainty about far-reaching empirical claims, or about which moral theory to favor.
- A scientist might be investigating a phenomenon during a crisis period where there are multiple contradictory frameworks which purport to explain the phenomenon.
- Someone might buy into an ideology which says that nothing else matters except adherence to that ideology, but then feel a “common-sense” pull towards other perspectives.
- Someone might have a strong “inside view” on the world, but also want to partly defer to the worldviews of trusted friends.
- Someone might have a set of principles which guides their interactions with the world.
- Someone might have different parts of themselves which care about different things.
I think of “intelligence” as the core ability to develop and merge worldviews; and “rationality” as the ability to point intelligence in the most useful directions (i.e. deciding where intelligence should be applied). Ideally we’d like to always be able to combine seemingly-incompatible worldviews into a single coherent perspective. But we usually face severe limitations on our ability to merge worldviews together (due to time constraints, cognitive limitations, or lack of information). I’ll call the skill of being able to deal with multiple incompatible worldviews, when your ability to combine them is extremely limited, meta-rationality. (Analogously, the ideal of emotional intelligence is to have integrated many different parts of yourself into a cohesive whole. But until you’ve done so, it’s important to have the skill of facilitating interactions between them. I won’t talk much about internal parts as an example of clashing worldviews throughout this post, but I think it’s a useful one to keep in mind.)
I don’t think there’s any sharp distinction between meta-rationality and rationality. But I do think meta-rationality is an interesting limiting case to investigate. The core idea I’ll defend in this post is that, when our ability to synthesize worldviews into a coherent whole is very limited, we should use each worldview to separately determine an overall policy for how to behave, and then combine those policies at a high level (for example by allocating a share of resources to each). I’ll call this the policy approach to meta-rationality; and I’ll argue that it prevents a number of problems (such as over-deference) which arise when using other approaches, particularly the epistemic approach of combining the credences of different worldviews directly.
Comparing the two approaches
Let’s consider one central example of meta-rationality: taking into account other people’s disagreements with us. In some simple cases, this is straightforward - if I vaguely remember a given statistic, but my friend has just looked it up and says I’m wrong, I should just defer to them on that point, and slot their correction into my existing worldview. But in some cases, other people have worldviews that clash with our own on large-scale questions, and we don’t know how to (or don’t have time to) merge them together without producing a Frankenstein worldview with many internal inconsistencies.
How should we deal with this case, or other cases involving multiple inconsistent worldviews? The epistemic approach suggests:
- Generating estimates of how accurate we think each worldview’s claims are, based on its track record.
- Using these estimates to evaluate important claims by combining each worldview’s credences to produce our “all-things-considered” credence for that claim.
- Using our all-things-considered credences when making decisions.
This seems sensible, but leads to a few important problems:
- Merging estimates of different variables into all-things-considered credences might lead to very different answers depending on how it’s done, since after each calculation you lose information about the relationships between different worldviews’ answers to different questions. For example: worldview A might think that you’re very likely to get into MIT if you apply, but if you attend you’ll very likely regret it. And worldview B might think you have almost no chance of getting in, but if you do attend you’ll very likely be happy you went. Calculating all-things-considered credences separately would conclude you have a medium-good chance of a medium-good opportunity, which is much more optimistic than either worldview individually.
- One might respond that this disparity only arises when you’re applying the epistemic approach naively, where the non-naive approach would be to only combine worldviews' final expected utility estimates. But I think that the naive approach is actually the most commonly-used version of the epistemic approach - e.g. Ben Garfinkel talks about using deference to calculate likelihoods of risk in this comment; and Greg Lewis defends using epistemic modesty for “virtually all” beliefs. Also, combining expected utility estimates isn't very workable either, as I'll discuss in the next point.
- Worldviews might disagree deeply on which variables we should be estimating, with no clear way to combine those variables into a unified decision, due to:
- Differences in empirical beliefs which lead to different focuses. E.g. if one worldview thinks that cultural change is the best way to influence the future, and another thinks that technical research works better, it may be very difficult to convert impacts on those two areas into a "common unit" for expected value comparisons. Even if we manage to do so, empirical disagreements can lead one worldview to dominate another - e.g. if one falls prey to Pascal’s mugging, then its expected values could skew high enough that it effectively gets to make all decisions from then on, even if other worldviews are ignoring the mugging.
- Deep-rooted values disagreements. If worldviews don't share the same values, we can't directly compare their expected value estimates. Even if every worldview can formulate its values in terms of a utility function, there's no canonical way to merge utility estimates across worldviews with different values, for the same reason that there’s no canonical way to compare utilities across people: utility functions are equivalent up to positive affine transformations (multiplying one person’s utilities by 1000 doesn’t change their choices, but does change how much they’d influence decisions if we added different people’s utilities together).
- One proposed solution is variance normalization, where worldviews' utilities are normalized to have the same mean and variance. But that can allow small differences in how we differentiate the options to significantly affect how a worldview’s utilities are normalized. (For example, “travel to Paris this weekend” could be seen as one plan, or be divided into many more detailed plans: “travel to Paris today by plane”, “travel to Paris tomorrow by train”, etc.) It's also difficult to figure out what distribution to normalize over (as I'll discuss later).
- Some worldviews might contain important action-guiding insights without being able to generate accurate empirical predictions - for example, a worldview which tells us that following a certain set of principles will tend to go well, but doesn’t say much about which good outcomes will occur.
- Some worldviews might be able to generate accurate empirical predictions without containing important action-guiding insights. For example, a worldview which says “don’t believe extreme claims” will be right much more often than it’s wrong. But since extreme claims are the ones most likely to lead to extreme opportunities, it might only need to be wrong once or twice for the harm of listening to it to outweigh the benefits of doing so. The same goes for a worldview that has strong views about the world in general, but offers little advice for your specific situation.
- Since there are many other people whose worldviews are, from an outside perspective, just as trustworthy as ours (or more so), many other worldviews should be given comparable weight to our own. But then when we average them all together, our all-things-considered credences will be dominated by other people’s opinions, and we should basically never make important decisions based on our own opinions. Greg Lewis bites that bullet in this post, but I think most people find it pretty counterintuitive.
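The information-loss problem from the MIT example above can be made concrete with a small numeric sketch (the specific probabilities here are illustrative, not from the post):

```python
# Two worldviews' credences for the MIT example (illustrative numbers).
worldviews = {
    "A": {"p_admit": 0.9, "p_happy_if_attend": 0.1},
    "B": {"p_admit": 0.1, "p_happy_if_attend": 0.9},
}

def value_of_applying(p_admit, p_happy):
    """Expected value of applying, scoring 1 for a happy attendance."""
    return p_admit * p_happy

# Each worldview's own evaluation of the plan:
per_worldview = {name: value_of_applying(w["p_admit"], w["p_happy_if_attend"])
                 for name, w in worldviews.items()}
# Both come out at 0.09: neither worldview thinks applying is promising.

# The naive epistemic approach: average each variable first, then decide.
avg_admit = sum(w["p_admit"] for w in worldviews.values()) / 2          # 0.5
avg_happy = sum(w["p_happy_if_attend"] for w in worldviews.values()) / 2  # 0.5
merged = value_of_applying(avg_admit, avg_happy)                        # 0.25

print(per_worldview, merged)
# Merging credences variable-by-variable (0.25) looks far more optimistic
# than either worldview's own evaluation (0.09), because averaging discards
# the anti-correlation between the two worldviews' estimates.
```

The disagreement between 0.25 and 0.09 is exactly the lost information about how each worldview's answers to different questions hang together.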
The key problem which underlies these different issues is that the epistemic approach evaluates and merges the beliefs of different worldviews too early in the decision-making process, before the worldviews have used their beliefs to evaluate different possible strategies. By contrast, the policy approach involves:
- Generating estimates of how useful we think each worldview’s advice is, based on its track record.
- Getting each worldview to identify the decisions that it cares most about influencing.
- Combining the worldviews’ advice to form an overall strategy (or, in reinforcement learning terminology, a policy), based both on how useful we think the worldview is and also on how much the worldview cares about each part of the strategy.
One intuitive description of how this might occur is the parliamentary approach. Under this approach, each worldview is treated as a delegate in a parliament, with a number of votes proportional to how much weight is placed on that worldview; delegates can then spread their votes over possible policies, with the probability of a policy being chosen proportional to how many votes are cast for it.
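The parliamentary mechanism can be sketched in a few lines; the worldview names, weights, and candidate policies below are made up purely for illustration:

```python
import random

def parliamentary_choice(vote_allocations, rng=None):
    """Pick a policy with probability proportional to total votes cast for it.

    vote_allocations: {worldview: {policy: votes}}, where each worldview's
    total votes are assumed to be proportional to its overall weight.
    """
    rng = rng or random.Random(0)
    totals = {}
    for allocation in vote_allocations.values():
        for policy, votes in allocation.items():
            totals[policy] = totals.get(policy, 0.0) + votes
    policies = list(totals)
    weights = [totals[p] for p in policies]
    return rng.choices(policies, weights=weights, k=1)[0]

# Illustrative example: worldview A holds 60 votes, B holds 40,
# and each spreads its votes over the policies it prefers.
votes = {
    "A": {"fund_research": 50, "compromise": 10},
    "B": {"cultural_change": 30, "compromise": 10},
}
# "fund_research" wins with probability 50/100, "cultural_change" with
# 30/100, and "compromise" with 20/100.
print(parliamentary_choice(votes))
```

Note that because delegates can spread votes, a compromise policy that no worldview ranks first can still accumulate support from several worldviews at once.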
The policy approach largely solves the problems I identified previously:
- Since each worldview separately calculates the actions it recommends in a given domain, no information is lost by combining estimates of variables before using them to make decisions. In the college admissions example, each worldview will separately conclude “don’t put too much effort into college admissions”, and so any reasonable combined policy will follow that advice.
- The policy approach doesn’t require us to compare utilities across worldviews, since it’s actions, not utilities, that are combined into the overall policy. Policies do need to prioritize some decisions over others - but unlike in the epistemic case, this doesn’t depend on how decisions are differentiated, since policies get to identify for themselves how to differentiate the decisions. (However, this introduces other problems, which I’ll discuss shortly.)
- Intuitively speaking, this should result in worldviews negotiating for control over whichever parts of the policy they care about most, in exchange for giving up control over parts of the policy they care about least. (This might look like different worldviews controlling different domains, or else multiple worldviews contributing to a compromise policy within a single domain.) Worldviews which care equally about all decisions would then get to make whichever decisions the other worldviews care about least.
- Worldviews which can generate good advice can be favored by the policy approach even if they don’t produce accurate predictions.
- Worldviews which produce accurate predictions but are overall harmful to give influence over your policy will be heavily downweighted by the policy approach.
- Intuitively speaking, the reason we should pay much more attention to our own worldview than to other people’s is that, in the long term, it pays to develop and apply a unified worldview we understand very well, rather than being pulled in different directions by our incomplete understanding of others’ views. The policy approach captures this intuition: a worldview might be very reasonable but unable to give us actionable advice (for example if we don’t know how to consistently apply the worldview to our own situation). Under the policy approach, such worldviews either get lower weight in the original estimate, or else aren’t able to identify specific decisions they care a lot about influencing, and therefore have little impact on our final strategy. Our own worldview, by contrast, tends to be the most actionable, especially in the scenarios where we’re most successful at achieving our goals.
I also think that the policy approach is much more compatible with good community dynamics than the epistemic approach. I’m worried about cycles where everyone defers to everyone else’s opinion, which is formed by deferring to everyone else’s opinion, and so on. Groupthink is already a common human tendency even in the absence of explicit epistemic-modesty-style arguments in favor of it. By contrast, the policy approach eschews calculating or talking about all-things-considered credences, which pushes people towards talking about (and further developing) their own worldviews. This has positive externalities for others, who can then draw on a wider range of distinct worldviews when making their own decisions.
Problems with the policy approach
Having said all this, there are several salient problems with the policy approach; I’ll cover four, but argue that none of them are strong objections.
Firstly, although we have straightforward ways to combine credences on different claims, in general it can be much harder to combine different policies. For example, if two worldviews disagree on whether to go left or right (and both think it’s a very important decision) then whatever action is actually taken will seem very bad to at least one of them. However, I think this is mainly a problem in toy examples, and becomes much less important in the real world. In the real world, there are almost always many different strategies available to us, rather than just two binary options. This means that there’s likely a compromise policy which doesn’t differ too much from any given worldview’s policy on the issues it cares about most. Admittedly, it’s difficult to specify a formal algorithm for finding that compromise policy, but the idea of fairly compromising between different recommendations is one that most humans find intuitive to reason about. A simple example: if two policies disagree on many spending decisions, we can give each a share of our overall budget and let it use that money how it likes. Then each policy will be able to buy the things it cares about most: getting control over half the money is usually much more than half as valuable as getting control over all the money.
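The budget-splitting idea can be sketched with a toy greedy allocation (the items, prices, and valuations are hypothetical, chosen only to show the mechanism):

```python
def spend_budget(valuations, prices, budget):
    """Greedy sketch: buy the items a worldview values most per dollar,
    skipping anything it can no longer afford. Purely illustrative."""
    chosen = []
    remaining = budget
    for item in sorted(valuations, key=lambda i: valuations[i] / prices[i],
                       reverse=True):
        if prices[item] <= remaining:
            chosen.append(item)
            remaining -= prices[item]
    return chosen

prices = {"outreach": 40, "research": 40, "ops": 20}
# Two worldviews with clashing spending priorities:
worldview_a = {"outreach": 9, "research": 2, "ops": 3}
worldview_b = {"outreach": 1, "research": 8, "ops": 3}

# Give each worldview half of a 100-unit budget: each still buys
# the thing it cares about most.
print(spend_budget(worldview_a, prices, 50))  # → ['outreach']
print(spend_budget(worldview_b, prices, 50))  # → ['research']
```

Each worldview secures its top priority with only half the budget, which is the sense in which half the money is worth much more than half as much as all of it.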
Secondly, it may be significantly harder to produce a good estimate of the value of each worldview’s advice than the accuracy of each worldview’s predictions, because we tend to have much less data about how well personalized advice works out. For example, if a worldview tells us what to do in a dozen different domains, but we only end up entering one domain, it’s hard to evaluate the others. Whereas if a worldview makes predictions about a dozen different domains, it’s easier to evaluate all of them in hindsight. (This is analogous to how credit assignment is much harder in reinforcement learning than in supervised learning.)
However, even if in practice we end up mostly evaluating worldviews based on their epistemic track record, I claim that it’s still valuable to consider the epistemic track record as a proxy for the quality of their advice, rather than using it directly to evaluate how much we trust each worldview. For example: suppose that a worldview is systematically overconfident. Using a direct epistemic approach, this would be a big hit to its trustworthiness. However, the difference between being overconfident and being well-calibrated plausibly changes the worldview’s advice very little, e.g. because it doesn’t change that worldview’s relative ranking of options. Another example: predictions which many people disagree with can allow you to find valuable neglected opportunities, even if conventional wisdom is more often correct. So when we think of predictions as a proxy for advice quality, we should place much more weight on whether predictions were novel and directionally correct than whether they were precisely calibrated.
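To illustrate the calibration point: a toy monotone "overconfidence" transform (hypothetical, not from the post) can change a worldview's credences substantially while leaving its ranking of options, and hence its advice, unchanged:

```python
def overconfident(p, k=3.0):
    """Toy overconfidence transform: push probabilities toward 0 or 1
    by raising the odds to the power k. Monotone, so rankings survive."""
    odds = p / (1 - p)
    sharp = odds ** k
    return sharp / (1 + sharp)

calibrated = {"option_x": 0.6, "option_y": 0.55, "option_z": 0.3}
sharpened = {name: overconfident(p) for name, p in calibrated.items()}

rank = lambda d: sorted(d, key=d.get, reverse=True)
print(rank(calibrated) == rank(sharpened))  # True: same relative ranking
print(sharpened)  # but the credences themselves have shifted a lot
```

A scoring rule applied to the sharpened credences would penalize this worldview heavily, even though anyone following its relative rankings would act identically.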
Thirdly, the policy approach as described thus far doesn’t allow worldviews to have more influence over some individuals than others - perhaps individuals who have skills that one worldview cares about far more than any other; or perhaps individuals in worlds where one worldview’s values can be fulfilled much more easily than others’. Intuitively speaking, we’d like worldviews to be able to get more influence in those cases, in exchange for having less influence in other cases. In the epistemic approach, this is addressed via variance normalization across many possible worlds - but as discussed above, this could be significantly affected by how you differentiate the possibilities (and also what your prior is over those worlds). I think the policy approach can deal with this in a more principled way: for any set of possible worlds (containing people who follow some set of worldviews) you can imagine the worldviews deciding on how much they care about different decisions by different people in different possible worlds before they know which world they’ll actually end up in. In this setup, worldviews will trade away influence over worlds they think are unlikely and people they think are unimportant, in exchange for influencing the people who will have a lot of influence over more likely worlds (a dynamic closely related to negotiable reinforcement learning).
This also allows us a natural interpretation of what we’re doing when we assign weights to worldviews: we’re trying to rederive the relative importance weights which worldviews would have put on the branch of reality we actually ended up in. However, the details of how one might construct this “updateless” original position are an open problem.
One last objection: hasn’t this become far too complicated? “Reducing” the problem of epistemic deference to the problem of updateless multi-agent negotiation seems very much like a wrong-way reduction - in particular because in order to negotiate optimally, delegates will need to understand each other very well, which is precisely the work that the whole meta-rationality framing was attempting to avoid. (And given that they understand each other, they might try adversarial strategies like threatening other worldviews, or choosing which decisions to prioritize based on what they expect other worldviews to do.)
However, even if finding the optimal multi-agent bargaining solution is very complicated, the question that this post focuses on is how to act given severe constraints on our ability to compare and merge worldviews. So it’s consistent to believe that, if worldviews are unable to understand each other, they’ll do better by merging their policies than merging their beliefs. One reason to favor this idea is that multi-agent negotiation makes sense to humans on an intuitive level - which hasn’t proved to be true for other framings of epistemic modesty. So I expect this “reduction” to be pragmatically useful, especially when we’re focusing on simple negotiations over a handful of decisions (and given some intuitive notion of worldviews acting “in good faith”).
I also find this framing useful for thinking about the overall problem of understanding intelligence. Idealized models of cognition like Solomonoff induction and AIXI treat hypotheses (aka worldviews) as intrinsically distinct. By contrast, thinking of these as models of the limiting case where we have no ability to combine worldviews naturally points us towards the question of what models of intelligence which involve worldviews being merged might look like. This motivates me to keep a hopeful eye on various work on formal models of ideal cognition using partial hypotheses which could be merged together, like finite factored sets (see also the paper) and infra-bayesianism. I also note a high-level similarity between the approach I've advocated here and Stuart Armstrong's anthropic decision theory, which dissolves a number of anthropic puzzles via converting them to decision problems. The core insight in both cases is that confusion about how to form beliefs can arise from losing track of how those beliefs should relate to our decisions - a principle which may well help address other important problems.