Towards a Formal Scientific Epistemology

Richard_Ngo

In my post “Why I’m not a Bayesian”, I argued that the Bayesian approach of assigning credences to propositions with binary truth values only works in simple and restricted domains. Instead, I claimed, a better approach to epistemology is to assign degrees of truth to models of the world.

This approach is broadly inspired by science, which is the domain from which we have the most evidence about which epistemological practices allow us to solve very hard problems. We don’t currently have a complete theory of scientific epistemology, but we can identify some important differences between scientific epistemology and Bayesian epistemology. Central examples of Bayesian epistemology (such as Solomonoff induction) assume that the truth lies within the class of hypotheses being considered. Conversely, in central examples of scientific research, the truth is definitely not already under consideration: the main problem is to come up with any hypothesis that explains existing data.

Another way of putting this point: Bayesian epistemology is entirely about empirical updates, whereas science is mostly about the process of constructing new theories. In some cases, once you’ve constructed a theory, you can be confident that it’s close to the truth merely from how well it fits existing data. But scientific theories are only fully accepted after they make successful advance predictions. That’s another difference compared with Bayesian epistemology, which treats retrodictions as equivalent to predictions.

In general I think scientific epistemology is far superior as a guide for thinking about difficult problems (like AI alignment) than Bayesian epistemology. However, scientific epistemology has mostly been described informally—e.g. by Popper, Kuhn, Feyerabend, etc. Popper did attempt to formally define a metric for degrees of truth, but it wasn’t very successful. I’d like to be able to describe scientific epistemology as formally as we can describe Bayesian epistemology (and ideally to unify them in a single framework).

I think that Garrabrant induction (also known as logical induction) is a major step towards formalizing scientific epistemology. This is an update compared with my position in my previous post, in which I critiqued Garrabrant induction in passing for its focus on assigning credences to propositions. However, in the process of assigning credences to propositions, Garrabrant induction also assigns something like degrees of truth (which it calls “wealth”) to something like models of the world (which it calls “traders”). So my critique was pretty off-base, in a way which I’m surprised nobody called out in the LessWrong comments. (Indeed, I’d even identified some of Garrabrant induction’s nice properties in a previous comment. This is a useful lesson on the pitfalls of writing posts about what you’re against rather than what you’re for.)

The key idea of Garrabrant induction is a market mechanism which sets credences for logical statements (including statements about the Garrabrant inductor itself) as the prices in a prediction market on whether those statements will eventually be proved. The traders in this prediction market are simply all polynomial-time algorithms, iteratively enumerated and given some starting wealth. Traders who are more successful will end up with more wealth, giving them greater power to move market prices.
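The wealth dynamic can be illustrated with a deliberately simplified toy, sketched here in Python. This is not the logical induction algorithm itself, and the market does not properly clear. It assumes, purely for illustration, that the price is a wealth-weighted average of the traders' stated probabilities and that each trader stakes a fixed fraction of its wealth at that price. The names `Trader`, `market_price`, `settle` and `stake_fraction` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trader:
    name: str
    belief: Callable[[str], float]  # hypothetical: statement -> probability in (0, 1)
    wealth: float = 1.0


def market_price(traders: List[Trader], statement: str) -> float:
    """Toy price: wealth-weighted average of the traders' probabilities."""
    total = sum(t.wealth for t in traders)
    return sum(t.wealth * t.belief(statement) for t in traders) / total


def settle(traders: List[Trader], statement: str, outcome: bool,
           stake_fraction: float = 0.5) -> float:
    """Let every trader bet at the current price, then pay out on `outcome`.

    Returns the price at which the bets were placed.
    """
    price = market_price(traders, statement)
    for t in traders:
        stake = stake_fraction * t.wealth
        b = t.belief(statement)
        if b > price:
            # Buy "yes" shares at `price`; each pays 1 if the statement is true.
            shares = stake / price
            t.wealth += shares * ((1.0 if outcome else 0.0) - price)
        elif b < price:
            # Buy "no" shares at `1 - price`; each pays 1 if the statement is false.
            shares = stake / (1 - price)
            t.wealth += shares * ((0.0 if outcome else 1.0) - (1 - price))
    return price


good = Trader("good", lambda s: 0.9)
bad = Trader("bad", lambda s: 0.3)
before = settle([good, bad], "phi", outcome=True)
after = market_price([good, bad], "phi")
print(before, after)  # the price drifts towards the successful trader's belief
```

After one resolved statement the better-calibrated trader holds more of the total wealth, so the next price sits closer to its belief.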

Traders share a number of properties with scientific theories (which Bayesian hypotheses lack). At each point in time, most traders/theories aren’t yet under consideration. The ones that are under consideration don’t need to make predictions about everything that happens—instead, they can focus on making whichever predictions are most surprising and novel. Also, unlike Bayesian hypotheses, traders/theories aren’t mutually exclusive: an ideal reasoner would have many of them focusing on different domains.

While Garrabrant induction was formulated as a way of predicting mathematical theorems, we can imagine the same algorithm predicting a stream of input data about the world. What else would that version of Garrabrant induction need to be a good formal theory of scientific epistemology? Four things seem most prominent:

The ability for old evidence to support new theories.
The difference between traders and models.
The ability to modify traders.
The difference between wealth and degree of truth.

Abram Demski already touched on many of these points in this post and others in the same sequence. I don’t claim much novelty here, but for some reason it took me a very long time to fit Garrabrant induction into the “replacement for Bayesian epistemology” slot in my ontology—perhaps because it was originally framed more as an extension of Bayesianism than a replacement for it.^[1] So further elaboration of this perspective seems helpful.

The problem of old evidence

Garrabrant induction and Solomonoff induction take very different positions on the problem of old evidence. In Solomonoff induction, there’s no distinction between old evidence and new evidence—they’re treated symmetrically. Whereas in Garrabrant induction, traders only ever gain wealth from predicting new evidence—retrodictions of old evidence are irrelevant.

Scientific epistemology takes a middle ground. Advance predictions of new evidence are weighted much more highly than retrodictions, but old evidence can still support a theory. Intuitively speaking, one reason why retrodictions should be discounted is that a theory might have been designed with that old evidence in mind, and therefore crediting it with predicting that evidence is a kind of overfitting or double counting.

Solomonoff induction doesn’t care about this, because it has a mechanism for preventing overfitting: assigning more complex hypotheses lower prior probability. One extra bit of description length might “smuggle in” information which allows the hypothesis to predict old evidence, but it’ll also halve the prior probability of that hypothesis. And if a hypothesis can more than double its probability relative to other hypotheses using just one extra bit, then it must be compressing information more efficiently, which is actually what we want.
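The trade-off between one extra bit and a doubled likelihood can be checked with a little arithmetic. The sketch below assumes a simplified prior of 2^-(description length) and ignores normalisation details; the description lengths and likelihoods are made-up numbers, and `prior` and `posterior_odds` are hypothetical helper names.

```python
from fractions import Fraction


def prior(description_length_bits: int) -> Fraction:
    """Simplified complexity prior: each extra bit halves the prior."""
    return Fraction(1, 2 ** description_length_bits)


def posterior_odds(len_a: int, lik_a: Fraction, len_b: int, lik_b: Fraction) -> Fraction:
    """Posterior odds of hypothesis A over B: prior ratio times likelihood ratio."""
    return (prior(len_a) * lik_a) / (prior(len_b) * lik_b)


# A 10-bit hypothesis assigns probability 1/8 to the old evidence.
# An 11-bit variant "smuggles in" one bit and assigns it a higher probability.
base = (10, Fraction(1, 8))
for patched_likelihood in (Fraction(1, 8), Fraction(1, 4), Fraction(1, 2)):
    odds = posterior_odds(11, patched_likelihood, *base)
    print(patched_likelihood, odds)
```

With these numbers the extra bit only pays for itself when it more than doubles the likelihood of the data: the variant ends up at odds 1/2, 1 and 2 against the base hypothesis respectively.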

In scientific epistemology, however, we don’t have any clear way of measuring the complexity of a given hypothesis, since it’s implemented within a big messy neural network. Even when the theory is described by precise equations, using those equations to make predictions requires the use of “auxiliary hypotheses” in which it’s often possible to hide a lot of complexity. And so in general it’s not possible to mechanistically penalize hypotheses for “smuggling in” old evidence.

However, it seems like this is the kind of thing that Garrabrant induction traders could take into account if given enough information about each other. This seems related to the concept of trading under adverse selection. In normal financial markets, other traders sometimes know more than you. So when market-making you need to set bid-ask spreads, because the expected value of a stock conditional on someone buying it from you is higher than the expected value of a stock conditional on someone selling it to you.
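A small worked example may help with the adverse selection point. The sketch below is a toy under stated assumptions, not a model of any real market: the asset is worth `low` or `high` with equal probability, a fraction `informed` of counterparties know the value and trade in the profitable direction, and everyone else buys or sells at random. The function name `quotes` is hypothetical.

```python
def quotes(low: float, high: float, informed: float):
    """Return (bid, ask): expected value given a sale to you / a purchase from you."""
    # Probability that a random counterparty buys from you, given the true value.
    p_buy_if_high = informed + (1 - informed) * 0.5
    p_buy_if_low = (1 - informed) * 0.5
    # Every counterparty either buys or sells.
    p_sell_if_high = 1 - p_buy_if_high
    p_sell_if_low = 1 - p_buy_if_low
    # Bayes' rule; the equal priors on `high` and `low` cancel.
    p_high_given_buy = p_buy_if_high / (p_buy_if_high + p_buy_if_low)
    p_high_given_sell = p_sell_if_high / (p_sell_if_high + p_sell_if_low)
    ask = p_high_given_buy * high + (1 - p_high_given_buy) * low
    bid = p_high_given_sell * high + (1 - p_high_given_sell) * low
    return bid, ask


print(quotes(90, 110, informed=0.0))  # no informed traders: no spread needed
print(quotes(90, 110, informed=0.2))  # some informed traders: bid below ask
```

With no informed counterparties both quotes equal the unconditional value of 100. With 20% informed counterparties the conditional values separate to roughly 98 and 102, which is the spread a market maker would need just to break even in this toy.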

The implementation details do seem tricky, though. In Solomonoff induction, when you add a new hypothesis you can just go back and evaluate how it would have done on all the old evidence, which is equivalent to it having been there all along. But in Garrabrant induction, predictions move the market prices, and so intuitively it seems like you’d need to rerun the whole market. It’s also unclear how traders should be made aware that some of their competitors are “from the future”. It seems like we might need to bake in some notion of situational awareness, which seems complicated. (For more on this, see this post by Abram.)

Traders vs models

Scientific theories need to make predictions, but there’s no standard way to translate those predictions into bets. By contrast, traders in Garrabrant induction need to make bets, but those bets need not be driven by predictions. Traders in Garrabrant induction are any (and eventually every) polynomial-time algorithm. These include very simple algorithms, like ones which notice when “A OR B” is cheaper than either A or B individually, then arbitrage that difference, without having any opinion on how likely A and B actually are. (Analogously, many human actions are driven by reflexes or heuristics rather than explicit beliefs about what outcomes those actions will cause.)
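The arbitrage trader mentioned above can be sketched in a few lines. The version below assumes a hypothetical market in which each statement is a contract paying 1 if true and 0 otherwise, and in which contracts can be both bought and sold at the quoted price. It exploits the case where "A OR B" is cheaper than the dearer of A and B; `disjunction_arbitrage` and `worst_case_profit` are illustrative names.

```python
from itertools import product


def disjunction_arbitrage(price_a: float, price_b: float, price_a_or_b: float):
    """Return (statement_to_sell, locked_in_profit), or None if there is no mispricing.

    The trade is: buy one "A OR B" contract and sell one contract on the
    dearer disjunct. No opinion about the truth of A or B is needed.
    """
    dearer, price = max((("A", price_a), ("B", price_b)), key=lambda item: item[1])
    if price_a_or_b >= price:
        return None
    return dearer, price - price_a_or_b


def worst_case_profit(sell: str, sell_price: float, price_a_or_b: float) -> float:
    """Check the trade's profit in every possible world."""
    profits = []
    for a, b in product([False, True], repeat=2):
        payout_long = 1.0 if (a or b) else 0.0           # we hold "A OR B"
        payout_short = 1.0 if {"A": a, "B": b}[sell] else 0.0  # we owe this
        profits.append((sell_price - price_a_or_b) + payout_long - payout_short)
    return min(profits)


trade = disjunction_arbitrage(price_a=0.6, price_b=0.3, price_a_or_b=0.5)
print(trade)
print(worst_case_profit("A", 0.6, 0.5))
```

Whenever the sold statement is true, the disjunction is true as well, so the long position always covers the short one and the price difference is kept in every world.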

However, in the long term it seems likely that the biggest wins will accrue to traders which implement models containing important insights that other traders lack, then bet that those models are right. This seems particularly true if we focus on the domain of science. Yet those traders might still use a wide range of trading strategies to convert their internally-represented beliefs into actual bets. It would be nice if we could demonstrate that almost all wealth will eventually accrue to traders which use a given kind of trading strategy (e.g. Kelly betting).
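For concreteness, here is what Kelly betting looks like for a trader converting a belief into a bet on a single binary contract. This is a sketch under simplifying assumptions (one contract paying 1 if true, a fixed market price, no price impact), and the function names are hypothetical.

```python
import math


def kelly_fraction(belief: float, price: float) -> float:
    """Signed fraction of wealth to stake: positive buys "yes", negative buys "no"."""
    if not 0 < price < 1:
        raise ValueError("price must be strictly between 0 and 1")
    if belief > price:
        return (belief - price) / (1 - price)
    if belief < price:
        return -(price - belief) / price
    return 0.0


def expected_log_growth(belief: float, price: float, fraction: float) -> float:
    """Expected log wealth growth from staking `fraction` of wealth on "yes"."""
    win = 1 + fraction * (1 - price) / price  # wealth multiplier if true
    lose = 1 - fraction                       # wealth multiplier if false
    return belief * math.log(win) + (1 - belief) * math.log(lose)


f = kelly_fraction(belief=0.6, price=0.5)
print(f)
for candidate in (0.1, f, 0.3):
    print(candidate, expected_log_growth(0.6, 0.5, candidate))
```

A trader who believes 0.6 when the market says 0.5 stakes about 20% of its wealth. Staking less or more than that gives a lower expected log growth under the trader's own belief, which is the sense in which Kelly betting is the natural candidate for a "canonical" strategy.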

Modifying traders

Traders in Garrabrant induction are generated by enumerating every polynomial-time algorithm in order of simplicity. However, an important part of scientific epistemology is the process of identifying which new theories to consider next, especially via improving existing theories. In one sense, traders can already improve via taking their trading history into account when making new trades. But it would be nice if this were more continuous with the process of adding new traders.

One way to augment Garrabrant induction to account for the process of theory design would be if the existing traders could influence which new traders are added each day. But that doesn’t quite capture what we want, because in scientific epistemology new models evolve from old models and inherit much of their credibility. A theory that has one wrong belief can still be patched in a way which allows it to retain most of the credit for its previous correct predictions. So perhaps traders could be allowed to “donate” their wealth to other traders. More generally, if traders are allowed to invest in each other, then this allows them to represent higher-level concepts composed of the concepts represented by other traders, without needing to reimplement those same concepts internally.

However, all of this makes the concept of “trading strategies” much more complicated—now it’s about relationships between different traders. And I’m uncertain which of these suggestions are adding important innovations, versus adding unimportant details that the original formulation of Garrabrant induction correctly abstracted away.

Wealth vs truth

Making a bunch of wealth certainly suggests that a trader has an approximately-true model of the world. But the key difficulty with interpreting wealth as degree of truth is that wealth is rivalrous, whereas degree of truth isn’t. If a mostly-true theory reallocated its wealth between many slightly-different variants of itself, all of them would still be mostly true, but each of them would have much less wealth. More generally, gaining the most wealth requires betting against the consensus, and so contrarian traders might outcompete conformist traders even if they’re less correct overall than any given conformist. We could try to group traders into clusters, and talk about the degree of truth of each cluster—however, that just moves the same problem to a higher level.

When we face difficulties in defining a concept like degrees of truth, a useful question is “what do we want to use the concept for?” One answer is that traders whose models are more true should get more influence over our actions (given some mechanism for hooking up a logical inductor’s outputs to actions, which I’ll leave unspecified here). But this is still a rivalrous criterion, because our actions need to be determined by some set of traders. However, a less rivalrous version of this answer is that a model’s degree of truth affects how much we trust it to influence our actions relative to non-model-based policies. This seems to intuitively track scientific epistemology. When even our best theories of a phenomenon are quite bad, we’d prefer to rely on intuitions, habits, or traditions that have worked in the past (even when we don’t know why they work). Conversely, when we’re confident that our best theories are very close to the truth, we’re willing to follow even very counterintuitive recommendations from them.

I don’t know how to pin down the distinction between model-based and model-free traders, but it seems related to the concept of gears-level understanding. Eliezer also discussed some related points in this post (see also my comment beneath it).

^[1] For example, in this post Abram identifies some ways that understanding Garrabrant induction should change how Bayesians think about hypotheses. But Bayesian hypotheses are so different from Garrabrantian traders that using the same term for both seems misleading. In particular, the former are assigned credences, while the latter aren’t! That’s a much more fundamental change than the ones Abram identifies.

(On reflection, this is more a semi-redundant riff on what you already wrote, and less a good responsive comment, but oh well, I already wrote it so I guess I'll hit publish.)

When I think about the challenges with applying Solomonoff induction in practice, which the scientific method was designed around, I see two main things.

The first, as you point out, is that hypotheses are sufficiently modular that we CAN develop them piecemeal, and sufficiently complicated that we MUST develop them piecemeal. Thus, scientific hypotheses (being just one modular piece of a "real" hypothesis) can "remain agnostic" about certain observations. As I joked here, "if you treat The Law Of Conservation Of Energy as a “hypothesis”, and you “ask” Conservation Of Energy what the half-life of tritium is, then Conservation Of Energy will tell you “Huh? How should I know? Why are you asking me?”" This property (that hypotheses can be agnostic about things) is also characteristic of logical induction / prediction markets (as you point out), and infra-Bayesianism has that property too.

The second is that parsimony / Occam's razor / Solomonoff prior is central to finding the truth, but scientists range from being imperfect at assessing the complexity-vs-parsimony of a theory, to being atrociously bad at it. So the scientific enterprise is set up to rely as little as possible on complexity-assessments. Thus, as you point out, if we could perfectly assess the complexity-vs-parsimony of a hypothesis, then there would be no need to treat prediction and retrodiction differently. The retrodiction problem is about putting too many bits into the hypothesis, and it's only a problem because people are lousy at taking a theory and reading out the number of bits in it (i.e. they don't notice epicycles and special pleading). So again, the scientific institution is set up to minimize reliance on complexity-assessments. But that minimum reliance is still higher than zero. You just can't get away from it entirely. Even "data" is not theory-free, because you need theory to get from "raw" data to so-called observations.

The results are different in different fields (and sometimes pathological), as a field might or might not equilibrate to a state where practitioners with the sharpest discernment of complexity-vs-parsimony command the most respect and power and sway. See my comments here & here for hot-take examples from real-world academic fields.

So anyway, if I were working on this project, the first thing I would try is to say that the "ideal" is Solomonoff induction searching for a true hypothesis which happens to be modular (i.e. it has different pieces covering different, cleanly-separable domains), and then introduce a constraint that you can only measure bits-of-complexity with an extremely noisy ruler, and try to judge truth-seeking setups like LI etc. by how well they approximate the Solomonoff ideal under those assumptions / constraints. (...But I dunno, I didn't think about it very hard.)

Parsimony should be understood as merely a heuristic for how well a model could have predicted held out data. For example, the AIC approach to penalizing model complexity in statistical modeling is asymptotically equivalent to leave-one-out cross-validation for model selection. This Stone (1977) result should be understood as an explanation for why parsimony seems to be related to truth: post hoc fit of a parsimonious model is mathematically related to how well the model could have predicted held out data. Whereas parsimony has no direct epistemic relevance, predictive accuracy is the actual goal. Since the latter is what we care about, why not just consider predictive ability directly?
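The link between a complexity penalty and held-out prediction can be seen on a toy problem. The sketch below fits polynomials of increasing degree to synthetic data and compares AIC for a Gaussian linear model (up to an additive constant) with leave-one-out squared error, computed through the standard leverage shortcut for least squares. It is an illustration under these assumptions, not a demonstration of Stone's asymptotic result, which concerns log-likelihood loss; the data, the noise level and the function names are made up.

```python
import numpy as np


def design(x, degree):
    """Polynomial design matrix with columns 1, x, ..., x**degree."""
    return np.vander(x, degree + 1, increasing=True)


def aic(x, y, degree):
    """AIC for a Gaussian linear model, up to an additive constant."""
    X = design(x, degree)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    n, k = len(y), degree + 2  # polynomial coefficients plus the noise variance
    return n * np.log(rss / n) + 2 * k


def loo_mse(x, y, degree):
    """Leave-one-out mean squared error via the leverage shortcut."""
    X = design(x, degree)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    Q, _ = np.linalg.qr(X)             # reduced QR factorisation
    leverage = np.sum(Q ** 2, axis=1)  # diagonal of the hat matrix
    return float(np.mean((residuals / (1 - leverage)) ** 2))


rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 60)
y = 1 + 2 * x - 3 * x**2 + rng.normal(scale=0.3, size=x.size)  # true degree: 2

degrees = list(range(7))
aic_scores = [aic(x, y, d) for d in degrees]
loo_scores = [loo_mse(x, y, d) for d in degrees]
for d, a, l in zip(degrees, aic_scores, loo_scores):
    print(d, round(a, 2), round(l, 4))
```

Both scores drop sharply once the polynomial is flexible enough to capture the quadratic signal and stop rewarding extra terms after that, even though only the second one involves held-out data.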

When I think about the challenges with applying Solomonoff induction in practice, which the scientific method was designed around,

It's hardly credible that the scientific method, dating back about two hundred years, was consciously designed around Solomonoff induction, published 1960ish and dealing with computer programmes, which didn't even exist until the mid-twentieth century.

Ray Solomonoff's definition doesn't suggest that SI, even if you could overcome the computational limitations, is a general purpose truth finder.

"Solomonoff's theory of inductive inference is a mathematical proof that if a universe is generated by an algorithm, then observations of that universe, encoded as a dataset, are best predicted by the smallest executable archive of that dataset"

So, to start with, it only works in computable universes.

On top of that, the relationship between a programme, a list of instructions, and a hypothesis as a description of reality, is not clear.

It's easy to see how the equivalence of programmes and hypotheses holds when "hypothesis" is used in the sense of a Bayesian hypothesis that only predicts expected future observations: the output of a candidate programme is the expected future observations. But the strong claims about SI require it to do more than prediction: to tell you what reality is. But programmes are lists of commands, such as "add one to register A". They are imperative, not descriptive. So how do they make statements about the external world? Indeed, an SI is guaranteed to be wrong about reality if you run it in a simulated world: it will find the programme that generates the simulation but won't notice it is a simulation. This problem is hard to see for some people because of the ambiguity of terms like "true", "model", "hypothesis": none of them make it clear whether what is being talked about is predictive accuracy, or correspondence to reality.

I suppose you could be using Yudkowsky's version of Solomonoff as a metaphor for science -- formulating and testing hypotheses. Not a good metaphor, because it elides the importance of hypothesis generation -- science cannot find the truth without good hypotheses, and hypothesis formation is not a blind mechanistic process. As Richard rightly says:

"If every PhD in fundamental physics had contributed even one bit of usable evidence about how to unify quantum physics and general relativity, we’d have solved quantum gravity many times over by now. But we haven’t, because almost all of the work of science is in constructing sophisticated models, which Bayesianism says almost nothing about. (Formalisms like Solomonoff induction attempt to sidestep this omission by enumerating and simulating all computable models, but that’s so different from what any realistic agent can do that we should think of it less as idealized cognition and more as a different thing altogether, which just happens to converge to the same outcome in the infinite limit.)"

The second is that parsimony / Occam’s razor / Solomonoff prior is central to finding the truth, but scientists range from being imperfect at assessing the complexity-vs-parsimony of a theory, to being atrociously bad at it.

What are some examples of atrociously bad estimates of parsimony?

So the scientific enterprise is set up to rely as little as possible at complexity-assessments.

Is it? I can't say I have noticed.

So anyway, if I were working on this project, the first thing I would try is to say that the “ideal” is Solomonoff induction searching for a true hypothesis

It's all very well to talk about using SI to search for truth, but it's not clear that it is capable of doing that, since all it is doing directly is rejecting programmes that don't match observation, and there is more to truth than matching observation.

Science students are always taught that empirical testing is the hallmark of scientific truth, and are usually taught that science delivers truths about reality -- instrumentalism and anti-realism being minority interests. But correspondence cannot be observed and is not tested directly. Naive scientism is naive because it has failed to notice the problem. "Just look" is the first step in the scientific method, not the whole thing.

There must be some relationship between predictive accuracy and ultimate truth. Well, there is an obvious one, and it's the fact that a nonpredictive theory can't be true. But it doesn't have the corollary that a more predictively accurate theory is more correspondent. Ontologically wrong theories can be very accurate.

For instance, the Ptolemaic system can be made as accurate as you want for generating predictions by adding extra epicycles... although it is false, in the sense of lacking ontological accuracy, since epicycles don't exist. In fact, the more epicycles you add, the more accurate the model gets, and the less truthful to reality.
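The claim about epicycles can be illustrated numerically. A sum of uniformly rotating circles is a truncated Fourier series, so the sketch below approximates a Keplerian elliptical orbit with more and more circles and reports the fit error. This is an idealisation (historical Ptolemaic models also used eccentrics and equants), and the eccentricity, sample count and function names are arbitrary choices made for the example.

```python
import numpy as np


def kepler_orbit(eccentricity, n_samples=512):
    """Positions on an elliptical orbit, sampled uniformly in time, as complex numbers."""
    mean_anomaly = np.linspace(0, 2 * np.pi, n_samples, endpoint=False)
    ecc_anomaly = mean_anomaly.copy()
    for _ in range(50):  # Newton's method on Kepler's equation E - e*sin(E) = M
        f = ecc_anomaly - eccentricity * np.sin(ecc_anomaly) - mean_anomaly
        ecc_anomaly -= f / (1 - eccentricity * np.cos(ecc_anomaly))
    x = np.cos(ecc_anomaly) - eccentricity  # focus at the origin
    y = np.sqrt(1 - eccentricity**2) * np.sin(ecc_anomaly)
    return x + 1j * y


def epicycle_error(path, n_circles):
    """RMS error when keeping only the `n_circles` largest Fourier terms.

    Each Fourier term is one uniformly rotating circle; the zero-frequency
    term is a fixed offset and is counted as a circle here.
    """
    coeffs = np.fft.fft(path) / len(path)
    keep = np.argsort(-np.abs(coeffs))[:n_circles]
    truncated = np.zeros_like(coeffs)
    truncated[keep] = coeffs[keep]
    approx = np.fft.ifft(truncated * len(path))
    return float(np.sqrt(np.mean(np.abs(path - approx) ** 2)))


orbit = kepler_orbit(eccentricity=0.3)
errors = [epicycle_error(orbit, n) for n in range(1, 9)]
for n, err in enumerate(errors, start=1):
    print(n, err)
```

The error never increases as circles are added, even though nothing in the underlying orbit is made of circles.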

Scientific theories minimally predict observations. Figuring out what the nature of the observed phenomenon is, is another matter. Induction can tell you the sun will rise in the east, but not that it is a fusion reactor. Inference to the best explanation can tell you it is a fusion reactor, but leaves fundamental ontological problems, like "what is a quark really", unsolved. Reductionism is a blessing and a curse -- the curse is that when you reach the lowest level, you can no longer answer a "what is an X" question by specifying a bunch of components and their structure.

The problem of interpreting a fundamental theory is the problem of finding its ontological (or metaphysical) implications (including the option of treating some of its features as bookkeeping devices, or otherwise in the map but not the territory). We don't live in the most convenient universe, the one where there is always a clinching difference in predictions. The persistence of the problem of interpreting quantum mechanics shows that.

@Tim H

Parsimony should be understood as merely a heuristic for how well a model could have predicted held out data

Why? Since predictive accuracy underdetermines metaphysical truth, there is a need for something additional to bridge them, and parsimony could well fit the bill.

"we’d have solved quantum gravity many times over by now"

We actually have, in the sense that we have discovered many forms of string theory and they are all consistent theories of quantum gravity.

However, scientific epistemology has mostly been described informally—e.g. by Popper, Kuhn, Feyerabend, etc.

Isn't it a key aspect of both Kuhn and Feyerabend that there's not one scientific epistemology, but that different fields use different epistemologies, and that this is fine?

I think it's a pretty bad idea to call your idea of epistemology "scientific epistemology", as it suggests there's a single one used more broadly.

When we face difficulties in defining a concept like degrees of truth, a useful question is “what do we want to use the concept for?” One answer is that traders whose models are more true should get more influence over our actions (given some mechanism for hooking up a logical inductor’s outputs to actions, which I’ll leave unspecified here).

This seems to ignore that there are many true models that are useless. Generally, in science you need to say not only something true but also something that's seen as valuable by the readers of the journal in which you want to publish. While a finding that's both novel and surprising is more likely to be valuable to readers, I don't think that's sufficient.

Nice post! I broadly agree/endorse your thoughts, but would like to focus on a technical disagreement:

I think that the abstract concept of wealth as you discuss it doesn't map well onto the 'wealth' that Garrabrant Inductors give traders. The construction algorithm given in the paper does 'budget' each trader, giving an upper bound on how much money that trader can lose in a given day. However, the 'wealth' assignments the traders get are arbitrary. More concretely, the subroutine "TradingFirm" can seemingly be built off of any computable enumeration of traders. The Inductor 'doesn't care' which traders get what part of the budget; it only cares that the traders get budgeted somehow. This suggests to me that the assignment of 'degrees of truth' or 'wealth' isn't too conceptually important for Garrabrant Induction (since any assignment works).

Ideally, we'd be able to say that Inductors 'give' wealth to successful traders as a way to encode trusted hypotheses, inductive biases, or priors. As it stands, however, Inductors lack the structure to justify such rich interpretations. This is related to how Inductors explicitly price in (and must compute) every trader on every day. They never truly rule out any trader based on wealth, which deprives them of the sort of cognitive structure that arises from eliminating hypotheses based on heuristics.

BRIAs are more semantically rich in this sense. For instance, the decision auction used to construct the agent disqualifies 'broke' hypotheses from even being computed. This makes the wealth-based dynamics you describe in the post map much better onto BRIAs. Indeed, I think that there's valuable work to be done in modifying Garrabrant Inductors to describe acquired knowledge through the budgeting of traders. Relatedly, Garrabrant has previously written about how Inductors that can tamper with their own deductive process (through action) can get trapped in the 5 and 10. He frames this as a problem, but to me it signifies that Inductors do have some latent ability to express strong inductive biases – this is a desirable feature for descriptive notions of bounded rationality.

Also, unlike Bayesian hypotheses, traders/theories aren’t mutually exclusive: an ideal reasoner would have many of them focusing on different domains. Because of this,

This paragraph ends in the middle of a sentence. Did you have something more that you wanted to say here?

Ooops, good catch. I was going to say "because of this, you can't assign credences to them which sum to one". But I'm not sure that this is quite correct—I think there's a deeper reason why it's hard to assign credences to them at all (because traders/theories are only ever approximately correct). So I've just deleted the sentence fragment.

Truth is a multi-faceted notion when it comes to its relationship with science. Going with the standard structural realism arguments, there’s a) ontology (entities that exist); b) relationships / structure (between entities).

The ontology in our theories is regularly usurped by newer theories (we don’t believe in the ether or Newton’s infinite space and time anymore, so we should expect the ontology of current theories not to hold true either). But something does get preserved, and it’s the relationships between coarse-grained entities (F=ma will more or less hold true even if our notion of what mass is changes over time).

The implication of this distinction for any epistemology is that the “true” reality will not be, and could not be, revealed to us. We only have observations to rely on. What we can expect to model is observed regularities. Which models get adopted is to a great extent a sociological phenomenon, but the scientific community weighs risky, precise predictions that turn out to be true a lot more heavily, because they are extremely strong evidence that the model has captured some aspect/view of the true regularities of the underlying reality. (But, of course, future prediction is not the only criterion: models get adopted for all sorts of reasons, including unification of disparate observations.)

Overall, I’m unsure if the notion of truth is needed at all. What we need is useful models, and interpretable, extrapolative models are more useful than black-box models.