In this post, I would like to share a simple observation of mine which is about deriving some crude form of Occam's razor from a small set of assumptions.

Assumptions:

  1.  Subjective Bayesian probability is used to model an agent's degree of belief in hypotheses.
  2. The set of hypotheses  that we assign beliefs is countably infinite.
  3. Complexity of a hypothesis is represented with a positive integer where bigger integers represent more complex hypotheses.
  4. For any  there exists finitely many hypotheses in  that have the same complexity as .

When we accept these assumptions it is easy to prove the following statement.

For any  there are only finitely many hypotheses in  that are more complex and more probable or less complex and less probable than .

Proof.

The set of all hypotheses which are more probable than  is finite, since otherwise probabilities cannot sum to 1. The set of all hypotheses that are simpler than  is also finite by assumptions 3 and 4. Intersection of a finite set with another set is a finite set, and union of two finite sets is a finite set. 

 

Note that this result is independent from the choice of function that assigns beliefs to hypotheses. In addition, while it assumes some properties about the notion of simplicity, it does not require a full formalization of it.

 

Edit 1:

Assumption 3 can be thought of as two distinct assumptions combined together:

  1. The one-sided infinite nature of complexity, that is, there is always a more complex hypothesis while there exists "the simplest level".
  2. The finite gradation of complexity, that is, for any hypothesis , there exists a hypothesis  such that there is no hypothesis which is more complex than  and less complex than .

As a remark, the notion of Kolmogorov complexity (the length of the shortest computer program that generates an object) satisfies these properties.

 

Edit 2:

I no longer think this is a valuable observation, since it seems to be more about the shortcomings of mathematical tools we use than something fundamental. 

The sentence "The set of all hypotheses which are more probable than  is finite" is true only when we insist on using a complete event space (powerset of sample space) and a probability measure assigning real numbers to events. The proof is no longer valid if we introduce infinitesimals for instance.

New Comment