Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Generalised models as a category



## Naming the "generalised" models

In this post, I'll apply some mathematical rigour to my ideas of model splintering, and see what they form as a category[1].

And the first question is... what to call them? I can't keep referring to them as 'the models I use in model splintering'. After a bit of reflection, I decided to call them 'generalised models'. Though that's a bit vague, it does describe well what they are, and what I hope to use them for: a formalism to cover all sorts of models.

## The generalised models

A generalised model $M$ is given by three objects:

$$M = (F, E, Q).$$

Here $F$ is a set of *features*. Each feature $f$ consists of a name or label, and a set in which the feature takes values. For example, we might have the feature "room empty?" with values "true" and "false", or the feature "room temperature?" with values in $\mathbb{R}^+$, the positive reals. We allow these features to sometimes take no value at all (such as the two features above, if the room doesn't exist) or multiple values (such as "potential running speed of person X", which includes the maximal speed and any speed below it).

Define $\bar{f}$ as the set component of the feature, and $\bar{F}$ as the disjoint union of all the sets of the different features - ie $\bar{F} = \sqcup_{f \in F} \bar{f}$.

A world, in the most general sense, is defined by all the values that the different features could take (including situations where features take multiple values or none at all). So the set of worlds, $W$, is the set of functions from $\bar{F}$ to $\{0, 1\}$, with $1$ representing the fact that the feature takes that value, and $0$ the opposite. Hence $W = 2^{\bar{F}}$, the power set of $\bar{F}$.

The set of environments is a specific subset of these worlds: $E \subset W$. The choice of $E$ is actually more important than that of $W$, as it establishes which values of the features we are modelling.

The $Q$ is a partial probability distribution. In general, we won't worry about whether $Q$ is normalised (ie whether $Q(E) = 1$); we'll even allow $Q$s with $Q(E) > 1$. So $Q$ could more properly be called a partial weight distribution. As long as we only consider terms like $Q(A \mid B)$, the normalisation doesn't matter.
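To make the definition concrete, here is a minimal Python sketch of a generalised model. The encoding of $\bar{F}$ as (name, value) pairs and of worlds as subsets is my own assumption, not from the post:

```python
from itertools import combinations

# Hypothetical encoding of a generalised model M = (F, E, Q).
# Each feature maps its name/label to the set of values it can take.
features = {"room empty?": {"true", "false"}}

# F_bar: the disjoint union of the features' value sets, as (name, value) pairs.
F_bar = {(name, v) for name, vals in features.items() for v in vals}

def power_set(s):
    """All subsets of s, each as a frozenset."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

# A world is a function F_bar -> {0, 1}, ie a subset of F_bar (the pairs
# sent to 1). So W is the power set of F_bar.
W = power_set(F_bar)

# The environments E are a chosen subset of W: here, the worlds where the
# feature takes exactly one value (no "no value" or "multiple values" cases).
E = [w for w in W if len(w) == 1]

# Q: a partial weight distribution over E; it need not be normalised.
Q = {e: 0.5 for e in E}
```

With one two-valued feature, $\bar{F}$ has two elements, $W$ has four worlds, and this choice of $E$ keeps the two "classical" ones.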

## Morphisms: relations

For simplicity, assume there are finitely many features taking values in finite sets, making all sets in the generalised model finite.

If $M_0 = (F_0, E_0, Q_0)$ and $M_1 = (F_1, E_1, Q_1)$ are generalised models, then we want to use binary relations between $E_0$ and $E_1$ as morphisms between the generalised models.

Let $r$ be a relation between $E_0$ and $E_1$, written as $e_0 \sim_r e_1$. It defines a map $r : 2^{E_0} \to 2^{E_1}$ between subsets of $E_0$ and $E_1$: for $A \subseteq E_0$, we have $e_1 \in r(A)$ iff there exists an $e_0 \in A$ with $e_0 \sim_r e_1$. The map $r^{-1} : 2^{E_1} \to 2^{E_0}$ is defined similarly[2], seeing $r^{-1}$ as the inverse relation: $e_0 \sim_r e_1$ iff $e_1 \sim_{r^{-1}} e_0$.

We say that the relation $r$ is a morphism between the generalised models if, for any $A \subseteq E_0$ and $B \subseteq E_1$:

$$Q_0(A) \le Q_1(r(A)) \quad \text{and} \quad Q_1(B) \le Q_0(r^{-1}(B)).$$

The intuition here is that probability flows along the connections: if $e_0 \sim_r e_1$, then probability can flow from $e_0$ to $e_1$ (and vice versa). Thus $r(A)$ must have picked up all the probability that flowed out of $A$ - but it might have picked up more probability, since there may be connections coming into it from outside $A$. The same goes for $r^{-1}(B)$ and the probability of $B$.
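Since everything is finite, the morphism condition can be checked by brute force over all subsets. A sketch, with relations encoded as sets of ordered pairs and weight distributions as dicts (the helper names are mine):

```python
from itertools import combinations

def subsets(xs):
    """All subsets of xs."""
    xs = list(xs)
    return [set(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def image(r, A):
    """r(A): every e1 related to some e0 in A."""
    return {e1 for (e0, e1) in r if e0 in A}

def preimage(r, B):
    """r^{-1}(B): every e0 related to some e1 in B."""
    return {e0 for (e0, e1) in r if e1 in B}

def is_morphism(E0, Q0, E1, Q1, r):
    """Check Q0(A) <= Q1(r(A)) for all A, and Q1(B) <= Q0(r^{-1}(B)) for all B."""
    q0 = lambda A: sum(Q0[e] for e in A)
    q1 = lambda B: sum(Q1[e] for e in B)
    return (all(q0(A) <= q1(image(r, A)) for A in subsets(E0))
            and all(q1(B) <= q0(preimage(r, B)) for B in subsets(E1)))
```

For instance, the identity relation with $Q_0 = Q_1$ passes the check, while a relation that simply drops an environment carrying positive weight fails it: the probability of the dropped environment has nowhere to flow.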

## Morphism properties

We now check that these relations obey the requirements of morphisms in category theory.

Let $r$ be a morphism $M_0 \to M_1$ (ie a relation between $E_0$ and $E_1$), and let $q$ be a morphism $M_1 \to M_2$ (ie a relation between $E_1$ and $E_2$).

We compose these morphisms by composing the relations: $e_0 \sim_{qr} e_2$ iff there exists an $e_1$ with $e_0 \sim_r e_1$ and $e_1 \sim_q e_2$. Composition of relations is associative.

We now need to show that $qr$ is a morphism. But this is easy: for any $A \subseteq E_0$,

$$Q_0(A) \le Q_1(r(A)) \le Q_2\big(q(r(A))\big) = Q_2\big(qr(A)\big),$$

and similarly $Q_2(C) \le Q_1(q^{-1}(C)) \le Q_0\big((qr)^{-1}(C)\big)$ for any $C \subseteq E_2$.

Finally, the identity relation $\mathrm{Id}_{E_0}$ is the one that relates each $e_0 \in E_0$ only to itself; then $r$ and $r^{-1}$ are the identity maps on $2^{E_0}$, and the morphism properties for $Q_0 = Q_1$ are trivially true.
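Composition and identity can be sketched directly on the set-of-pairs encoding of relations (function names are my own):

```python
def compose(q, r):
    """The composite relation qr: e0 ~_{qr} e2 iff there exists an e1
    with e0 ~_r e1 and e1 ~_q e2. Relations are sets of ordered pairs."""
    return {(e0, e2) for (e0, e1) in r for (e1b, e2) in q if e1 == e1b}

def identity(E):
    """The identity relation Id_E, relating each element only to itself."""
    return {(e, e) for e in E}
```

Composing with the identity on either side gives back the original relation, as required for $GM$ to be a category.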

So we can define $GM$, the category of generalised models.

## r-stable sets

Say that a set $A \subseteq E_0$ is *$r$-stable* if $r^{-1}r(A) = A$.

For such an $r$-stable set, $Q_0(A) \le Q_1(r(A))$ and $Q_1(r(A)) \le Q_0(r^{-1}r(A)) = Q_0(A)$; thus $Q_0(A) = Q_1(r(A))$.

Hence if $r$ is a morphism, it preserves the probability measure on the $r$-stable sets.

In the particular case where $r$ is a bijective function, all points of $E_0$ are $r$-stable (and all points of $E_1$ are $r^{-1}$-stable), so $r$ is an isomorphism between $E_0$ and $E_1$ that forces $Q_0 = Q_1$.
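The $r$-stability condition is easy to test by brute force (a sketch; helper names are mine):

```python
def image(r, A):
    """r(A): every e1 related to some e0 in A."""
    return {e1 for (e0, e1) in r if e0 in A}

def preimage(r, B):
    """r^{-1}(B): every e0 related to some e1 in B."""
    return {e0 for (e0, e1) in r if e1 in B}

def is_r_stable(r, A):
    """A is r-stable when r^{-1}(r(A)) = A."""
    return preimage(r, image(r, A)) == set(A)
```

For example, with $r = \{(0, a), (1, a), (2, b)\}$, the set $\{0, 1\}$ is $r$-stable, but $\{0\}$ is not, since $r^{-1}r(\{0\}) = \{0, 1\}$.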

## Morphism example: probability update

Suppose we wanted to update our probability measure $Q_0$, maybe after learning that a particular feature $f$ takes a certain value $x$.

Let $E_{f=x} \subseteq E_0$ be the set of environments where $f$ takes the value $x$. Updating on $f = x$ is then the same as restricting to $E_{f=x}$ and rescaling.

Since we don't care about the scaling, we can treat updating on $f = x$ as just restricting to $E_{f=x}$. This morphism is given by the relation $r$ that relates each $e_0 \in E_{f=x}$ to itself, and relates nothing else.
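A sketch of updating as a restriction morphism. The dict encoding of environments and the weight-by-index bookkeeping are my own assumptions:

```python
# Environments as assignments of values to features (hypothetical encoding).
E0 = [{"f": "x", "g": 0}, {"f": "x", "g": 1}, {"f": "y", "g": 0}]
Q0 = {i: 1 / 3 for i in range(len(E0))}  # weights indexed by position in E0

# E_{f=x}: the environments where feature f takes the value x.
keep = [i for i, e in enumerate(E0) if e["f"] == "x"]

# The updating morphism: the relation pairing each kept environment with
# itself, and nothing else.
r = {(i, i) for i in keep}

# The updated model restricts to E_{f=x}. We skip the rescaling, since Q
# need not be normalised.
E1 = [E0[i] for i in keep]
Q1 = {i: Q0[i] for i in keep}
```

Note that the total weight of the updated model is $2/3 < 1$; since $Q$ is only a partial weight distribution, that is fine.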

## Morphism example: surjective partial function

In my previous posts I defined how $M_1 = (F_1, E_1, Q_1)$ could be a *refinement* of $M_0 = (F_0, E_0, Q_0)$.

In the language of the present post, $M_1$ is a refinement of $M_0$ if there exists a generalised model $M_1' = (F_1, E_1, Q_1')$ and a surjective partial function $r : E_1 \to E_0$ (functions and partial functions are specific examples of binary relations) that is a morphism from $M_1'$ to $M_0$. The $Q_1$ is required to be potentially 'better' than $Q_1'$ on $E_1$, in some relevant sense.

This means that $M_1$ is 'better' than $M_0$ in three ways. The $r$ is surjective, so $E_1$ covers all of $E_0$, and its set of environments is at least as detailed. The $r$ is a partial function, so $E_1$ might have even more environments that don't correspond to anything in $E_0$ (it considers more situations). And, finally, $Q_1$ is better than $Q_1'$, by whatever definition of 'better' we're using.

## Feature-split relations

The morphisms/relations defined so far use $E$ and $Q$ - but they don't make any use of $F$. Here is one definition that does make use of the feature structure.

Say that the generalised model $M = (F, E, Q)$ is *feature-split* if $F = \sqcup_{i=1}^n F_i$ and $E = \times_{i=1}^n E_i$, such that $E_i \subseteq 2^{\bar{F}_i}$.

Note that $F = \sqcup_{i=1}^n F_i$ implies $W = 2^{\bar{F}} = \times_{i=1}^n 2^{\bar{F}_i}$, so $\times_{i=1}^n E_i$ lies naturally within $W$.

Designate such a generalised model by $M = (\{F_i\}, E, Q)$.

Then a feature-split relation between $M_0 = (\{F_0^i\}, E_0, Q_0)$ and $M_1 = (\{F_1^i\}, E_1, Q_1)$ is a morphism $r$ that can be written as $r = (r_1, r_2, \ldots, r_n)$, with $r_i$ a relation between $E_0^i$ and $E_1^i$.
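A feature-split relation can be sketched as a tuple of component relations acting coordinatewise on product environments (a hypothetical encoding, with environments as tuples):

```python
def product_relation(rs):
    """Combine component relations r_1, ..., r_n into a relation between
    product environments (tuples): two tuples are related iff each pair of
    coordinates is related by the corresponding component relation."""
    def related(e0, e1):
        return all((e0[i], e1[i]) in rs[i] for i in range(len(rs)))
    return related

# Two component relations between single-coordinate environment sets.
r1 = {("a", "A")}
r2 = {(0, 0), (1, 0)}
r = product_relation([r1, r2])
```

This only builds the relation; whether the result is a morphism still depends on the weight distributions $Q_0$ and $Q_1$, as before.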

[1] I'm not fully sold on category theory as a mathematical tool, but it's certainly worthwhile to formalise your mathematical structures so that they can fit within the formalism of a category; it makes you think carefully about what you're doing. ↩︎

[2] There is a slight abuse of notation here: $r : 2^{E_0} \to 2^{E_1}$ and $r^{-1} : 2^{E_1} \to 2^{E_0}$ are not generally inverses. They are inverses precisely for the $r$-stable sets discussed further down in the post. ↩︎