This dialogue is part of the agent foundations fellowship with Alex Altair, funded by the LTFF. Thank you Dalcy, Alex Altair and Alfred Harwood for feedback and comments.
Context: I (Daniel) am working on a project about ontology identification. I've found conversations to be a good way to discover inferential gaps when explaining ideas, so I'm experimenting with using dialogues as the main way of publishing progress during the fellowship.
We can frame ontology identification as a robust bottleneck for a wide variety of problems in agent foundations & AI alignment. I find this helpful because the upstream problems can often help us back out desiderata that we want to achieve, and allow us to pin down theories/solutions that we're looking for:
Rephrasing in my terms:
- A lot of human concepts are concepts whose relationships generalize across a wide variety of other concepts, which means we want to separate the concept from the specific context that it's in
- In order to do this, we need to structure the concept in a way that contains some relational information about how it interacts with other variables, but leaves out other relational details that we want to abstract over
This part sounds important, but I don't get it.
Does that sound broadly right?
Yep that sounds like what I had in mind
One basic property of such a modeling formalism is making the "relational information" into an explicit variable (rather than a derived thing) that other elements of the formalism can directly access.
And importantly, this allows us to move things like higher-order terms/natural latents across different contexts and still be able to make sense of their meaning in that context.
> This part sounds important, but I don't get it.
So when you have a higher-order term like "behind", it's a term that generalizes across a wide variety of contexts (we can say "A is behind B" for a wide variety of As and Bs). So our mental representation of the word "behind" should contain the "relational information" that tells us how it interacts with a given context, but we also want to abstract over/throw out contextual information that is way too specific (e.g. what objects A and B are in a specific instance: "behind" shouldn't be defined as a particular spatial relation to a table or a cat or a house or any other specific object.)
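To make that a bit more concrete, here's a toy sketch (nothing here is part of any formalism; `Obj`, `position`, and `viewpoint` are made-up stand-ins): the representation of "behind" carries exactly the relational information it needs, and ignores everything context-specific about the objects.

```python
# Toy sketch: "behind" as a relation that packs its own relational information
# (how it applies to any two located objects) while abstracting over what the
# objects concretely are. `Obj`, `position`, and `viewpoint` are invented here.

from dataclasses import dataclass

@dataclass
class Obj:
    name: str            # "table", "cat", "house", ... -- irrelevant to "behind"
    position: float      # 1D stand-in for spatial location

def behind(a: Obj, b: Obj, viewpoint: float = 0.0) -> bool:
    """A is behind B if A is further from the viewpoint than B."""
    return abs(a.position - viewpoint) > abs(b.position - viewpoint)

# The same relation transfers across contexts without being redefined:
print(behind(Obj("cat", 5.0), Obj("table", 2.0)))   # True
print(behind(Obj("house", 1.0), Obj("tree", 3.0)))  # False
```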
Another angle/motivation I'm thinking of is in the context of Solomonoff induction:
This seems exciting but I don’t fully understand! Maybe this example can help clear up where I’m struggling.
Humans have a kind of in-built model of physics which encodes naive pre-Newtonian intuitions like “If I shove a thing, it will move”. As we learn more about physics, we learn that this model of the universe is wrong and we update it with relativity/quantum mechanics/whatever. But if I have to pick up a chair and move it to a different room, I’m not using relativity to work out how I should move it, I’m using my pre-Newtonian intuitions. So in some sense, that instrumental part of my world model has remained unchanged despite me updating my beliefs. But I don’t think that this means that elements of my ontology have stayed the same. Modern physics is ontologically completely different to naive physics. It seems to me that upon learning modern physics, one’s ontology changes completely, but there is still some instrumental value in keeping the old ontology around to be used as a quick and dirty (and computationally cheap) approximation when I need to pick up a chair. But I don’t think this is the same thing as saying that the concepts have remained ‘invariant’ as one goes from using naive physics to modern physics.
For this example would you say that, upon the agent learning modern physics, the ontology has changed almost entirely (because the principles/concepts behind the different models of the world are completely different) or only a little bit (because learning modern physics doesn’t affect the majority of actions that an agent takes)? Or something else?
So in this example we have two possible viewpoints:
I think both of these viewpoints are reasonable and valid, but for the purpose of ontology identification, we want to take the first perspective because:
What this means is that we want to structure our concepts in a way that can adapt to ontology shifts: My mental representation of a chair should only capture the information that is shared between a wide variety of "theories about chairs". I might currently believe that chairs are made of atoms, but if it turns out that they're made of quantum fields, I can still carry on making the same predictions about chair-related things, because my concept of a chair does not rely on a specific theory about "what chairs are".
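As a loose illustration (the class names and "predictions" below are invented, and this obviously isn't a serious model of concepts): the theory-independent regularities live in the concept itself, while "what chairs are made of" is delegated to whichever low-level theory is currently believed.

```python
# Loose illustration: a concept that only stores theory-independent regularities,
# so its predictions survive a shift in the underlying ontology. All names here
# (ChairConcept, AtomTheory, ...) are invented for the example.

class AtomTheory:
    def substrate(self) -> str:
        return "atoms"

class QuantumFieldTheory:
    def substrate(self) -> str:
        return "quantum fields"

class ChairConcept:
    # Theory-independent predictions: unchanged across ontology shifts
    def supports_sitting(self) -> bool:
        return True

    def made_of(self, theory) -> str:
        # Theory-dependent detail, deliberately kept out of the concept itself
        return theory.substrate()

chair = ChairConcept()
print(chair.supports_sitting())              # same prediction under either theory
print(chair.made_of(AtomTheory()))           # "atoms"
print(chair.made_of(QuantumFieldTheory()))   # "quantum fields"
```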
So now I want to introduce some minimal examples for how we can have a "unit" of a world model that "packs" enough relational information inside that unit such that we can interpret its meaning in isolation, without having to reference anything else in the world model. We'll call this property relational completeness, and we write $R(x)$ for "$x$ is relationally complete / we can interpret the semantics of $x$ from $x$ itself".
An example of something that is not relationally complete is the parameters and activations of a particular neuron, because the parameters do not tell us where the neuron is located inside the network, which is part of what defines the "semantics" of the neuron's activation (i.e. what is implied by the neuron's activation).
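A rough sketch of the contrast (the field names are made up): a bare (parameters, activation) pair fails $R(x)$, while a version that bundles in its indexical information is at least a step closer.

```python
# Rough sketch of the neuron example. The field names are invented; the point is
# only that the second representation carries the indexical information that the
# first one leaves implicit in the surrounding network.

from dataclasses import dataclass

@dataclass
class BareNeuron:
    weights: list[float]
    activation: float
    # Not relationally complete: nothing here says which layer/position this is,
    # or which upstream variables feed into it.

@dataclass
class LocatedNeuron:
    weights: list[float]
    activation: float
    layer: int                  # indexical information that was previously
    index_in_layer: int         # only implicit in the surrounding network
    upstream_indices: list[int] # which variables this neuron reads from
```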
To demonstrate a minimal example of something that is "relationally complete", we make the following assumptions:
Given these assumptions, we want to demonstrate that relational completeness is a "compositional" property, where the relational completeness of a component C "enables" the relational completeness of other components that depend on C. We do this by considering the following induction proof sketch:
The property that I want to zoom in on is that each $f$ only specifies its "local" relationship with the variables that it directly interacts with (i.e. the variables that $f$ directly takes as input). But for something to be relationally complete, we would expect that it has to contain information about its global relationship all the way down to the sensory inputs, since that's what it takes for an object to encode an equivalence class over sensory inputs (which is how we define the semantics of an object in this setting). However, in this case it seems like we can achieve relational completeness just by including "local" relational information.
The intuition behind this is that when we have an object that is relationally complete, by definition, all information about the semantics of that object is contained within the object itself; any relevant relational information about how that object is computed from upstream variables is already contained in the object, which means that when we try to derive downstream variables on top of that object, we don't need to go back upstream to retrieve relational information.
In other words, a relationally complete object mediates the semantic information between upstream and downstream variables, and this is what allows relational completeness to be a compositional property, where the relational completeness of upstream objects enables the relational completeness of downstream variables.
An analogy: if you're playing the game of telephone, you can think of a "relationally complete" messenger as one who can fully explain how the current message was derived from the original source message. Once you have access to such a messenger, you don't need to go back upstream to ask the previous messengers anymore, and it also becomes easier for you to become a "relationally complete" messenger yourself, because they pass that information onto you (which is where compositionality comes in).
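Here's a toy sketch of that "passing the information downstream" step (purely illustrative; representing semantics as an explicit dictionary of sensory histories is a made-up simplification): once each upstream value carries its equivalence class of histories, a downstream variable's semantics can be computed locally, without going back to the sensory inputs.

```python
# Toy sketch: if each upstream value a carries its semantics (an equivalence
# class of sensory histories), then for x = f(a) we can derive the semantics of
# each x-value locally, by running f over the a-values and taking unions of the
# upstream classes. The dictionary encoding is a made-up simplification.

def import_semantics(semantics_a: dict, f) -> dict:
    """semantics_a maps each a-value to a set of sensory histories; returns the
    induced semantics of the derived variable x = f(a)."""
    semantics_x: dict = {}
    for a, histories in semantics_a.items():
        semantics_x.setdefault(f(a), set()).update(histories)
    return semantics_x

# Four a-values, each summarizing one history; f coarsens them further.
sem_a = {0: {"h1"}, 1: {"h2"}, 2: {"h3"}, 3: {"h4"}}
sem_x = import_semantics(sem_a, lambda a: a % 2)
print(sem_x)   # {0: {'h1', 'h3'}, 1: {'h2', 'h4'}}
```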
Cool! Let me see if I understand. So you have a proof that if you take a set of relationally complete objects and apply a computable function, then the resulting set (along with a specification of the function) is also relationally complete. This is because you can run the function on all possible a-values to find out which a-values generate which x-values and then 'import' the meaning from the set A to the corresponding elements in set X.
You can then apply this iteratively/inductively, so that repeatedly applying functions leads to more relationally complete sets. You then postulate that sensory input is relationally complete, so that gives the first step upon which you can then build the inductive proof. (Tell me if this is right so far!) Glancing at it, I think I buy this proof.
The thing that I'm not sure about is whether sensory inputs actually are relationally complete in the sense you describe. Are you just postulating that they might be in order to get the proof going, or is there a strong reason for thinking that they are?
Most likely I'm misunderstanding the concept of relational completeness, but how is it possible that the 'meaning' of sensory input is interpretable in isolation? If two people are listening to the same piece of spoken word audio but one of them understands the language being spoken and the other doesn't, they will ascribe a different meaning to it, even if their sensory inputs are exactly the same. Could you flesh out what it means in practice for sensory inputs to be relationally complete? Alternatively, are there any other obvious/simple examples of relationally complete objects?
> You can then apply this iteratively/inductively, so that repeatedly applying functions leads to more relationally complete sets. You then postulate that sensory input is relationally complete, so that gives the first step upon which you can then build the inductive proof. (Tell me if this is right so far!) Glancing at it, I think I buy this proof.
Yep that seems correct to me! (P.S. I intentionally made an error for simplification which I'll mention later)
> The thing that I'm not sure about is whether sensory inputs actually are relationally complete in the sense you describe. Are you just postulating that they might be in order to get the proof going, or is there a strong reason for thinking that they are?
Good question. So I should clarify that when I say an object O is not relationally complete, I expect that I need to add something else in the world model such that "O + that something else" will be relationally complete. In the neural network example, the parameters + activations of a neuron aren't relationally complete because I need to add information about where that neuron is located inside the network relative to everything else.
An implicit assumption is that all information about semantics must come from the world model, and we consider sensory variables relationally complete because they are fundamental in the sense that they are used to derive everything else and aren't derived from anything else.
A longer answer is that sensory observations are macrostates which induce an equivalence class over the set of environments (microstates) that can result in those sensory observations, and that equivalence class is the actual "semantics" of those sensory observations. Importantly, "semantics" in this sense is an objective, observer-independent property, and that still holds even when different observers ascribe different "subjective" meaning to those sensory observations.
So when it comes to ontology identification, we want to make sure that we can isolate relationally complete components from the world model in the "observer-independent" semantics sense. But after that, we have to make sure that we as observers are making the correct interpretations about those relationally complete objects, which is an additional task.
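A tiny illustration of the "observer-independent semantics" point (the environments and the observation function here are of course invented): the meaning of a sensory reading is just the set of environments that could have produced it, regardless of what any particular observer makes of it.

```python
# Tiny illustration: the "meaning" of a sensory reading as the equivalence class
# of environments (microstates) that produce it. The environments and the
# observation function are invented for the example.

environments = ["sunny-park", "sunny-beach", "rainy-park", "rainy-beach"]

def observe(env: str) -> str:
    return "bright" if env.startswith("sunny") else "dim"

def semantics(reading: str) -> frozenset:
    """Observer-independent semantics: the preimage of the reading."""
    return frozenset(e for e in environments if observe(e) == reading)

print(semantics("bright"))   # frozenset({'sunny-park', 'sunny-beach'})
```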
So I actually cheated a little in this step of the proof sketch:
> 4. Just to spell out what this means more concretely:
> - We can treat sensory inputs (histories) $X_0$ as zeroth-order variables, which are relationally complete
> - We can have first-order variables $(f_1, x_1) \in X_1$, where $f_1$ is a function over subsets of $X_0$, which are relationally complete by hypothesis
> - We can have any $n$th-order variables $(f_n, x_n) \in X_n$, where $f_n$ is a function over subsets of $X_{n-1}$, which are relationally complete by induction
because I'm assuming an ordering of which functions are applied after which other functions, but that information is not specified in the variables themselves. For these variables to actually be relationally complete, we need to encode that information within the objects themselves; we can't have any overarching structural information outside of those objects.
To fix this, we need to somehow add another type of entity to the pair $(f, x)$ that allows us to encode the order in which the functions are applied inside the objects themselves, so that we don't have to impose a structure outside of the objects. In addition, we want the resulting relationally complete object to be maximally expressive: For instance, we don't want our relationally complete object to only support a fixed computational DAG; we want the ordering of function composition to be able to dynamically adapt to the context. A useful analogy is to think about function calls in regular programs:
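(Plain Python here, just to illustrate the analogy rather than the formalism itself: the order in which helpers run isn't given by some external wiring diagram; each function's body says what it calls, and that can depend on the runtime context.)

```python
# Ordinary Python, just to illustrate the analogy: the composition order isn't
# specified by an external DAG; each function's body determines which functions
# it calls, and that order can adapt dynamically to the context.

def normalize(x):
    return x / 10

def threshold(x):
    return x > 0.5

def classify(x, strict: bool):
    # The call order lives inside this function and depends on `strict`.
    if strict:
        return threshold(normalize(x))
    return normalize(x)

print(classify(7, strict=True))    # True
print(classify(7, strict=False))   # 0.7
```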
Our goal is to take this sort of structure and use it to encode the order of function composition inside the relationally complete objects themselves, so that we don't need to specify any additional structure on top of those relationally complete objects. To do this, we need to add an object $r$ with a particular type signature, so that each relationally complete object is a tuple $(f, r, x)$, and we should be able to figure out the order of function composition (which may be context dependent) just by looking at the collection of relationally complete objects:
Imagine that we have two variables $x_1, x_2$ with a functional relationship $f(x_1) = x_2$. One of the ways of framing relational completeness is that we want to split the information about this function $f$ into two components $f_1$ and $f_2$, such that we can rederive the relationship between $x_1$ and $x_2$ entirely from the pair $(f_1, x_1), (f_2, x_2)$. We want to think of $f_1$ as the information that "belongs to $x_1$" and $f_2$ as the information that "belongs to $x_2$".
However, if these are the only two variables that we're considering, then it seems like there are various ways of splitting $f$ that are equally valid: We could consider putting all of the information about $f$ into $f_2$ while leaving $f_1$ empty, but the opposite choice of putting all information about $f$ into $f_1$ seems equally valid. In other words, there's no unique, objective way to "pack" relational information inside the objects.
But now suppose that we have $n+1$ variables $x_1, \dots, x_{n+1}$ where $x_{n+1}$ is computed from $x_1, \dots, x_n$ by $x_{n+1} = \bigotimes_{i=1}^{n} f_i(x_i)$, where $\otimes$ represents some form of aggregation of information. In this case, we want to split the $n$ functions $f_i$ into $n+1$ parts $f_i$ ($i \in \{1, \dots, n+1\}$), where $f_i$ represents the relational information associated with $x_i$. Contrary to before, there is an "objectively correct" way of splitting the function in some sense: Namely, if there is some information that is redundantly represented in all (or multiple) $f_i$'s, then we should put that information in $f_{n+1}$, because that allows us to store only one copy of that information (whereas storing it in all of the $f_i$, $i \in \{1, \dots, n\}$, would result in multiple copies of the same information).
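A toy numerical illustration of that "objectively correct" split (everything here, including the `shared` sub-computation, is invented): if every $f_i$ redundantly applies the same sub-computation, we can store its description once, as part of the relational information belonging to $x_{n+1}$, instead of once per $f_i$.

```python
# Toy illustration: each f_i redundantly applies the same sub-computation
# `shared`; the "correct" split stores one copy of `shared` (as part of
# x_{n+1}'s relational information) and leaves only the x_i-specific parts
# in the f_i.

def shared(x):                                   # redundant across every f_i
    return 2 * x + 1

f = [lambda x, i=i: shared(x) + i for i in range(3)]     # before the split

f_specific = [lambda x, i=i: x + i for i in range(3)]    # after: each f_i keeps
                                                         # only what's specific
def aggregate(xs):
    # x_{n+1}'s relational information applies `shared` at aggregation time,
    # so only one copy of its description is stored.
    return sum(f_specific[i](shared(xs[i])) for i in range(3))

xs = [1, 2, 3]
assert aggregate(xs) == sum(f[i](xs[i]) for i in range(3))
print(aggregate(xs))   # 18
```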
Our current formalization of relational completeness does enable this form of function splitting: Ignoring the $r$ component for a moment, consider two objects $(f_1, x_1), (f_2, x_2)$ where $x_1 = f_1(f_2, x_2)$. An equivalent way of expressing this is to curry the function $f_1$, so that it takes in $f_2$ and returns a function that maps $x_2$ to $x_1$. In other words, $f_1(f_2)$ returns another function $g$, and $g(x_2) = x_1$.
We can then consider the case where $f_1$ may take a wide range of other functions/objects $f_i$ as argument, so that each $f_1(f_i)$ returns a corresponding function $g_i$.
Then, if there is some information that is represented in a wide variety of $g_i$'s, a simplicity prior forces us to shift that information into $f_1$, so that we only have to store one copy of the redundant information.
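In code, the currying move looks something like this (plain Python, with the concrete functions made up for the example):

```python
# Minimal currying sketch (the concrete functions are invented): f1 takes another
# object's relational information f2 and returns the map g from x2 to x1,
# i.e. f1(f2)(x2) == x1 in the notation above.

def f2(x):                       # relational information belonging to x2
    return {"scaled": 3 * x}

def f1(f_other):                 # relational information belonging to x1
    def g(x_other):
        return f_other(x_other)["scaled"] + 1
    return g

x2 = 4
x1 = f1(f2)(x2)
print(x1)                        # 13
```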
However, we don't currently have a way of doing the same thing on the output side: Suppose that we have $n+1$ variables where $x_i = g_i(x_1)$, $i \in \{2, \dots, n+1\}$, and we want to split these functions into $n+1$ parts $u_i$ ($i \in \{1, \dots, n+1\}$), where each $u_i$ is the relational information associated with $x_i$. Similar to before, if there is some information redundantly represented across multiple $g_i$'s, we want to shift that information onto $u_1$, so that we only store one copy of the information. The issue is that for a relationally complete object $(f_1, x_1)$, $f_1$ already occupies the role of capturing the redundant relational information on the input side, so we need something else to capture the redundant relational information on the output side.
One simple fix is to add another component $u$ to our relationally complete object, so that each object is defined as $(f, r, u, x)$, where $u$ represents the information that is redundant across the relationships from $(f, r, u, x)$ to other objects that use information from it. Changing $u$ doesn't affect how $x$ is computed from other objects; it only affects how the information from $x$ is used.
We can also think of this modification as a way of adding expressivity to the objects: Originally, once we define how two objects $(f_1, r_1, x_1), (f_2, r_2, x_2)$ aggregate information from other objects (which is defined by $f$ and $r$), that fully defines the functional relationship between $(f_1, r_1, x_1)$ and $(f_2, r_2, x_2)$, and there are no additional degrees of freedom that allow us to change the functional relationship between them (without changing how they aggregate information from other objects). Adding the $u$ component gives us that additional degree of freedom, while also allowing us to capture the redundant relational information on the output side.
Another way of thinking about relational completeness is that we know each variable must be represented in some kind of format, and we want to associate each variable with a description of its format, so that downstream variables can take that description and figure out how to use the information from that variable. The first obvious piece of relevant description of a variable is "how that variable is computed from other variables", and that piece of information is captured by the $f$ and $r$ components, while $u$ represents all the rest of the relevant description. Note that this "description of format" is used by all downstream variables, which reflects the fact that it is redundantly represented across the relational information on the output side.
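For concreteness, here's a very rough sketch of the intended shape of such an object (the field types are placeholders; in particular, nothing here pins down the actual type signature of $r$):

```python
# Very rough sketch of the intended shape of a relationally complete object.
# The field types are placeholders, not a worked-out type signature.

from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class RCObject:
    f: Callable[..., Any]  # how x aggregates information from upstream objects
    r: Any                 # placeholder: which objects feed in, and in what order
    u: Any                 # description of x's format, shared by all downstream users
    x: Any                 # the value itself
    # A downstream object should only ever need (f, r, u, x); it never has to
    # consult an external wiring diagram to know what x means or how to use it.
```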
Minimal latent across potential hypotheses: I previously mentioned that the features of the ontology we're trying to identify 1. must be derivable from existing observations, since that's all we have access to, and 2. must continue to work in the future as we update our hypothesis. These two assumptions together imply that we're looking for the minimal latent across a wide variety of likely potential hypotheses. Now consider the following learning procedure:
Suppose that we found a $P$ that satisfies this property: Notice that due to the simplicity prior, each $P \cup O$ is a likely hypothesis that reproduces our existing observations, and we can pinpoint different hypotheses by varying $O$, while the $P$ component mostly stays invariant. In other words, $P$ captures exactly the type of minimal latent that is redundantly represented across a wide variety of likely hypotheses given our existing observations. While that doesn't tell us everything we want to know about ontology identification, it does allow us to pinpoint a much smaller part of the search space of what we could be looking for. One of the reasons why relational completeness is important for this setup is that when each object contains all the relevant relational information about itself, modifications and augmentations (the $O$ component) of programs become much more straightforward, because we don't need to specify additional relationships between the modification ($O$) and the original program ($P$): the modification already contains all of that information.
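As a deliberately silly toy (not the learning procedure itself; the "hypotheses" below are just Python functions sharing a component $P$): different hypotheses that all reproduce the observations vary in their $O$ part while the $P$ part stays invariant, and that invariant part is the kind of thing we're after.

```python
# Deliberately silly toy: each "hypothesis" extends the same component P with a
# different O, and all of them reproduce the observations so far. P is the part
# that is redundantly present across the likely hypotheses.

observations = [0, 1, 4, 9, 16]                  # what we've seen so far

P = lambda n: n * n                              # candidate invariant component

O_variants = [                                   # different ways of extending P
    lambda n: P(n),                              # "it keeps being n^2"
    lambda n: P(n) if n < 10 else 0,             # "it breaks down later"
    lambda n: P(n) + n // 100,                   # "tiny correction far out"
]

surviving = [O for O in O_variants
             if all(O(n) == observations[n] for n in range(len(observations)))]
print(len(surviving))   # 3 -- the hypotheses differ in O but share P
```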
Implementing natural latents: Suppose that we're trying to find the natural latent of a collection of relationally complete objects (which we call observables), where the natural latent itself is represented by a relationally complete object. Relational completeness implies that the natural latent will have access to all of the information that defines the semantics of the observables, which makes the task of extracting the relevant latent a lot easier. In addition, we can expect natural latents to generalize to new contexts the same way that higher-order terms can generalize to objects we haven't seen before, because that new context will contain all the semantic information that defines how it should relate to the natural latent. A relationally complete natural latent can "figure out" how to aggregate information from a wide variety of contexts, and once that information is derived, a wide variety of contexts can adapt to that piece of information to make predictions.
In contrast, suppose that we're in a setting without relational completeness, such as trying to find natural latents in activations of a neural network: An immediate challenge is that most of the semantic information of the activations is just missing from the activations, which makes it difficult for us to find the minimal latent of that semantic information. To overcome this challenge, we essentially have to rederive that semantic information from somewhere else, such as by observing a wide range of samples. However, this doesn't tell us anything about how the natural latent should generalize to new activations that we've never seen before, and we have no guarantees that the natural latent will remain invariant since the relational/indexical information about those activations isn't guaranteed to remain invariant.