A research agenda for the final year

Mitchell_Porter

Since the start of 2026, I've been asking myself: suppose this is the final year before humanity loses control to AI. What should I do, and where should I focus? I now have an answer. The plan is to tackle three questions:

What is the correct ontology?

What is the correct ethics?

What are ontology and ethics in an AI?

A few comments about my perspective on these questions...

What is the correct ontology?

The standard scientific answer would be to say that the world consists of fundamental physics and everything made from that. That answer defines a possible research program.

However, we also know that we don't know how to understand anything to do with consciousness in terms of that framework. This is a huge gap, since the entirety of our experience occurs within consciousness.

This suggests that in addition to (1) the purely physics-based research program, we also need (2) a program to understand the entirety of experience as conscious experience, and (3) research programs that take the fuzzy existing ideas about how consciousness and the physical world are related, and develop them rigorously and in a way that incorporates the whole of (1) and (2).

In addition to these, I see a need for a fourth research program which I'll just call (4) philosophical metaphysics. Metaphysics in philosophy covers questions like: what is existence, what is causality, what are properties, what are numbers, and so on. Some of these questions also arise within the first three ontological research programs, but it's not yet clear how it will all fit together, so metaphysics gets its own stream for now.

What is the correct ethics?

In terms of AI, this is meant to bear upon the part of alignment where we ask questions like, what should the AI be aligned with, what should its values be?

But I'm not even sure that "ethics" is exactly the right framework. I could say that ethics is about decision-making that involves "the Good", but is that the only dimension of decision-making that we need to care about? Classically in philosophy, in addition to the Good, people might also talk about the True and even the Beautiful. Could it be that a correct theory of human decision-making would say that there are multiple kinds of norms behind our decisions, and it's a mistake to reduce it all to ethics?

This is a bit like saying that we need to know the right metaethics as well as the right ethics. Perhaps we could boil it down to these two questions, which define two ethical research programs:

(1) What is the correct ontology of human decision-making?

(2) Based on (1), what is the ideal to which AI should be aligned?

What are ontology and ethics in an AI?

My assumption is that humanity will lose control of the world to some superintelligent decision-making system - it might be an AI, it might be an infrastructure of AIs. The purpose of this 2026 research agenda is to increase the chances that this superintelligence will be human-friendly, or that it will be governed by the values that it should be governed by.

Public progress in the research programs above has a chance of reaching the architects of superintelligence (i.e. everyone working on frontier AI) and informing their thinking and their design choices. However, it's no good if we manage to identify the right ontology and the right ethics, but don't know how to impart them to an AI. Knowing how to do so is the purpose of this third and final leg of the 2026 research agenda.

We could say that there are three AI research programs here:

(1) Understand the current and forthcoming frontier AI architectures (both single agent and multi-agent)

(2) Understand in terms of their architecture, what the ontology of such an AI would be

(3) Understand in terms of their architecture, what the ethics or decision process of such an AI would be

Final comments

Of course, this research plan is provisional. For example, if epistemology proved to require top-level attention, a fourth leg might have to be added to the agenda, built around the question "What is the correct epistemology?"

It is also potentially vast. Fortunately, in places it significantly overlaps with major recognized branches of human knowledge. One hopes that specific important new questions will emerge as the plan is refined.

A rather specific but very timely question is how a human-AI hivemind like Moltbook could contribute to a broad fundamental research program like this. I expect that the next few months will provide some answers to that question.

I've been meaning to ask - in what sense are some states of entangled electrons more objectively different from other states of entangled electrons, than some microstates are objectively different from other microstates when it comes to their function (in the sense of functionalism)?

For the reader who is unfamiliar: This refers to a position I have previously taken with respect to ontology of mind. I will put a version of it here in quotes, for future reference. I apologize in advance for the length and complexity of my discussion below.

I have said: Qualia exist objectively, but functional states are inherently vague when it comes to microphysical details. There are edge cases, there is a sorites problem, which prevent the definition of a functional state from being made microphysically exact for all physical states, in a non-arbitrary way. But it needs to be exact, if it is to be part of a psychophysical bridge law that specifies for each possible physical world, what qualia that world contains.

In a dilemma between computation-based and substrate-based theories of consciousness, I have therefore preferred the latter, in the sense that I prefer a theory of qualia which is based on states and properties whose existence is just as objective.

At the same time, I have also said: Conscious states are complex unities in which numerous qualia are united in some way; perhaps they correspond to entangled states, these being complex in various ways, while also not being tensor products (tensor products being the standard way to construct a mereological sum in quantum mechanics).
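To make "not being tensor products" concrete, here is a minimal sketch for the simplest case, two qubits. It is a toy illustration only, not a model of anything in the brain. It uses the standard fact that a pure state a|00> + b|01> + c|10> + d|11> factorizes into a tensor product exactly when ad - bc = 0; the function names are my own and purely illustrative.

```python
import math

def is_product_state(amps, tol=1e-12):
    """Two-qubit pure state given as amplitudes (a, b, c, d) for
    |00>, |01>, |10>, |11>. It is a tensor product iff a*d - b*c == 0."""
    a, b, c, d = amps
    return abs(a * d - b * c) < tol

def concurrence(amps):
    """Standard entanglement measure for a two-qubit pure state:
    0 for a product state, 1 for a maximally entangled one."""
    a, b, c, d = amps
    norm = sum(abs(x) ** 2 for x in amps)
    return 2 * abs(a * d - b * c) / norm

s = 1 / math.sqrt(2)
unentangled = (0.5, 0.5, 0.5, 0.5)  # (|0>+|1>)/sqrt(2) tensor (|0>+|1>)/sqrt(2)
bell = (s, 0.0, 0.0, s)             # (|00>+|11>)/sqrt(2)

print(is_product_state(unentangled), concurrence(unentangled))
print(is_product_state(bell), concurrence(bell))
```

The point is that "entangled or not" is an exact property of the state vector, with no appeal to a level of description; whether that exactness survives at the level of the universal wavefunction is the issue taken up below.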

But @green_leaf is asking, how are entangled states any more objective than functional states?

This is a valid question because, if you look at things from the perspective of a wavefunction of the universe, everything is entangled with everything else! There is a risk that constructing exact states for subsystems of the universe will once again require arbitrary choices.

But first a few words on distinctness of states and exactness of properties in quantum theory.

First of all, let me emphasize that according to the Copenhagen interpretation of quantum theory, wavefunctions are not "elements of physical reality"; they simply codify the knowledge of an observer. The elements of physical reality are the "observables". Schrödinger and Einstein criticized this framework as necessarily an incomplete description of reality.

The most elegant heir to the Copenhagen interpretation is what I'll call the Hartle multiverse model (though Gell-Mann and Omnès worked on it too). This has a wavefunction of the universe, then a set of observables (e.g. field values and/or field momenta at particular space-time locations) whose possible values define a set of possible histories. If these histories all satisfy the technical property of being mutually decoherent, then each history inherits an a priori probability from the universal wavefunction, and you can derive ordinary quantum mechanics from conditional probabilities within this ensemble of possible histories.
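For reference, the decoherence condition can be written down compactly. What follows is the usual textbook form of the decoherence functional, stated here as a sketch; the notation (class operators C_\alpha, projectors P, initial state \rho) is the conventional one and is not taken from anything above.

```latex
% A history \alpha is a time-ordered chain of Heisenberg-picture projectors,
% one projector per observable value; \rho = |\Psi\rangle\langle\Psi| is the
% state given by the wavefunction of the universe.
C_\alpha = P^{(n)}_{\alpha_n}(t_n) \cdots P^{(1)}_{\alpha_1}(t_1),
\qquad
D(\alpha, \alpha') = \operatorname{Tr}\!\left[ C_\alpha \, \rho \, C_{\alpha'}^{\dagger} \right]

% Mutual decoherence: interference between distinct histories vanishes,
% and the diagonal entries are then consistent probabilities.
D(\alpha, \alpha') = 0 \quad \text{for } \alpha \neq \alpha',
\qquad
p(\alpha) = D(\alpha, \alpha)
```

When the off-diagonal terms vanish, the p(\alpha) add up correctly under coarse-graining, which is what licenses treating the histories as an ensemble with ordinary probabilities.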

This formalism in itself is not yet a full-fledged ontological interpretation of quantum theory. For that, I add the further postulate that these decoherent histories are maximally fine-grained - you can't add any more observables while retaining the decoherence condition. This does not yet single out a unique ensemble - Dowker and Kent pointed out that there's a vast number of choices for the maximally fine-grained observables.

But a few extra postulates might suffice to single out a unique ensemble. Maybe a rule, similar to a cellular automaton rule, that determines the observables. Maybe a principle that the a priori probabilities must all be equal. At this point you'd have a multiverse theory with no ambiguity about what is posited to exist, and no problem of some worlds having a larger probability measure than others.

That's all a digression but I'll return to it later.

Strictly speaking, according to the Copenhagen interpretation, wavefunctions are not fundamental physical entities; they are just epistemic states. However, most quantum physicists talk like de facto wavefunction realists, and any choice of definite values for the observables can be encoded in a wavefunction, the corresponding eigenstate. So I'll talk as a wavefunction realist from now on.

Returning finally to distinctness and exactness... An eigenfunction of an observable is definitely exact. If the observable also has a discrete spectrum of possible values, such as the energies of an electron bound to an atomic nucleus, the eigenfunctions will also be inarguably distinct: the different orbitals in an atom are separated from each other by a quantum jump in the energy.

However, it's the exactness of the state that I was after. I have no problem with a continuum of quantum states being mapped onto a continuum of qualic states. I have a problem with psychophysical mappings which get microphysically vague on the physical side, because if we ask what qualia are present in an edge case, there's no definite answer. At worst, you could even end up with no definite answer about whether or not a given possible physical world contains a person, a conscious being.

Now let us consider entangled wavefunctions. They give us a whole new set of properties which, in principle, could be part of a psychophysical correspondence between quantum and quale. There are not only the various measures of entanglement, which quantify how much entanglement is present; there are the different forms of multipartite entanglement (e.g. Borromean states, a form of tripartite entanglement analogous to the Borromean rings, no two of which are linked, but which as a trio cannot be separated). I'm not really sure how rich these possibilities are, but they are a novel kind of physical property on which conscious states might supervene.
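As a rough illustration of the "Borromean" pattern, here is a small sketch using the three-qubit GHZ state, which is a commonly used textbook example of it. Treating GHZ as a stand-in for what I called Borromean states is an assumption made for this example, and the helper name is my own. The sketch traces out one qubit and shows that the remaining pair is left in a plain classical mixture of |00> and |11>, with no entanglement between them.

```python
import math
from itertools import product

# GHZ state (|000> + |111>)/sqrt(2); amplitudes are real, indexed by bit triples
s = 1 / math.sqrt(2)
ghz = {bits: 0.0 for bits in product((0, 1), repeat=3)}
ghz[(0, 0, 0)] = s
ghz[(1, 1, 1)] = s

def reduced_pair(state, traced):
    """Reduced density matrix of the two remaining qubits after tracing
    out qubit `traced`. Assumes real amplitudes, so no complex conjugate."""
    keep = [i for i in range(3) if i != traced]
    rho = {}
    for x in state:
        for y in state:
            if x[traced] != y[traced]:
                continue  # partial trace keeps only matching traced bits
            row = (x[keep[0]], x[keep[1]])
            col = (y[keep[0]], y[keep[1]])
            rho[(row, col)] = rho.get((row, col), 0.0) + state[x] * state[y]
    return rho

for traced in range(3):
    rho = reduced_pair(ghz, traced)
    nonzero = {k: round(v, 6) for k, v in rho.items() if abs(v) > 1e-12}
    print(traced, nonzero)
```

Whichever qubit is removed, only the two diagonal entries for |00> and |11> survive, each with weight one half. The coherence between |00> and |11> is gone, so the pair is separable even though the trio was entangled.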

However, I already mentioned the issue that validates @green_leaf's question: if the universal wavefunction is the ultimate objective description of the physical world, then everything is entangled with everything else. For example, all occurrences of any given species of fermion, such as all electrons, are antisymmetrically entangled with each other. This is implied by the spin-statistics theorem, and it is what implements the Pauli exclusion principle, which keeps the electrons (in atoms and molecules) in their separate orbitals. Wavefunctions describing just a few entangled entities, such as show up in quantum chemistry and quantum computing, are truncations of this universal entanglement, and have no particular claim to objective significance. There is a psychophysical sorites problem, not just for functionalism, but for "wavefunctionalism".
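The antisymmetry point can be seen in the smallest possible case, two identical fermions. The sketch below builds the antisymmetrized two-particle amplitude from two single-particle amplitudes. The "orbitals" are hypothetical one-dimensional toy functions chosen only for illustration, not real atomic orbitals, and spin is ignored.

```python
import math

def slater_pair(phi1, phi2):
    """Antisymmetrized two-fermion amplitude built from two
    single-particle amplitudes (a 2x2 Slater determinant)."""
    def psi(a, b):
        return (phi1(a) * phi2(b) - phi2(a) * phi1(b)) / math.sqrt(2)
    return psi

# Hypothetical toy orbitals on a 1D coordinate (illustrative only)
def orbital_even(x):
    return math.exp(-x * x)

def orbital_odd(x):
    return x * math.exp(-x * x)

psi = slater_pair(orbital_even, orbital_odd)
same = slater_pair(orbital_even, orbital_even)

print(psi(0.3, 1.1), psi(1.1, 0.3))  # equal magnitude, opposite sign
print(same(0.3, 1.1))                # vanishes: both particles in one orbital
```

Swapping the two particles flips the sign of the amplitude, and putting both in the same orbital makes it vanish identically, which is the exclusion principle. Note that the antisymmetrized amplitude is not a simple product of the two orbitals: that is the sense in which even two electrons in different orbitals are entangled.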

It is possible that dynamics within the universal wavefunction does produce localized temporary examples of complete disentanglement. Maybe a natural mereology could be built on this. But otherwise, my only counter-proposal would be a version of the maximally fine-grained Hartle multiverse which, to my knowledge, has never been investigated: one in which the observables, the elements of physical reality, are "multipartite" in some way. Since in fundamental physics we deal with quantum fields, I think the logical candidates are observables associated with extended objects, like "Wilson loops" and "surface operators". Interestingly, Lee Smolin worked both on a version of loop quantum gravity in which the physical states are eigenfunctions of gravitational Wilson loops, and on a version of "quantum causal histories" which might be sufficiently general to allow for a Hartle multiverse with multipartite observables. It would be interesting to implement something like these in a well-explored modern framework like AdS/CFT.

If something like this turns out to be viable, not just as physics but as psychophysics, then functionalism's emphasis on causality and representation will still be relevant! It's just that, to produce specific conscious states, causal structure alone would not be enough: the substrate would need to be these fundamental extended observables, and not virtual state machines running at a more coarse-grained level of description.