Research Lead at CORAL. Director of AI research at ALTER. PhD student in Shay Moran's group in the Technion (my PhD research and my CORAL/ALTER research are one and the same). See also Google Scholar and LinkedIn.
E-mail: {first name}@alter.org.il
Strictly speaking, there's no result saying you can't represent quantum phenomena by stochastic dynamics (a.k.a. hidden variables). Indeed, e.g. the de Broglie-Bohm interpretation does exactly that. What does exist is Bell's inequality, which implies that it's impossible to represent quantum phenomena by local hidden variables (local = the distribution is the limit of causal graphs in which variables are localized in spacetime and causal connections only run along future-directed timelike (not superluminal) separations). Now, our framework doesn't even fall in the domain of Bell's inequality, since (i) we have supracontributions (in this post called "ultracontributions") instead of ordinary probability distributions and (ii) we have multiple co-existing "worlds". AFAIK, Bell-inequality-based arguments against local hidden variables cover neither (i) nor (ii). As such, it is conceivable that our interpretation is in some sense "local". On the other hand, I don't know that it's local and have no strong reason to believe it.
The interpretation of quantum mechanics is a philosophical puzzle that has baffled physicists and philosophers for about a century. In my view, this confusion is a symptom of us lacking a rigorous theory of epistemology and metaphysics. At the same time, creating such a theory seems to me like a necessary prerequisite for solving the technical AI alignment problem. Therefore, once we had created a candidate theory of metaphysics (Formal Computational Realism (FCR), formerly known as infra-Bayesian Physicalism), the interpretation of quantum mechanics stood out as a powerful test case. In the work presented in this post, we demonstrated that FCR indeed passes this test (at least to a first approximation).
What is so confusing about quantum mechanics? To understand this, let's take a look at a few of the most popular pre-existing interpretations.
The Copenhagen Interpretation (CI) proposes a mathematical rule for computing the probabilities of observation sequences, via postulating the collapse of the wavefunction. For every observation, you can apply the Born Rule to compute the probabilities of different results, and once a result is selected, the wavefunction is "collapsed" by projecting it to the corresponding eigenspace.
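The Born-rule-then-collapse procedure described above can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: the observable is taken to be diagonal in the computational basis, and the `eigenspaces` encoding (outcome mapped to a list of basis indices) is a hypothetical convenience, not something from the post.

```python
import math

def born_probabilities(state, eigenspaces):
    """Born rule for an observable that is diagonal in the computational basis.

    state: list of complex amplitudes, assumed normalized.
    eigenspaces: dict mapping each outcome to the basis indices spanning
    its eigenspace (a hypothetical encoding chosen for this sketch).
    """
    return {
        outcome: sum(abs(state[i]) ** 2 for i in indices)
        for outcome, indices in eigenspaces.items()
    }

def collapse(state, eigenspaces, outcome):
    """Project the state onto the eigenspace of `outcome` and renormalize."""
    keep = set(eigenspaces[outcome])
    projected = [amp if i in keep else 0j for i, amp in enumerate(state)]
    norm = math.sqrt(sum(abs(amp) ** 2 for amp in projected))
    if norm == 0:
        raise ValueError("outcome has zero probability")
    return [amp / norm for amp in projected]

# Two-qubit state (|00> + |01> + |10>) / sqrt(3); we measure the first qubit.
amplitude = 1 / math.sqrt(3)
state = [amplitude, amplitude, amplitude, 0j]
first_qubit = {0: [0, 1], 1: [2, 3]}

probs = born_probabilities(state, first_qubit)   # outcome 0 has probability 2/3
post = collapse(state, first_qubit, 0)           # (|00> + |01>) / sqrt(2)
```

Repeating these two steps for each observation in turn yields the probability of an observation sequence, which is all that CI provides.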
CI seems satisfactory to a logical positivist: if all we need from a physical theory is computing the probabilities of observations, we have it. However, this is unsatisfactory for a decision-making agent if the agent's utility function depends on something other than its direct observations. For such an agent, CI offers no well-defined way to compute expected utility. Moreover, while normally decoherence ensures that the observations of all agents are in some sense "consistent", it is in principle possible to create a situation in which decoherence fails and CI prescribes contradictory beliefs to different agents (as in the Wigner's friend thought experiment).
In CI, the wavefunction is merely a book-keeping device with no deep meaning of its own. In contrast, the Many Worlds Interpretation (MWI) takes a realist metaphysical stance, postulating that the wavefunction describes the objective physical state of the universe. This, in principle, admits meaningful unobservable quantities on which the values of agents can depend. However, the MWI has no mathematical rule for computing probabilities of observation sequences. If all "worlds" exist at the same time, there's no obvious reason to expect to see one of them rather than another. MWI proponents address this by handwaving into existence some "degree of reality" that some worlds possess more than others. However, the fundamental fact remains that there is no well-defined prescription for probabilities of observation sequences, unless we copy the prescription of CI. The latter, however, is inconsistent with the intent of MWI in cases when decoherence fails, such as Wigner's friend.
In principle, we can defend an MWI-based decision-theory in which the utility function is a self-adjoint operator on the Hilbert space and we are maximizing its expectation in the usual quantum-mechanical sense. Such a decision-theory can avoid the need for a well-defined probability distribution over observation sequences. However, it would leave us with an "ontological crisis": if our agent did not start out knowing quantum mechanics, how would it translate its values into this quantum mechanical form?[1]
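For concreteness, the decision rule sketched in this paragraph can be written as follows. The notation is an assumption introduced here for illustration: $\hat{U}$ is the self-adjoint utility operator, $\pi$ ranges over the agent's policies, and $\psi_\pi$ is the universal state that results when the agent follows $\pi$.

```latex
\mathbb{E}[U \mid \pi] = \langle \psi_\pi | \hat{U} | \psi_\pi \rangle,
\qquad
\pi^{*} \in \operatorname*{arg\,max}_{\pi} \, \langle \psi_\pi | \hat{U} | \psi_\pi \rangle
```

Nothing in this rule refers to a distribution over observation sequences, which is why it sidesteps that problem while leaving open how $\hat{U}$ is obtained in the first place.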
The De Broglie-Bohm Interpretation (DBBI) proposes that in addition to the wavefunction, we should also postulate a classical trajectory following a time-evolution law that depends on the wavefunction. This results in a realist theory with a well-defined distribution over observation sequences. However, it comes with two major issues:
In my view, the real source of all the confusion is the lack of rigorous metaphysics: we didn't know (prior to this line of research), in full generality, what the type signature of a physical theory should be and how we should evaluate such a theory.
Enter Formal Computational Realism (FCR). According to FCR, the fundamental ontology in which all beliefs and values about the world should be expressed is computable logical facts plus the computational information content of the universe. The universe can contain information about computations (e.g., if someone calculated the 700th digit of pi, then the universe contains this information), and fundamentally this information is all there is[2]. Moreover, given an algorithmic description of a physical theory plus the epistemic state of an agent in relation to computable logical facts, it is possible to formally specify the computational information content that this physical theory implies, from the perspective of the agent. The latter operation is called the "bridge transform".
To apply this to quantum mechanics, we need to choose a particular algorithmic description. The choice we settled on is fairly natural: We imagine all possible quantum observables as having marginal distributions that obey the Born rule, with the joint distribution being otherwise completely ambiguous (in the sense that imprecise probability allows distributions to be ambiguous, i.e. we have "Knightian uncertainty" about it). The latter is a natural choice, because quantum mechanics has no prescription for the joint distribution of noncommuting observables. The combined values of all observables is the "state" that the physical theory computes, and the agent's policy is treated as an unknown logical fact on which the computation depends.
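A toy classical illustration may help show what "fixed marginals, completely ambiguous joint" means. The sketch below is not the bridge transform and not the construction used in the post. It only assumes two binary variables `A` and `B` with known marginals and computes the range of probabilities that the event `A == B` can take over all joint distributions compatible with those marginals. All function names are hypothetical.

```python
def joint_bounds(p, q):
    """Range of t = P(A=1, B=1) compatible with marginals P(A=1)=p and P(B=1)=q."""
    return max(0.0, p + q - 1.0), min(p, q)

def agreement_probability_range(p, q):
    """Lower and upper probability of the event A == B over all such joints."""
    t_lo, t_hi = joint_bounds(p, q)
    # P(A == B) = P(1,1) + P(0,0) = t + (1 - p - q + t), which increases with t,
    # so the extremes are attained at the endpoints of the allowed range of t.
    return 1 - p - q + 2 * t_lo, 1 - p - q + 2 * t_hi

# With both marginals uniform, the marginals alone say nothing about agreement.
lo, hi = agreement_probability_range(0.5, 0.5)
```

With uniform marginals the answer is the whole interval from 0 to 1: the marginals pin down each variable separately while leaving their correlation entirely open, which is the kind of Knightian uncertainty that imprecise probability is designed to represent.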
Applying the bridge transform to the above operationalization of quantum mechanics, we infer the computational information content of the universe according to quantum mechanics, and then use the latter to extract the probabilities of various agent experiences. What we discover is as follows:
As opposed to most pre-existing interpretations, the resulting formalism has precisely defined decision-theoretic prescriptions for an agent in any "weird" (i.e. not-decohering) situation like e.g. Wigner's friend. This only requires the agent's values to be specified in the FCR ontology (and in particular allows the agent to assign value to their own experiences, in some arbitrary history-dependent way, and/or the experiences of particular other agents).
In conclusion, FCR passed a non-trivial test here. It was not obvious to me that it would: before Gergely figured out the details, I wasn't sure that it's going to work at all. As such, I believe this to be a milestone result. (With some caveats: e.g. it needs to be rechecked for the non-monotonic version of the framework.)
Note that de Blanc's proposal is inapplicable here, since the quantum ontology is not a Markov decision process.
To be clear, this is just a vague informal description; FCR is an actual rigorous mathematical framework.
I'm renaming Infra-Bayesian Physicalism to Formal Computational Realism (FCR), since the latter name is much more in line with the nomenclature in academic philosophy.
AFAICT, the closest pre-existing philosophical views are Ontic Structural Realism (see 1 2) and Floridi's Information Realism. In fact, FCR can be viewed as a rejection of physicalism, since it posits that a physical theory is meaningless unless it's conjoined with beliefs about computable mathematics.
The adjective "formal" is meant to indicate that it's a formal mathematical framework, not just a philosophical position. The previously used adjective "infra-Bayesian" now seems to me potentially confusing: On the one hand, it's true that the framework requires imprecise probability (hence "infra"), on the other hand it's a hybrid of frequentist and Bayesian.
To keep terminology consistent, Physicalist Superimitation should now be called Computational Superimitation (COSI).
I think that the problem is in the way you define the prior. Here is an alternative proposal:
Given a lambda-term , we can interpret it as defining a partial function . This function works by applying to the (appropriately encoded) inputs, beta-reducing, and then interpreting the result as an element of using some reasonable encoding. It's a partial function because the reduction can fail to terminate or the output can violate the expected format.
Given , we define the "corrected" function as follows. (The goal here is to make it monotonic in the last argument, and also ensure that probabilities sum to .) First, we write whenever (i) for all , and (ii) . If there is no such (i.e. when condition i fails) then is undefined. Now, we have two cases:
We can now define the semimeasure by
For , this semimeasure is lower-semicomputable. Conversely, any lower-semicomputable semimeasure is of this form. Mixing these semimeasures according to our prior over lambda terms gives the desired Solomonoff-like prior.
I agree, except that I don't think it's especially misleading. If I live on the 10th floor and someone is dangling a tasty cake two meters outside of my window (and suppose for the sake of the argument that it's offered free of charge), I won't just walk out of the window and fall to my death. This doesn't mean I'm not following my values, it just means I'm actually thinking through the consequences rather than reacting impulsively to every value-laden thing.
...The prototypical example of a prior based on Turing machines is Solomonoff's prior. Someone not familiar with the distinction between Turing-complete and Turing-universal might naively think that a prior based on lambda calculus would be equally powerful. It is not so. Solomonoff's prior guarantees a constant Bayes loss compared to the best computable prior for the job. In contrast, a prior based on lambda calculus can guarantee only a multiplicative loss.
Can you please make this precise?
When I think of "a prior based on lambda calculus", I imagine something like the following. First, we choose some reasonable complexity measure on lambda terms, such as:
Denote the set of lambda-terms by . We then choose s.t. .
Now, we choose some reasonable way to describe lower-semicomputable semimeasures using lambda terms, and make the prior probabilities of different lambda terms proportional to . It seems to me that the resulting semimeasure dominates every lower-semicomputable semimeasure and is arguably "as good as" the Solomonoff prior. What am I missing?
Contemporary AI is smart in some ways and dumb in other ways. It's a useful tool that you should integrate into your workflow if you don't want to miss out on productivity. However, I'm worried that exposure to AI is dangerous in similar ways to how exposure to social media is dangerous, only more. You're interacting with something designed to hijack your attention and addict you. Only this time the "something" has its own intelligence that is working towards this purpose (and possibly other, unknown, purposes).
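The link between dominance and an additive loss bound can be checked numerically on a toy class. This is a minimal sketch under assumptions: it uses three fixed-bias Bernoulli predictors (hypothetical stand-ins for semimeasures described by lambda terms) and arbitrary prior weights. It verifies the standard fact that a mixture assigning weight w to a component loses at most log(1/w) more than that component in cumulative log loss.

```python
import math

def log_loss(predict_one, bits):
    """Cumulative log loss (in nats) of a predictor on a bit sequence.

    predict_one(history) returns the predicted probability that the next bit is 1.
    """
    total = 0.0
    for n, bit in enumerate(bits):
        p = predict_one(bits[:n])
        total -= math.log(p if bit == 1 else 1.0 - p)
    return total

def mixture_log_loss(experts, weights, bits):
    """Log loss of the Bayes mixture xi(x) = sum_i w_i * mu_i(x) on the sequence x."""
    # Probability that each expert assigns to the whole sequence.
    sequence_probs = [math.exp(-log_loss(expert, bits)) for expert in experts]
    return -math.log(sum(w * p for w, p in zip(weights, sequence_probs)))

# Hypothetical experts: three fixed-bias Bernoulli predictors.
experts = [lambda history, b=b: b for b in (0.2, 0.5, 0.8)]
weights = [0.25, 0.5, 0.25]
bits = [1, 1, 0, 1, 1, 1, 0, 1]

mix = mixture_log_loss(experts, weights, bits)
regrets = [mix - log_loss(expert, bits) for expert in experts]
bounds = [math.log(1.0 / w) for w in weights]  # regret_i <= log(1 / w_i)
```

The bound follows directly from xi(x) >= w_i * mu_i(x) and does not depend on the length of the sequence, so the extra loss relative to any component is a constant determined only by that component's prior weight.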
As to the AI safety space: we've been saying for decades that AI is dangerous and now you're surprised that we think AI is dangerous? I don't think it's taking over the world just yet, but that doesn't mean there are no smaller-scale risks. It's dangerous not because it's dumb (the fact it's still dumb is the saving grace) but precisely because it's smart.
My own approach is: use AI in clear, compartmentalized ways. If you have a particular task which you know can be done faster by using AI in a particular way, by all means, use it. (But, do pay attention to time wasted on tweaking the prompt etc.) Naturally, you should also occasionally keep experimenting with new tasks or new ways of using it. But, if there's no clear benefit, don't use it. If it's just to amuse yourself, don't. And, avoid exposing other people if there's no good reason.
This frame seems useful, but might obscure some nuance:
I mostly agree with this, the part which feels off is
I’d like to say here “screw memetic egregores, follow the actual values of actual humans”
Humans already follow their actual Values[1], and always will, because their Values are the reason they do anything at all. They also construct narratives about themselves that involve Goodness, and sometimes deny the distinction between Goodness and Values altogether. This act of (self-)deception is in itself motivated by the Values, at least instrumentally.
I do have a version of the “screw memetic egregores” attitude, which is, stop self-deceiving. Because, deception distorts epistemics, and we cannot afford distorted epistemics right now. It's not necessarily correct advice for everyone, but I believe it's correct advice for everyone who is seriously trying to save the world, at least.
Another nuance is that, in addition to empathy and naive tit-for-tat, there is also acausal tit-for-tat. This further pushes the Value-recommended strategy in the direction of something Goodness-like (in certain respects), even though ofc it doesn't coincide with the Goodness of any particular culture in any particular historical period.
As Steven Byrnes wrote, "values" might be not the best term, but I will keep it here.
This post discusses an important point: it is impossible to be simultaneously perfectly priorist ("updateless") and learn. Learning requires eventually "passing to" something like a posterior, which is inconsistent with forever maintaining "entanglement" with a counterfactual world. This is somewhat similar to the problem of traps (irreversible transitions): being prudent about risking traps requires relying on your prior, which prevents you from learning about every conceivable opportunity.
My own position on this cluster of questions is that you should be priorist/(infra-)Bayesian about physics but postist/learner/frequentist about logic. This idea is formally embodied in the no-regret criterion for Formal Computational Realism. I believe that this no-regret condition implies something like the OP's "Eventual Learning", but formally demonstrating it is future work.