
In part 1, I discussed the falsifiability of hypercomputation in a typed setting where putative oracles may be assumed to return natural numbers. In that setting, there are very powerful forms of hypercomputation (at least as powerful as each level of the arithmetical hierarchy) that are falsifiable.

However, as Vanessa Kosoy points out, this typed setting has difficulty applying to the real world, where agents may only observe a finite number of bits at once:

The problem with constructive halting oracles is, they assume the ability to output an arbitrary natural number. But, realistic agents can observe only a finite number of bits per unit of time. Therefore, there is no way to directly observe a constructive halting oracle. We can consider a realization of a constructive halting oracle in which the oracle outputs a natural number one digit at a time. The problem is, since you don't know how long the number is, a candidate oracle might never stop producing digits. In particular, take any non-standard model of PA and consider an oracle that behaves accordingly. On some machines that don't halt, such an oracle will claim they do halt, but when asked for the time it will produce an infinite stream of digits. There is no way to distinguish such an oracle from the real thing (without assuming axioms beyond PA).

This is an important objection. I will address it in this post by considering only oracles which return Booleans. In this setting, there is a form of hypercomputation that is falsifiable, although this hypercomputation is less powerful than a halting oracle.

Define a binary Turing machine to be a machine that outputs a Boolean (0 or 1) whenever it halts. Each binary Turing machine either halts and outputs 0, halts and outputs 1, or never halts.

Define an arbitration oracle to be a function that takes as input a specification of a binary Turing machine, and always outputs a Boolean in response. This oracle must return 0 if the machine eventually outputs 0, and must return 1 if the machine eventually outputs 1; it may answer arbitrarily if the machine never halts. Note that an arbitration oracle can be emulated using a halting oracle, and is in fact strictly weaker. (This definition is inspired by previous work on reflective oracles.)
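To make the type signature concrete, here is a minimal Python sketch of the interface. The encoding of machines as step-budgeted callables is an illustrative assumption of mine, not something specified in the post:

```python
from typing import Callable, Optional

# A binary Turing machine, encoded (for illustration) as a callable that
# takes a step budget and returns its output bit if it halts within that
# budget, or None if it has not halted yet.
MachineSpec = Callable[[int], Optional[int]]

# An arbitration oracle maps machine specifications to Booleans (0 or 1).
ArbitrationOracle = Callable[[MachineSpec], int]

def contract_violated(oracle: ArbitrationOracle,
                      machine: MachineSpec,
                      budget: int) -> bool:
    """True iff `machine` halts within `budget` steps with a bit that
    differs from the oracle's answer: a finite witness that `oracle`
    is not an arbitration oracle."""
    result = machine(budget)
    return result is not None and result != oracle(machine)
```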

The hypothesis that a putative arbitration oracle (with the correct type signature, MachineSpec → Boolean) really is one is falsifiable. Here is why (a concrete search procedure is sketched after the list):

  1. Suppose that for some binary Turing machine M that halts and returns 1, the oracle O wrongly has O(M) = 0. Then this error can be proven by exhibiting M together with the number of steps M takes to halt.
  2. Likewise if M halts and returns 0 and the oracle O wrongly has O(M) = 1.
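Concretely, the falsifier is a dovetailed search over machines and step budgets, reusing the encoding sketched above; enumerate_machine stands in for a (hypothetical) computable enumeration of all binary Turing machines:

```python
def falsify(oracle: ArbitrationOracle,
            enumerate_machine: Callable[[int], MachineSpec]):
    """Search for a finite witness that `oracle` violates the arbitration
    contract.  Returns (machine index, step budget) if the oracle is wrong
    about some halting machine; runs forever if the oracle is genuine, in
    which case the hypothesis is never falsified."""
    budget = 1
    while True:
        for i in range(budget):
            if contract_violated(oracle, enumerate_machine(i), budget):
                return i, budget
        budget += 1
```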

Since the property of a black box being an arbitration oracle is falsifiable, it remains only to show that there is no computable arbitration oracle. For this proof, assume (for the sake of contradiction) that O is a computable arbitration oracle.

Define a binary Turing machine N() := 1 - O(N). This definition requires quining, which is legitimate by the usual construction (Kleene's recursion theorem). Note that N always halts, because O always halts. Since N halts and returns a Boolean, the arbitration contract forces O(N) = N(). But also N() = 1 - O(N), a contradiction (as O(N) is a Boolean).

Therefore, there is no computable arbitration oracle.
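In the illustrative encoding used above, the diagonal machine can be written with a Python closure standing in for the formal quine:

```python
def diagonal(oracle: ArbitrationOracle) -> MachineSpec:
    """Python analogue of N() := 1 - O(N)."""
    def N(budget: int) -> int:
        # N halts immediately, answering the opposite of what the
        # (assumed computable) oracle says about N itself.
        return 1 - oracle(N)
    return N

# If `oracle` were a computable arbitration oracle, then since N halts,
# the contract would force oracle(N) == N(budget) == 1 - oracle(N),
# which no Boolean satisfies.
```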

Higher hypercomputation?

At this point, it is established that there is a form of hypercomputation (specifically, arbitration oracles) that is falsifiable. But is this a ceiling? That is, could higher forms of hypercomputation be falsifiable in the same setting?

We can note that it's possible to use an arbitration oracle to construct a model of PA (described by a complete consistent theory), one statement at a time (a sketch follows). To do this, first note that for any statement, it is possible to construct a binary Turing machine that returns 1 if the statement is provable, 0 if it is disprovable, and never halts if neither is the case. So we can iterate through all PA statements and use an arbitration oracle to commit to each statement being true or false, on the basis of its provability or disprovability given the previous commitments, in a way that ensures the commitments are never contradictory (as long as PA itself is consistent). This is essentially the same construction idea as in the Demski prior over logical theories.
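Here is a sketch of that completion procedure. The helpers search_machine (which builds the binary machine just described) and negate are hypothetical stand-ins, and sentences is assumed to enumerate all PA sentences:

```python
def complete_pa(arbitration, sentences, search_machine, negate):
    """Extend PA to a complete theory one sentence at a time.
    search_machine(theory, s) returns a binary Turing machine that halts
    with 1 if `theory` (over PA) proves s, halts with 0 if it proves
    negate(s), and never halts otherwise.  Each commitment is safe: on a
    provable s the oracle must answer 1, on a refutable s it must answer
    0, and on an independent s either answer keeps the theory consistent
    (assuming PA itself is consistent)."""
    theory = []  # commitments so far, on top of the PA axioms
    for s in sentences:
        bit = arbitration(search_machine(theory, s))
        theory.append(s if bit == 1 else negate(s))
        yield theory[-1]
```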

Suppose there were some PA-definable property P that a putative oracle O (mapping naturals to Booleans) must have (e.g. the property of being a halting oracle, for some encoding of Turing machines as naturals). Then, conditional on the existence of an oracle with property P being consistent with PA, we can use the above procedure to construct a model of PA + the existence of an O satisfying P (i.e. a theory that says what PA says and also contains a function symbol O that axiomatically satisfies P). For any PA-definable statement about this oracle, the procedure will, at some finite time, have made a commitment about that statement.
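Under the same assumptions, answering any PA-definable query about the emulated oracle is just a matter of scanning the commitment stream until it settles that query; since every sentence in the enumeration is eventually committed one way or the other, this terminates:

```python
def emulated_query(completion, sentence, negate):
    """Answer a query about the emulated oracle O, e.g. the sentence
    'O(n) = 1', by scanning the commitments produced by complete_pa
    until the sentence or its negation has been committed."""
    for commitment in completion:
        if commitment == sentence:
            return 1
        if commitment == negate(sentence):
            return 0
```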

So, access to an arbitration oracle allows emulating any other PA-definable oracle, in a way that will not be falsified by PA. It follows that hypercomputation past the level of arbitration oracles is not falsifiable by a PA-reasoner who can access the oracle, as PA cannot rule out that it is actually looking at something produced by only arbitration-oracle levels of hypercomputation.

Moreover, giving the falsifier access to an arbitration oracle can't increase the range of oracles that are falsifiable. To see this, for any oracle-property P, consider the corresponding property of an oracle pair (which may be represented as a single oracle-property through interleaving; see the sketch below): the first oracle is an arbitration oracle, and the second satisfies P. This pair property is falsifiable iff P is falsifiable by a falsifier with access to an arbitration oracle, because a joint search for falsifications can simultaneously try to prove that the first oracle isn't an arbitration oracle, and try to prove that the second oracle doesn't satisfy P under the assumption that the first is an arbitration oracle. Now, since the pair property is PA-definable, it is emulable by a Turing machine with access to an arbitration oracle, so the pair property is unfalsifiable if it requires hypercomputation past arbitration oracles. But this implies that the original property P is unfalsifiable by a falsifier with access to an arbitration oracle, whenever P requires hypercomputation past arbitration oracles.
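The interleaving step is the standard even/odd packing; here oracles are viewed as maps from naturals (encoding machine specifications or other queries) to Booleans:

```python
def interleave(first, second):
    """Pack an oracle pair into a single oracle on the naturals:
    even inputs query the first oracle, odd inputs the second."""
    return lambda n: first(n // 2) if n % 2 == 0 else second(n // 2)
```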

So arbitration oracles form a ceiling on what can be falsified unassisted, and they are also unable to assist in falsifying higher levels of hypercomputation.

Conclusion

Given that arbitration oracles form a ceiling of computable falsifiability (in the setting considered here, which is distinct from the setting of the previous post), it may or may not be possible to define a logic that allows reasoning about levels of computation up to arbitration oracles, but which does not allow defining computation past arbitration oracles. Such a project could substantially clarify the logical foundations of mathematics, computer science, and the empirical sciences.

[EDIT: cousin_it pointed out that Scott Aaronson's consistent guessing problem is identical to the problem solved by arbitration oracles.]

Comments

Your arbitration oracle seems equivalent to the consistent guessing problem described by Scott Aaronson here. Also see the comment from Andy D proving that it's indeed strictly simpler than the halting problem.

It's not obvious to me that "consistent with PA" is the right standard for falsification though. It seems like simplicity considerations might lead you to adopt a stronger theory, and that this might allow for some weaker probabilistic version of falsification for things beyond arbitration. After all, how did we get induction anyway?

(Do we need induction, or could we think of falsification as being relative to some weaker theory?)

(Maybe this is just advocating for epistemic norms other than falsification though. It seems like the above move would be analogous to saying: the hypothesis that X is a halting oracle is really simple and explains the data, so we'll go with it even though it's not falsifiable.)

Indeed, in hyper-Solomonoff induction the direct hypercomputation hypothesis is probably more likely than the arbitration-oracle-emulating-hypercomputation hypothesis, but only by a constant factor. So this isn't really falsification so much as a shift in Bayesian evidence.

I do think it's theoretically cleaner to distinguish this Bayesian reweighting from Popperian logical falsification and from Neyman-Pearson null hypothesis significance testing (frequentist falsification), both of which in principle require producing an unbounded number of bits of evidence, though in practice they rely on unfalsifiable assumptions (e.g. about memory) to avoid radical skepticism.

It's an interesting coincidence that arbitration is the strongest thing we can falsify, and also apparently the strongest thing that can consistently apply to itself (if we allow probabilistic arbitration). Maybe not a coincidence?

I have a possibly-stupid question about the proof that access to an arbitration oracle can't help the verifier. I understand why a single arbitration oracle A can be used to simulate a pair consisting of an arbitration oracle A' and another oracle satisfying P. But as part of that simulation, the oracle A has access to all the outputs of the oracle A', and can use them to construct a consistent model satisfying P. Isn't this slightly different from the actual setting that A would find itself in (trying to imitate an oracle with property P), where it would be faced with a verifier using an arbitration oracle A'' it has no access to? I have the sense that this couldn't be used to falsify that A is implementing P, since A' and A'' can only differ on the arbitrary answers they give to queries about machines that never halt, but I don't immediately see how to prove this.

If the falsifying Turing-computable agent has access to the oracle A' and a different oracle A'', and A' and A'' give different answers on some Turing machine (which must never halt if A' and A'' are arbitration oracles), then there is some way to prove that by exhibiting this machine.

The thing I'm proving is that, given that the falsifier knows A' is an arbitration oracle, that doesn't help it falsify that oracle B satisfies property P. I'm not considering a case where there are two different putative arbitration oracles. In general it seems hard for two arbitration oracles to be more helpful than one, but I'm not sure exactly how to prove this. Maybe it's possible to use the fact that a single arbitration oracle can find two different arbitration oracles by creating two different models of PA?

The thing I’m proving is that, given that the falsifier knows A’ is an arbitration oracle, that doesn’t help it falsify that oracle B satisfies property P.

Hmm, let me put it this way -- here's how I would (possibly incorrectly) summarize the proof: "For a given oracle O with property P, you can't falsify that O uses only arbitration-oracle levels of hypercomputation, even with access to an arbitration oracle A'. This is because there could be a Turing machine T in the background, with access to an arbitration oracle A, which controls the output of both A' and O, in such a way that A' being an arbitration oracle and O having property P cannot be disproved in PA"

I'm wondering whether we can remove the condition that the background Turing machine T controls the output of A', instead only allowing it to control the output of O, in such a way that "A' being an arbitration oracle and O having property P" cannot be disproved in PA for all possible arbitration oracles A'. This would be a scenario where there are two possibly-distinct arbitration oracles, the oracle A' used by the verifier and the oracle A used by the background Turing machine T.