Oracle Induction Proofs

Diffractor

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Notation:

$\{0, 1\}^{L}$ and $\{0, 1\}^{*}$ are the set of all bitstrings of length $L$ and the set of all finite bitstrings, respectively. Elements of these sets are denoted by $x$ . The length $i$ prefix of $x$ is denoted by $x_{: i}$ .

$B_{L}$ is the set of all satisfiable booleans with all variables having an index $\leq L$ , and $B$ is the set of all satisfiable booleans. Similarly, $B_{* L}$ and $B_{*}$ are the set of all booleans with all variables having an index $\leq L$ , and the set of all booleans. Elements of these sets are denoted by $B$ . $B \land B^{'}$ is also a boolean. Bitstrings $x$ may be interpreted as boolean constraints saying that the prefix of the bitstring must equal $x$ .

$f$ is a function $N^{+} \to N^{+}$ that is time-constructible, monotonically increasing, and $f (t) > t$ , which gives the runtime bound for traders.

$l$ is some function $N^{+} \to N^{+}$ upper-bounded by $2^{f (t)}$ that is time-constructible, monotonically increasing, and satisfies $l (t) > t$ . It gives the most distant bit the oracle inductor thinks about on turn $t$ .

$d$ is the function $N^{+} \to N^{+}$ given by $d (t) = l (t) + ⌈ {log}_{2} l (t) ⌉ + 2 t + 7$ , giving the number of iterations of binary-search deployed on turn $t$ .

$δ$ is the function $N^{+} \to [0, 1] \cap Q$ given by $δ (t) = 2^{- (t + 3)}$ . It gives the proportion of the inductor distribution that is made up by the uniform distribution.

The operations $query(A,p)$ and $flip(p)$ take unit time to execute in our model of computation, though the input still has to be written down in the usual amount of time.

Given some arbitrary distribution $Δ$ over $\{0, 1\}^{L}$ , and a $B \in B_{* L}$ , $Δ (B)$ is the probability mass assigned to bitstrings that fulfill $B$ . Given a $B \in B_{L}$ , $Δ_{B}$ is the conditional distribution given by $Δ_{B} (x) = \frac{Δ (B \land x)}{Δ (B)}$ . $Δ_{B}^{*}$ is the distribution induced by $Approx(Δ,L,D,B)$ . There is an implicit dependence on $D$ which will be suppressed in the notation.

Lemma 1 (n-Lemma): Let $ϵ = 2^{- D}$ . If $Δ$ is a distribution over $\{0, 1\}^{L}$ for which $\forall x \in \{0, 1\}^{L} : Δ (x) \geq n ϵ$ , then

$\forall B \in B_{L}, x \in \{0, 1\}^{L} : {(\frac{n - 1}{n + 1})}^{L} Δ_{B} (x) \leq Δ_{B}^{*} (x) \leq {(\frac{n + 1}{n - 1})}^{L} Δ_{B} (x)$ .

To begin with, if $x$ is incompatible with $B$ , $Δ_{B}^{*} (x) = 0$ by the construction of $Approx$ , and the inequality is trivial. So we can consider the case where $x$ is compatible with $B$ . Given $σ$ as a strict prefix of $x$ , abbreviate $NextBit(Δ,L,D, | σ | + 1,B, σ)$ as $N B (σ)$ .

Because we're using a binary-search depth of $D$ on each bit, and taking the average of the interval, the probabilities are approximated to within $\frac{ϵ}{2}$ .

$\frac{Δ (B \land σ 1) - ϵ}{Δ (B \land σ) + ϵ} < \frac{Δ (B \land σ 1) - \frac{ϵ}{2}}{Δ (B \land σ) + \frac{ϵ}{2}} \leq P (N B (σ) = 1) \leq \frac{Δ (B \land σ 1) + \frac{ϵ}{2}}{Δ (B \land σ) - \frac{ϵ}{2}} < \frac{Δ (B \land σ 1) + ϵ}{Δ (B \land σ) - ϵ}$

Taking the middle three terms and subtracting each of them from $1$ , which flips the direction of the inequalities, then rewriting $1 - P (N B (σ) = 1)$ as $P (N B (σ) = 0)$ , rewriting $1$ as $\frac{Δ (B \land σ) \pm \frac{ϵ}{2}}{Δ (B \land σ) \pm \frac{ϵ}{2}}$ , and using $Δ (B \land σ) - Δ (B \land σ 1) = Δ (B \land σ 0)$ for the outer bounds, we get

$\frac{Δ (B \land σ 0) - ϵ}{Δ (B \land σ) + ϵ} < \frac{Δ (B \land σ) - \frac{ϵ}{2}}{Δ (B \land σ) - \frac{ϵ}{2}} - \frac{Δ (B \land σ 1) + \frac{ϵ}{2}}{Δ (B \land σ) - \frac{ϵ}{2}} \leq P (N B (σ) = 0) \leq \frac{Δ (B \land σ) + \frac{ϵ}{2}}{Δ (B \land σ) + \frac{ϵ}{2}} - \frac{Δ (B \land σ 1) - \frac{ϵ}{2}}{Δ (B \land σ) + \frac{ϵ}{2}} < \frac{Δ (B \land σ 0) + ϵ}{Δ (B \land σ) - ϵ}$

Now, because $Δ_{B}^{*} (x) = \prod_{i = 1}^{L} P (N B (x_{: i - 1}) = x_{i})$ , using the previous inequalities, we can establish

$\prod_{i = 1}^{L} \frac{Δ (B \land x_{: i}) - ϵ}{Δ (B \land x_{: i - 1}) + ϵ} < Δ_{B}^{*} (x) < \prod_{i = 1}^{L} \frac{Δ (B \land x_{: i}) + ϵ}{Δ (B \land x_{: i - 1}) - ϵ}$

For an individual term in the product of the lower/upper bound ( $\pm$ and $\mp$ will be used, the upper sign is used for the lower bound, the lower sign is used for the upper bound, and $⋛$ will be abused to mean $\geq$ for the lower bound and $\leq$ for the upper bound), we can get the following equality.

$\frac{Δ (B \land x_{: i}) \mp ϵ}{Δ (B \land x_{: i - 1}) \pm ϵ} = \frac{Δ (B \land x_{: i})}{Δ (B \land x_{: i - 1}) \pm ϵ} \mp \frac{ϵ}{Δ (B \land x_{: i - 1}) \pm ϵ} = \frac{Δ (B \land x_{: i})}{Δ (B \land x_{: i - 1})} \frac{Δ (B \land x_{: i - 1})}{Δ (B \land x_{: i - 1}) \pm ϵ} (1 \mp \frac{ϵ}{Δ (B \land x_{: i})})$

Because, by assumption, all bitstrings with nonzero probability, and therefore all prefixes of them, have probability bounded below by $n ϵ$ , we get

$\frac{Δ (B \land x_{: i - 1})}{Δ (B \land x_{: i - 1}) \pm ϵ} ⋛ \frac{n ϵ}{n ϵ \pm ϵ} = \frac{n}{n \pm 1}$ and $1 \mp \frac{ϵ}{Δ (B \land x_{: i})} ⋛ 1 \mp \frac{ϵ}{n ϵ} = 1 \mp \frac{1}{n} = \frac{n \mp 1}{n}$

which can be applied to conclude

$\frac{Δ (B \land x_{: i})}{Δ (B \land x_{: i - 1})} \frac{Δ (B \land x_{: i - 1})}{Δ (B \land x_{: i - 1}) \pm ϵ} (1 \mp \frac{ϵ}{Δ (B \land x_{: i})}) ⋛ \frac{Δ (B \land x_{: i})}{Δ (B \land x_{: i - 1})} \frac{n}{n \pm 1} \frac{n \mp 1}{n} = \frac{Δ (B \land x_{: i})}{Δ (B \land x_{: i - 1})} \frac{n \mp 1}{n \pm 1}$

Therefore, since we have lower-bounded or upper-bounded every term in the product, and $Δ (B \land x_{: L}) = Δ (B \land x) = Δ (x)$ , we can show

$Δ_{B}^{*} (x) ≷ \prod_{i = 1}^{L} \frac{Δ (B \land x_{: i}) \mp ϵ}{Δ (B \land x_{: i - 1}) \pm ϵ} ⋛ \prod_{i = 1}^{L} \frac{Δ (B \land x_{: i})}{Δ (B \land x_{: i - 1})} \frac{n \mp 1}{n \pm 1} = \frac{Δ (x)}{Δ (B)} {(\frac{n \mp 1}{n \pm 1})}^{L} = Δ_{B} (x) {(\frac{n \mp 1}{n \pm 1})}^{L}$

The n-Lemma is thus proven.
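As a sanity check on the per-term bound used in the product, here is a minimal Python sketch with exact rationals. The name `term_bounds_hold` and the values of $ϵ$ and $n$ are illustrative assumptions, not part of the construction; `a` stands in for $Δ (B \land x_{: i})$ and `b` for $Δ (B \land x_{: i - 1})$ , both assumed to be at least $n ϵ$ .

```python
from fractions import Fraction


def term_bounds_hold(a: Fraction, b: Fraction, eps: Fraction, n: int) -> bool:
    """Check both per-term bounds for numerator mass a and denominator mass b.

    The bounds are only claimed when a >= n * eps, b >= n * eps and n > 1.
    """
    ratio = a / b
    lower_ok = (a - eps) / (b + eps) >= ratio * Fraction(n - 1, n + 1)
    upper_ok = (a + eps) / (b - eps) <= ratio * Fraction(n + 1, n - 1)
    return lower_ok and upper_ok


eps = Fraction(1, 2 ** 10)  # illustrative eps = 2^(-D) with D = 10
n = 4                       # illustrative n
# Sweep masses a <= b that are multiples of eps and at least n * eps.
all_hold = all(
    term_bounds_hold(i * eps, j * eps, eps, n)
    for i in range(n, 40)
    for j in range(i, 60)
)
print(all_hold)
```

If `a` is allowed to drop below $n ϵ$ the lower bound can fail, which is why the mass floor in the hypothesis matters.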

Lemma 2: The distribution induced by $OI(t)$ has the probability of all bitstrings bounded below by $n ϵ$ .

Identify $D$ with $d (t) = l (t) + ⌈ {log}_{2} l (t) ⌉ + 2 t + 7$ . Identify $L$ with $l (t)$ . Identify $n$ with $l (t) 2^{t + 4}$ .

Because $ϵ = 2^{- D}$ , $ϵ = 2^{- (l (t) + ⌈ {log}_{2} l (t) ⌉ + 2 t + 7)} \leq \frac{1}{l (t)} 2^{- (l (t) + 2 t + 7)}$ . Due to $δ (t)$ of the distribution being composed of the uniform distribution, the probability of all $x$ is bounded below by

$δ (t) 2^{- L} = 2^{- (t + 3)} 2^{- l (t)} = l (t) 2^{t + 4} \frac{1}{l (t)} 2^{- (l (t) + 2 t + 7)} \geq n ϵ$

And Lemma 2 is proved.
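The arithmetic in Lemma 2 can be checked with exact rationals. The sketch below assumes a hypothetical $l (t) = t + 1$ (any admissible $l$ would do) and uses the definitions of $d$ , $δ$ and $n$ given above; the function names are ours.

```python
from fractions import Fraction


def l(t: int) -> int:
    # Hypothetical choice of l, only for illustration: l(t) > t and increasing.
    return t + 1


def d(t: int) -> int:
    # d(t) = l(t) + ceil(log2 l(t)) + 2t + 7; (m - 1).bit_length() is
    # ceil(log2 m) for every integer m >= 1.
    return l(t) + (l(t) - 1).bit_length() + 2 * t + 7


def delta(t: int) -> Fraction:
    # Share of the inductor distribution that is uniform.
    return Fraction(1, 2 ** (t + 3))


def lemma2_holds(t: int) -> bool:
    eps = Fraction(1, 2 ** d(t))
    n = l(t) * 2 ** (t + 4)
    uniform_floor = delta(t) * Fraction(1, 2 ** l(t))  # delta(t) * 2^(-L)
    return uniform_floor >= n * eps


print(all(lemma2_holds(t) for t in range(1, 40)))
```

With this choice of $l$ the bound is tight at $t = 1$ , where both sides equal $2^{- 6}$ .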

From here on, we will frequently be going from the approximation of a conditional distribution to a conditional distribution via the n-Lemma and Lemma 2, so let $ϵ_{t} = 2^{- d (t)}$ , and $n_{t} = l (t) 2^{t + 4}$ , and $L_{t} = l (t)$ . $Δ^{i}$ and $Δ_{B}^{i}$ will be used to refer to the probability distribution and conditional probability distribution over bitstrings that $OI(i)$ produces, and $Δ^{* i}$ and $Δ_{B}^{* i}$ refer to the approximation of that probability distribution with $D = d (t)$ .

Lemma 3: The assessed value at time $t$ in world $x$ of a trader $T$ with a runtime and oracle-call filter attached is lower-bounded by

$- t + \sum_{1 \leq i \leq t} \frac{max ({(\frac{n_{t} - 1}{n_{t} + 1})}^{L_{t}} \sum_{B \in B_{L_{t}}} (P (T^{i} = B) Δ_{B}^{i} (x)) - ϵ_{t}, 0)}{Δ^{i} (x) + ϵ_{t}}$ and upper-bounded by

$- t + {(\frac{n_{t} + 1}{n_{t} - 1})}^{L_{t}} \sum_{1 \leq i \leq t} \frac{\sum_{B \in B_{L_{t}}} P (T^{i} = B) Δ_{B}^{i} (x)}{Δ^{i} (x)}$ .

To begin, the assessed value at time $t$ in world $x$ of a trader $T$ by $OverBudget$ is $- t + \sum_{1 \leq i \leq t} \frac{(\sum_{B \in B_{L_{t}}} (P (T^{i} = B) Δ_{B}^{* i} (x)))^{*}}{Δ^{* i} (x)}$ .

$Δ^{* i} (x)$ has $d (t)$ rounds of binary search run on it, which yields an interval with a width of $ϵ_{t}$ , and an upper-bound estimate is taken, so $Δ^{i} (x) \leq Δ^{* i} (x) \leq Δ^{i} (x) + ϵ_{t}$ . Similarly, the net value of a trade has a lower-bound estimate run on it, so

$\sum_{B \in B_{L_{t}}} (P (T^{i} = B) Δ_{B}^{* i} (x)) - ϵ_{t} \leq (\sum_{B \in B_{L_{t}}} P (T^{i} = B) Δ_{B}^{* i} (x))^{*} \leq \sum_{B \in B_{L_{t}}} P (T^{i} = B) Δ_{B}^{* i} (x)$

Further, by the n-Lemma and Lemma 2 (probability of x is either $0$ or over $n_{t} ϵ_{t}$ ), we can replace $Δ_{B}^{* i} (x)$ by ${(\frac{n_{t} + 1}{n_{t} - 1})}^{L_{i}} Δ_{B}^{i} (x)$ or ${(\frac{n_{t} - 1}{n_{t} + 1})}^{L_{i}} Δ_{B}^{i} (x)$ respectively, which are then upper and lower-bounded by ${(\frac{n_{t} + 1}{n_{t} - 1})}^{L_{t}} Δ_{B}^{i} (x)$ and ${(\frac{n_{t} - 1}{n_{t} + 1})}^{L_{t}} Δ_{B}^{i} (x)$ , which, because the fraction doesn't depend on $i$ , can be pulled out of the sum, yielding the stated bounds.

In the next theorem, we will be abbreviating $\sum_{B \in B_{l (i)}} P (T^{i} = B) Δ_{B}^{i} (x)$ as $a_{i}$ and abbreviating $Δ^{i} (x)$ as $c_{i}$ , so the value of a trader at timestep $t$ in world $x$ against the sequence of $OI$ distributions is $- t + \sum_{1 \leq i \leq t} \frac{a_{i}}{c_{i}}$ .

Theorem 1: If a trader exploits the market, there is a budgeted version of that trader that exploits the market.

As before, the net value of a trader at timestep $t$ in world $x$ against the sequence of $OI$ distributions is $- t + \sum_{1 \leq i \leq t} \frac{a_{i}}{c_{i}}$ . Because the trader exploits the market, there is some integer constant $b$ such that the net value is $\geq - b$ for all $t$ and all $x$ plausible at $t$ , so for all $t$ and all $x$ plausible at $t$ , $\sum_{1 \leq i \leq t} \frac{a_{i}}{c_{i}} \geq t - b$ . Call this sum the gain of the trader. Separate the sum into "good" days where $a_{i} \geq t ϵ_{t}$ , and "bad" days where this is false, and use Lemma 2 to get

$\sum_{g o o d} \frac{a_{i}}{c_{i}} \geq t - b - \sum_{b a d} \frac{a_{i}}{c_{i}} > t - b - t \frac{t ϵ_{t}}{n_{t} ϵ_{t}} \geq t - b - t^{2} 2^{- (t + 4)} > t - b - \frac{5}{64}$
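The last step relies on $t^{2} 2^{- (t + 4)}$ staying below $\frac{5}{64}$ for every positive integer $t$ . A short check (the function name is ours, and the finite range is a stand-in for all $t$ , since the quantity is decreasing from $t = 3$ onward):

```python
from fractions import Fraction


def bad_day_loss_bound(t: int) -> Fraction:
    # Upper bound t^2 * 2^(-(t + 4)) on the total gain lost to "bad" days.
    return Fraction(t * t, 2 ** (t + 4))


# The ratio of consecutive values is (t + 1)^2 / (2 t^2) < 1 for t >= 3,
# so scanning a finite range finds the global maximum.
worst = max(bad_day_loss_bound(t) for t in range(1, 200))
print(worst, worst < Fraction(5, 64))
```

The maximum is attained at $t = 3$ .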

Abbreviate ${(\frac{n_{t} - 1}{n_{t} + 1})}^{L_{t}}$ as $y_{t}$ , and abbreviate ${(\frac{n_{t} - 1}{n_{t} + 1})}^{L_{t} + 1}$ as $z_{t}$ . Using Lemma 3, the worst-case assessed gain of the trader is $\geq \sum_{1 \leq i \leq t} \frac{max (y_{t} a_{i} - ϵ_{t}, 0)}{c_{i} + ϵ_{t}} \geq \sum_{g o o d} \frac{y_{t} a_{i} - ϵ_{t}}{c_{i} + ϵ_{t}}$ . Focusing on an individual day, and using the fact that $c_{i} \geq n_{t} ϵ_{t}$ , and it's a good day so $a_{i} \geq t ϵ_{t}$ , we get

$\frac{y_{t} a_{i} - ϵ_{t}}{c_{i} + ϵ_{t}} = \frac{c_{i}}{c_{i} + ϵ_{t}} \frac{a_{i}}{c_{i}} (y_{t} - \frac{ϵ_{t}}{a_{i}}) > \frac{n_{t}}{n_{t} + 1} (y_{t} - \frac{1}{t}) \frac{a_{i}}{c_{i}} > (\frac{n_{t} - 1}{n_{t} + 1} y_{t} - \frac{1}{t}) \frac{a_{i}}{c_{i}} = (z_{t} - \frac{1}{t}) \frac{a_{i}}{c_{i}}$

Substituting this into the sum and pulling out terms that don't depend on $i$ , we get

$\sum_{g o o d} \frac{y_{t} a_{i} - ϵ_{t}}{c_{i} + ϵ_{t}} > (z_{t} - \frac{1}{t}) \sum_{g o o d} \frac{a_{i}}{c_{i}} > (z_{t} - \frac{1}{t}) (t - b - \frac{5}{64}) > t z_{t} - b - \frac{5}{64} - 1$

The last step was done by $z_{t} < 1$ and ignoring some positive terms. Now we will bound $t z_{t}$ .

$(L_{t} + 1) \int_{n_{t} - 1}^{n_{t} + 1} \frac{d x}{x} < \frac{2 (L_{t} + 1)}{n_{t} - 1} \leq \frac{8 L_{t}}{n_{t}} = 2^{- (t + 1)} \leq \frac{1}{4 t} < \int_{t - 1 / 4}^{t} \frac{d x}{x}$

Multiplying both sides by $- 1$ , which flips the upper and lower bounds of integration, we get

$ln ({(\frac{n_{t} - 1}{n_{t} + 1})}^{L_{t} + 1}) = (L_{t} + 1) (ln (n_{t} - 1) - ln (n_{t} + 1)) = (L_{t} + 1) \int_{n_{t} + 1}^{n_{t} - 1} \frac{d x}{x} > \int_{t}^{t - 1 / 4} \frac{d x}{x} = ln (t - 1 / 4) - ln (t) = ln (\frac{t - 1 / 4}{t}) = ln (1 - \frac{1 / 4}{t})$

By putting both sides in an exponent, this inequality is equivalent to ${(\frac{n_{t} - 1}{n_{t} + 1})}^{L_{t} + 1} > 1 - \frac{1 / 4}{t}$ . Multiplying both sides by $t$ and using the definition of $z_{t}$ , we get $t z_{t} > t - \frac{1}{4}$ .
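The bound $t z_{t} > t - \frac{1}{4}$ can be spot-checked numerically. This is a sketch under the assumption of a hypothetical $l (t) = t + 1$ ; the proof above covers every admissible $l$ .

```python
from fractions import Fraction


def l(t: int) -> int:
    # Hypothetical choice of l, only for illustration.
    return t + 1


def z(t: int) -> Fraction:
    # z_t = ((n_t - 1) / (n_t + 1))^(L_t + 1) with n_t = l(t) * 2^(t + 4).
    n = l(t) * 2 ** (t + 4)
    return Fraction(n - 1, n + 1) ** (l(t) + 1)


holds = all(t * z(t) > t - Fraction(1, 4) for t in range(1, 30))
print(holds)
```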

Using the previous inequalities, we get that the worst-case assessed gain of the trader is greater than $t - b - \frac{1}{4} - \frac{5}{64} - 1 = t - b - \frac{85}{64}$ . Therefore, the worst-case assessed value of the trader is greater than $- b - \frac{85}{64}$ , so when the budget is $b + 3$ , $OverBudget$ will at worst evaluate the value as $- b - \frac{85}{64} > - (b + 3) + 1$ , so it will never return $1$ and interfere with the trade, so the budgeted trader makes the same trades as the original trader and exploits the market.

Lemma 4: The worst-case value of a trader with a budget of $b$ is $\geq - b - 1 / 4$ .

Assume the value of a trader equals $- b$ for some $b \in R^{\geq 0}$ on some turn $t$ . Then the gain of the trader, $\sum_{1 \leq i \leq t} \frac{a_{i}}{c_{i}}$ is $\leq t - b$ . By Lemma 3, the assessed gain is less than

${(\frac{n_{t} + 1}{n_{t} - 1})}^{L_{t}} \sum_{1 \leq i \leq t} \frac{a_{i}}{c_{i}} = {(\frac{n_{t} + 1}{n_{t} - 1})}^{L_{t}} (t - b) < t {(\frac{n_{t} + 1}{n_{t} - 1})}^{L_{t}} - b$

Now we will bound the last term.

$ln ({(\frac{n_{t} + 1}{n_{t} - 1})}^{L_{t}}) = L_{t} (ln (n_{t} + 1) - ln (n_{t} - 1)) = L_{t} \int_{n_{t} - 1}^{n_{t} + 1} \frac{d x}{x} < \frac{2 L_{t}}{n_{t} - 1} \leq \frac{4 L_{t}}{n_{t}} = 2^{- (t + 2)} \leq \frac{1 / 4}{t + 1} < \frac{1 / 4}{t + 1 / 4} < \int_{t}^{t + 1 / 4} \frac{d x}{x} = ln (t + 1 / 4) - ln (t) = ln (\frac{t + 1 / 4}{t}) = ln (1 + \frac{1 / 4}{t})$

By putting both sides in an exponent, we get the equivalent inequality ${(\frac{n_{t} + 1}{n_{t} - 1})}^{L_{t}} < 1 + \frac{1 / 4}{t}$ and multiplying both sides by $t$ yields $t {(\frac{n_{t} + 1}{n_{t} - 1})}^{L_{t}} < t + 1 / 4$ , so the assessed gain is less than $t - b + 1 / 4$ and the assessed value is $\leq - b + 1 / 4$ .

Then, if the trader has a budget of $b$ (an integer), the worst-case scenario is that on some turn t it has a true value of $- b + 3 / 4$ and it is assessed to have a value of $- b + 1$ (maximum possible), so it barely passes the budgeting filter, and then loses $1$ dollar on the next turn, getting a final value of $- b - 1 / 4$ (after which it only outputs $B_{⊤}$ which has a value of $0$ ).

Theorem 2: If a budgeted trader exploits the market, the supertrader exploits the market.

$P_{i} (A \land b) = P_{i} (A) P_{i} (b)$ is the probability of the supertrader drawing a bitstring $a$ which encodes the algorithm $A$ and budget $b$ on timestep $i$ . $P (A_{b}^{i} = B)$ is the probability of $A (i)$ with a runtime and oracle call filter and a budget filter of $b$ outputting the boolean $B$ . $P (A)$ is the probability of a randomly selected infinite string encoding $A$ , and $P (b) = 2^{- b}$ . For sufficiently large $i$ , $P_{i} (A) = P (A)$ and $P_{i} (b) = P (b)$ .

A useful thing to note is that, because the bitstring $a$ that is chosen is of length $f (t)$ , and it is only run on the UTM for $f (t)$ steps, its behavior is identical to that of all algorithms that are encoded by a bitstring that has $a$ as a prefix. Therefore, even though the probability of some $A$ being drawn may be $0$ on a timestep, there is some amount of probability mass in the supertrader that might as well have come from $A$ , in the sense that its behavior is indistinguishable from $A$ after the runtime and oracle call filter is applied.

Similarly, because the maximum selectable budget is $f (t)$ , and a trader can lose at most $1$ dollar each turn, for all $A$ and positive integers $c$ , $A_{f (i)}^{i}$ has the same exact behavior as $A_{f (i) + c}^{i}$ , so we can pretend we are dealing with the full distribution over all budgets.

Therefore, for all $i$ , the distribution over satisfiable booleans given by $\sum_{A, b} P_{i} (A \land b) P (A_{b}^{i} = B)$ equals the distribution over satisfiable booleans given by $\sum_{A, b} P (A) P (b) P (A_{b}^{i} = B)$ , and because of this,

$\forall i \forall x \in \{0, 1\}^{L_{i}} : \sum_{A, b} P_{i} (A \land b) \sum_{B \in B_{L_{i}}} P (A_{b}^{i} = B) Δ_{B}^{i} (x) = \sum_{A, b} P (A) 2^{- b} \sum_{B \in B_{L_{i}}} P (A_{b}^{i} = B) Δ_{B}^{i} (x)$

Also, let $V_{i} (A_{b})$ be the value of $A_{b}$ 's trade on day $i$ according to some fixed $x$ , and let $V_{\leq t} (A_{b})$ be the value that $A_{b}$ accumulates over all days up to $t$ .

The value of the supertrader at time $t$ according to a world $x$ that is plausible at that time is

$\sum_{1 \leq i \leq t} (- 1 + \frac{\sum_{A, b} P_{i} (A \land b) \sum_{B \in B_{L_{i}}} P (A_{b}^{i} = B) Δ_{B}^{i} (x)}{Δ^{i} (x)}) = \sum_{1 \leq i \leq t} \sum_{A, b} P (A) 2^{- b} (- 1 + \frac{\sum_{B \in B_{L_{i}}} P (A_{b}^{i} = B) Δ_{B}^{i} (x)}{Δ^{i} (x)}) = \sum_{1 \leq i \leq t} \sum_{A, b} P (A) 2^{- b} V_{i} (A_{b}) = \sum_{A, b} P (A) 2^{- b} V_{\leq t} (A_{b})$

Now the worst-case value of $V_{\leq t} (A_{b})$ is $- b - \frac{1}{4}$ by Lemma 4, so we get that the value of the supertrader is bounded below by

$\sum_{A} P (A) \sum_{b} 2^{- b} (- b - 1 / 4) = \frac{- 9}{4} \sum_{A} P (A) = \frac{- 9}{4}$ .
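The constant $\frac{- 9}{4}$ comes from $\sum_{b} b 2^{- b} = 2$ and $\sum_{b} 2^{- b} = 1$ . The sketch below assumes budgets range over the positive integers (which is what the stated total implies) and compares partial sums against a closed form we derived for this check.

```python
from fractions import Fraction


def partial_sum(max_budget: int) -> Fraction:
    # Sum over b = 1..max_budget of 2^(-b) * (-b - 1/4).
    return sum(
        Fraction(1, 2 ** b) * (-b - Fraction(1, 4))
        for b in range(1, max_budget + 1)
    )


def closed_form(max_budget: int) -> Fraction:
    # -9/4 + (4N + 9) / (4 * 2^N), which tends to -9/4 from above.
    return Fraction(-9, 4) + Fraction(4 * max_budget + 9, 4 * 2 ** max_budget)


print(partial_sum(1), float(partial_sum(60)))
```

Every partial sum stays strictly above $\frac{- 9}{4}$ , so truncating the budget range only helps the bound.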

Also, if our exploiting trader and the budget which ensures its trade is never altered are $T$ and $b^{'}$ , we get that the value of the supertrader equals

$\sum_{A, b} P (A) 2^{- b} V_{\leq t} (A_{b}) = P (T) 2^{- b^{'}} V_{\leq t} (T_{b^{'}}) + \sum_{(A, b) \neq (T, b^{'})} P (A) 2^{- b} V_{\leq t} (A_{b}) > P (T) 2^{- b^{'}} V_{\leq t} (T_{b^{'}}) - \frac{9}{4}$

Because $V_{\leq t} (T_{b^{'}})$ is unbounded above for appropriate choices of $t$ and $x$ by assumption, it continues to be unbounded above when it is multiplied by the positive constant $P (T) 2^{- b^{'}}$ and has $\frac{9}{4}$ subtracted from it. Therefore, the supertrader has plausible value unbounded above as well.

Theorem 3: The supertrader doesn't exploit $OI$ .

Now, we will show that the maximum value that the supertrader can possibly get in worlds plausible at some turn $t$ is upper-bounded by $2^{- t}$ , so the value of the supertrader is upper-bounded by $1$ .

To begin with, the probability mass the supertrader places on world $x$ at timestep $t$ is $\sum_{A, b} P (A) 2^{- b} \sum_{B \in B} P (A_{b}^{t} = B) Δ_{B} (x)$ . This sum can be rewritten as $\sum_{B \in B} Δ_{B} (x) \sum_{A, b} P (A) 2^{- b} P (A_{b}^{t} = B)$ , and for brevity, from here on, we will abbreviate $\sum_{A, b} P (A) 2^{- b} P (A_{b}^{t} = B)$ as $P_{t} (B)$ . With this abbreviation, we can express the value of the supertrader's trade on day $t$ and world $x$ as $\frac{\sum_{B \in B} P_{t} (B) Δ_{B} (x)}{δ (t) 2^{- L_{t}} + (1 - δ (t)) \sum_{B \in B} P_{t} (B) Δ_{B}^{*} (x)} - 1$ . Note that the numerator of the fraction is the supertrader, which uses the true conditional distribution, while the denominator is what the actual distribution is composed of: a small fragment of the uniform distribution, with the rest composed of a mixture of approximations to the appropriate conditional distributions. To begin the bounding argument, drop the uniform term from the denominator and apply the n-Lemma to yield

$\frac{\sum_{B \in B} P_{t} (B) Δ_{B} (x)}{δ (t) 2^{- L_{t}} + (1 - δ (t)) \sum_{B \in B} P_{t} (B) Δ_{B}^{*} (x)} - 1 < \frac{\sum_{B \in B} P_{t} (B) Δ_{B} (x)}{(1 - δ (t)) {(\frac{n_{t} - 1}{n_{t} + 1})}^{L_{t}} \sum_{B \in B} P_{t} (B) Δ_{B} (x)} - 1 = \frac{{(\frac{n_{t} + 1}{n_{t} - 1})}^{L_{t}}}{1 - δ (t)} - 1$

Also,

$ln ({(\frac{n_{t} + 1}{n_{t} - 1})}^{L_{t}}) - ln (1 - δ (t)) = L_{t} (ln (n_{t} + 1) - ln (n_{t} - 1)) - ln (1 - δ (t)) + ln (1) = L_{t} \int_{n_{t} - 1}^{n_{t} + 1} \frac{d x}{x} + \int_{1 - δ (t)}^{1} \frac{d x}{x} < \frac{2 L_{t}}{n_{t} - 1} + \frac{δ (t)}{1 - δ (t)} < \frac{4 L_{t}}{n_{t}} + 2 δ (t) = 2^{- (t + 2)} + 2^{- (t + 2)} = 2^{- (t + 1)} < \frac{2^{- t}}{1 + 2^{- t}} < \int_{1}^{1 + 2^{- t}} \frac{d x}{x} = ln (1 + 2^{- t}) - ln (1) = ln (1 + 2^{- t})$

By putting both sides in an exponent, this inequality is equivalent to $\frac{{(\frac{n_{t} + 1}{n_{t} - 1})}^{L_{t}}}{1 - δ (t)} < 1 + 2^{- t}$ , so $\frac{{(\frac{n_{t} + 1}{n_{t} - 1})}^{L_{t}}}{1 - δ (t)} - 1 < 2^{- t}$ , so the value of the supertrader gained on any given day is bounded above by $2^{- t}$ , so the total value of the supertrader at any time is bounded above by $1$ .
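The per-day bound and the resulting total can be spot-checked with exact rationals. As before, this sketch assumes a hypothetical $l (t) = t + 1$ , and the function names are ours.

```python
from fractions import Fraction


def l(t: int) -> int:
    # Hypothetical choice of l, only for illustration.
    return t + 1


def daily_gain_bound(t: int) -> Fraction:
    # ((n_t + 1) / (n_t - 1))^(L_t) / (1 - delta(t)) - 1, the bound on the
    # supertrader's gain on day t derived above.
    n = l(t) * 2 ** (t + 4)
    delta = Fraction(1, 2 ** (t + 3))
    return Fraction(n + 1, n - 1) ** l(t) / (1 - delta) - 1


per_day_ok = all(daily_gain_bound(t) < Fraction(1, 2 ** t) for t in range(1, 30))
total = sum(daily_gain_bound(t) for t in range(1, 30))
print(per_day_ok, float(total))
```

For this choice of $l$ the accumulated bound stays well below $1$ , consistent with the geometric series $\sum_{t} 2^{- t} = 1$ .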

Runtime Analysis:

The strength of the bounded reflective oracle needed is controlled by the longest runtime of the algorithms that the oracle is queried about. The relevant oracle calls are the invocations of SAT in $NextBit$ , $eval(B',OI(t))$ in the binary search portion of $NextBit$ , $OverBudget$ , the SAT call in $TradeToBool$ , the $eval(B,OI(i))$ calls that the trader may make, and the oracle calls in the binary search portion of $OverBudget$ .

To begin with, the runtime of $eval(B,x)$ is the time needed to flip enough coins to reach the most distant variable index in $B$ (call it $n$ ), plus the time to run the boolean circuit on the padded $x$ itself, which is $O (n \cdot | B |)$ : for each bit of the padded $x$ , do a pass over the boolean to substitute $⊤$ or $⊥$ for that variable, and then evaluate the resulting boolean with all variables filled in.

Similarly, the runtime of SAT is the time to write down $B$ and the binary string corresponding to the probability bound, which has a length of approximately $n$ , for a runtime of $O (n + | B |)$ .

Now we can look at the runtime of $BinSearch$ . It is always called with $D = d (t) = O (l (t)),$ and there are that many iterations of writing down $Δ$ (which, everywhere it's invoked, takes less than $d (t)$ bits) and the average of the two bounds, which, by a suitable binary encoding, takes at most about $d (t)$ bits, so we get a runtime of about $d (t)^{2}$ , which is $O (l (t)^{2})$ .

As for the runtime of $NextBit$ , the length of the boolean at the start is at most $f (t) + O (l (t))$ (the first part is because $B$ was returned by an algorithm that ran for at most $f (t)$ steps, and the second part is the bound on the length of $x$ ). A boolean of this length is written down 3 times, and SAT is invoked twice, and $BinSearch$ is run twice, for a runtime of $O (3 (f (t) + l (t)) + 2 (l (t) + f (t) + l (t)) + 2 l (t)^{2}) = O (f (t) + l (t)^{2})$ . Adding in the time to write in/read $B$ and $x$ from the input just adds another $f (t) + l (t)$ term, which doesn't change the asymptotic runtime.

Now for $Approx$ . There are at most $l (t)$ iterations of $NextBit$ , and we already accounted for the time to write the input to $NextBit$ , so the runtime is $O (f (t) l (t) + l (t)^{3})$ .

$TradeToBool$ reads the input, which takes $O (f (t))$ time, emulates $f (t)$ steps of a Turing machine, which takes $O (f (t) log f (t))$ time, clips the input, which takes $O (f (t) + l (t))$ time (the latter is an upper bound on the time to compute $l (t)$ because $l$ is time-constructible), calls SAT on a boolean of length at most $f (t)$ with a maximum index of $l (t)$ , and writes the boolean, for a runtime of $O (f (t) log f (t) + l (t))$ .

$BTtoBool$ reads the input and writes parts of it into the query, which takes $O (f (t))$ time, writes down the probability bound in $O (l (t))$ time, and runs $TradeToBool$ , for a runtime of $O (f (t) log f (t) + l (t))$ .

$OI$ either writes down $δ (t)$ in $O (t)$ time and generates a string in $O (l (t))$ time, or computes $f (t)$ , $l (t)$ , and $d (t)$ in $O (f (t) + l (t))$ time, generates the trader and budget string in $O (f (t))$ time, and invokes $TradeToBool$ and $Approx$ for a runtime of $O (l (t)^{3} + f (t) l (t) + f (t) log f (t))$ .

Now we can address the runtime of all the algorithms fed into an oracle query but one. The SAT calls in $NextBit$ ask about $eval(B', λ)$ with a runtime of $O (l (t) + f (t))$ , which is sufficiently low. The queries about $eval(B',OI(t))$ in $BinSearch$ have $eval(B',OI(t))$ running in $O (l (t)^{2})$ time (for the evaluation) plus $O (l (t)^{3} + f (t) l (t) + f (t) log f (t))$ time (to calculate $OI(t)$ ), so the runtime is $O (l (t)^{3} + f (t) l (t) + f (t) log f (t))$ which is sufficiently low. The SAT call in $TradeToBool$ asks about $eval(B, λ)$ with a runtime of $O (l (t) + f (t))$ which is sufficiently low. The SAT call the trader makes about $eval(B,OI(i))$ isn't about a longer-running algorithm than the algorithm invoked in $BinSearch$ , so we're good there as well. Finally, the oracle queries in the binary search portion of $OverBudget$ take $O (l (t)^{2})$ time (for the evaluation), plus the runtime of $Approx$ and the runtime of $TradeToBool$ (in the numerator) or the runtime of $OI$ (in the denominator), for a runtime of $O (l (t)^{3} + f (t) l (t) + f (t) log f (t))$ in both cases, which is sufficiently low.

This just leaves establishing the runtime of $OverBudget$ to ensure that the oracle is strong enough for our needs.

Generating $n$ takes $O (t)$ time. Generating the world $x$ takes $O (l (t))$ time. Running the deductive process takes $O (f (t) + l (t))$ time, and so does writing down the resulting boolean, and the maximum index is $l (t)$ so the SAT invocation takes $O (f (t) + l (t))$ time as well, for a runtime so far of $O (f (t) + l (t))$ . That leaves the sum (computing the < isn't harder than computing the sum in the first place). We are carrying out at most $t$ fraction-additions. Each fraction-addition involves three multiplications, one addition, and two iterations of binary search. The length of the numbers is $l (t)$ , but with each multiplication, they get longer by $l (t)$ , so the maximum length of the numbers is $t l (t)$ . Using the standard inefficient multiplication algorithm for $O (t^{2} l (t)^{2})$ runtime per multiplication, and with addition being faster than multiplication, we get $O (t (t^{2} l (t)^{2} + l (t)^{2})) = O (t^{3} l (t)^{2})$ for the whole sum, to get a runtime of $O (t^{3} l (t)^{2} + f (t))$ , which again is sufficiently low to permit oracle calls.

Thus, an oracle strength of $O (l (t)^{3} + t^{3} l (t)^{2} + f (t) l (t) + f (t) log f (t))$ suffices to make all oracle calls directed to algorithms with permissibly low runtime.
