Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

% operators that are separated from the operand by a space

% operators that require brackets

% operators that require parentheses

% Paper specific

Logical induction is defined in terms of logical sentences and theories, but its principles are applicable in much greater generality and abstraction. Indeed, one such generalization was studied under the name "universal induction." We proposed a slightly different generalization in order to model reasoning with incomplete models. Here, we describe a formalism that includes all these cases and many more, using the language of measure theory. This provides the following advantages:

The formalism is applicable to event spaces substantially different from truth assignments or bit sequences, e.g. we can consider sequences of real numbers.

The formalism treats probabilities and expectations on the same footing, rather than constructing expectations as in section 4.8 of original paper. We consider this more convenient.

In our opinion, this language is more mathematically natural than the original formalism, at least for applications unrelated to formal logic.

On the other hand, we ignore all computational considerations. Obviously these are often important, but in the study of purely information-theoretic questions the use of numerical approximations only serves to obscure.

All proofs are in the Appendix.

##Results

Fix X a compact Polish space. For example, X might be Oω or the space of propositionally consistent truth assignments in some language or [0,1]ω. The role of "pricings" is served by P(X): the space of probability measures on X. A market is thus a sequence {μn∈P(X)}n∈N. The "deductive process" is replaced by a sequence of closed sets X=X0⊇X1⊇X2⊇… A trading strategy is a continuous function τ:P(X)→C(X), were P(X) is equipped with the weak* topology (as before) and C(X) is the space of continuous functions from X to R equipped with the uniform convergence topology. Here, we should think of τ(μ) as the share portfolio acquired by the strategy given pricing μ, where the cost of the acquisition is understood to be Eμ[τ(μ)] while the ultimate value of the portfolio is τ(μ)(x) (in the following we will use the less clumsy notation τ(μ,x); in fact, we can equivalently define T(X) as the set of continuous functions from P(X)×X to R) for some x∈⋂nXn. We denote the set of trading strategies by T(X). A trader is a sequence {Tn:P(X)n→T(X)}n∈N, where the functions can be arbitrary (don't have to be continuous in any sense). The argument of these functions refers to the market pricings on previous days.

Analogously to Lemma 5.1.1 in "Logical Induction" (existence of "market maker"), we have:

#Proposition 1

For any τ∈T(X), there is μ∈P(X) s.t.

Ex∼μ[τ(μ,x)]=maxx∈Xτ(μ,x)

Analogous to Definition 5.2.1 ("budgeter"), we have:

#Proposition 2

Given τ∈T(X), define Wτ:P(X)→C(X) by

Wτ(μ,x):=τ(μ,x)−Ey∼μ[τ(μ,y)]

Fix a trader T. Define ΣWTn:P(X)n→C(X) by

ΣWTn({μm}m<n,x):=∑m<nWTm({μl}l≤m,x)

Define ΣWminTn:P(X)n→R by

ΣWminTn({μm}m<n):=minx∈Xn−1ΣWTn({μm}m<n,x)

(For n=0, the above definition is understood to mean 0)

% operators that are separated from the operand by a space

% operators that require brackets

% operators that require parentheses

% Paper specific

Logical induction is defined in terms of logical sentences and theories, but its principles are applicable in much greater generality and abstraction. Indeed, one such generalization was studied under the name "universal induction." We proposed a slightly different generalization in order to model reasoning with incomplete models. Here, we describe a formalism that includes all these cases and many more, using the language of measure theory. This provides the following advantages:

The formalism is applicable to event spaces substantially different from truth assignments or bit sequences, e.g. we can consider sequences of real numbers.

The formalism treats probabilities and expectations on the same footing, rather than constructing expectations as in section 4.8 of original paper. We consider this more convenient.

In our opinion, this language is more mathematically natural than the original formalism, at least for applications unrelated to formal logic.

On the other hand, we ignore all computational considerations. Obviously these are often important, but in the study of purely information-theoretic questions the use of numerical approximations only serves to obscure.

All proofs are in the Appendix.

##Results

Fix X a compact Polish space. For example, X might be Oω or the space of propositionally consistent truth assignments in some language or [0,1]ω. The role of "pricings" is served by P(X): the space of probability measures on X. A

marketis thus a sequence {μn∈P(X)}n∈N. The "deductive process" is replaced by a sequence of closed sets X=X0⊇X1⊇X2⊇… Atrading strategyis a continuous function τ:P(X)→C(X), were P(X) is equipped with the weak* topology (as before) and C(X) is the space of continuous functions from X to R equipped with the uniform convergence topology. Here, we should think of τ(μ) as the share portfolio acquired by the strategy given pricing μ, where the cost of the acquisition is understood to be Eμ[τ(μ)] while the ultimate value of the portfolio is τ(μ)(x) (in the following we will use the less clumsy notation τ(μ,x); in fact, we can equivalently define T(X) as the set of continuous functions from P(X)×X to R) for some x∈⋂nXn. We denote the set of trading strategies by T(X). Atraderis a sequence {Tn:P(X)n→T(X)}n∈N, where the functions can be arbitrary (don't have to be continuous in any sense). The argument of these functions refers to the market pricings on previous days.Analogously to Lemma 5.1.1 in "Logical Induction" (existence of "market maker"), we have:

#Proposition 1

For any τ∈T(X), there is μ∈P(X) s.t.

Ex∼μ[τ(μ,x)]=maxx∈Xτ(μ,x)

Analogous to Definition 5.2.1 ("budgeter"), we have:

#Proposition 2

Given τ∈T(X), define Wτ:P(X)→C(X) by

Wτ(μ,x):=τ(μ,x)−Ey∼μ[τ(μ,y)]

Fix a trader T. Define ΣWTn:P(X)n→C(X) by

ΣWTn({μm}m<n,x):=∑m<nWTm({μl}l≤m,x)

Define ΣWminTn:P(X)n→R by

ΣWminTn({μm}m<n):=minx∈Xn−1ΣWTn({μm}m<n,x)

(For n=0, the above definition is understood to mean 0)

Fix b>0. Assume n∈N and {μm∈P(X)}m<n are s.t.

ΣWminTn({μm}m<n)>−b

Then, we can define NbTn({μm}m<n):P(X)→R by

NbTn({μm}m≤n):=max(1,maxx∈Xn−WTn({μm}m≤n,x)b+ΣWTn({μm}m<n,x))−1

Finally, define {BbTn:P(X)n×P(X)→C(X)}n∈N by

BbTn({μm}m≤n)={0 if ∃m≤n:ΣWminTm({μl}l<m)≤−b,NbTn({μm}m≤n)⋅Tn({μm}m≤n) otherwise

Then:

BbT is also a trader (i.e. it is continuous in the last argument).

If {μm}m<n is s.t. ∀m≤n:ΣWminTm({μl}l<m)>−b, then ∀m<n:BbTm({μl}l<m)=Tm({μl}l<m).

ΣWminBbTn≥−b

The analogue of Definition 5.3.2 ("trading firm") is as follows:

#Proposition 3

Consider a family of traders {Tk}k∈N and ζ:N×N→[0,1] s.t.

∞∑k=0∞∑b=1ζ(b,k)b=bζ<∞

Define Tζn as follows:

Tζn=n∑k=0∞∑b=1ζ(k,b)BbTkn

Then, the above sum is convergent and defines a trader. Moreover:

ΣWminTζn≥−bζ

The analogue of Definition 5.4.1 (logical inductor construction):

#Proposition 4

Consider the setting of Proposition 3. Then, there are {μ∗n∈P(Xn)}n∈N s.t.

Ex∼μ∗n[Tζn({im∗μ∗m}m≤n,x)]=maxx∈XnTζn({im∗μ∗m}m≤n,x)

Analogously to Definition 3.5.1 ("exploitation") we have:

#Definition

A market μ is said to

dominatea trader T relatively to {Xn} whenDenoting in:Xn→X the inclusion mapping, μn is in the image of in∗.

The following set of real numbers is either unbounded from below or bounded from above:

W(T,μ):={ΣWTn+1({μm}m≤n,x)∣n∈N,x∈Xn}

Finally, the analogue of Theorem 5.4.2:

#Theorem

Consider the setting of Proposition 4, and assume that ∀k,b:ζ(k,b)>0. Then, {in∗μ∗n}n∈N dominates Tk for every k.

##Appendix

#Proposition A.1

Given τ∈T(X), define Eτ:P(X)×P(X)→R by

Eτ(ν,μ):=Eν[τ(μ)]

Then, Eτ is continuous.

#Proof of Proposition A.1

Consider μi→μ and νi→ν. We have

Eνi[τ(μi)]=Eνi[τ(μ)]+Eνi[τ(μi)−τ(μ)]

|Eνi[τ(μi)]−Eνi[τ(μ)]|≤max|τ(μi)−τ(μ)|

By continuity of τ, τ(μi)→τ(μ) and therefore, max|τ(μi)−τ(μ)|→0. We get

limi→∞|Eνi[τ(μi)]−Eνi[τ(μ)]|=0

Since νi→ν and τ(μ) is continuous, we have Eνi[τ(μ)]→Eν[τ(μ)] and therefore

limi→∞Eνi[τ(μi)]=Eν[τ(μ)]

#Proof of Proposition 1

Define K⊆P(X)×P(X) as follows:

K:={(μ,ν)∣Eν[τ(μ)]=maxτ(μ)}

For any μ, denote K(μ):=K∩(μ×P(X)). K(μ) is convex due to linearity of expected value. K(μ) is non-empty because given x∗∈argmaxx∈Xτ(μ,x), δx∈K(μ).

Consider μi→μ and νi→ν s.t. (μi,νi)∈K. We have

Eνi[τ(μi)]=maxτ(μi)

By Proposition A.1, the left hand side converges to Eν[τ(μ)]. Since τ(μi)→τ(μ), the right hand side converges to maxτ(μ). We get:

Eν[τ(μ)]=maxτ(μ)

Therefore, (μ,ν)∈K and K is a closed set. Applying the Kakutani-Glicksberg-Fan theorem, we get the desired result.

#Proof of Proposition 2

Wτ is continuous by Proposition A.1. Therefore, NbT is continuous in the last argument and BbT is also continuous in the last argument.

Assume {μm∈P(X)}m<n is s.t.

∀m≤n:ΣWminTm({μl}l<m)>−b

Then, for any m<n we are in the second case in the definition of BbTm({μl}l<m). Moreover, we have

ΣWminTm+1({μl}l≤m)> −b

∀x∈Xm:ΣWTm+1({μl}l≤m,x)> −b

∀x∈Xm:ΣWTm({μl}l<m,x)+WTm({μl}l≤m,x)> −b

∀x∈Xm:b+ΣWTm({μl}l<m,x)>−WTm({μl}l≤m,x)

Using the assumption again, the left hand side is positive. It follows that

∀x∈Xm+1:1>−WTm({μl}l≤m,x)b+ΣWTm({μl}l<m,x)

NbTm({μl}l≤m)=1

BbTm({μl}l≤m)=Tm({μl}l≤m)

Now, fix any {μm∈P(X)}m<n. Let m0∈N be the largest number s.t. m0≤n and

∀m≤m0:ΣWminTm({μl}l<m)>−b

For any m≤m0, we have

ΣWminBbTm({μl}l<m)=ΣWminTm({μl}l<m)>−b

(Note that the sum in the definition of ΣWminBbTm only involves BbTl for l<m≤m0)

For m=m0+1, we have

ΣWBbTm0+1({μl}l≤m0,x)=ΣWBbTm0({μl}l<m0,x)+WBbTm0({μl}l≤m0,x)

The first term only involves BbTl for l<m0, and we are still in the first case in the definition of the second term, therefore

ΣWBbTm0+1({μl}l≤m0,x)=ΣWTm0({μl}l<m0,x)+NbTm0({μl}l≤m0)WTm0({μl}l≤m0,x)

If x is s.t. WTm0({μl}l≤m0,x)≥0, then

ΣWBbTm0+1({μl}l≤m0,x)≥ΣWTm0({μl}l<m0,x)

ΣWBbTm0+1({μl}l≤m0,x)≥ΣWminTm0({μl}l<m0)>−b

If x is s.t. WTm0({μl}l≤m0,x)<0, then

ΣWBbTm0+1({μl}l≤m0,x)≥ΣWTm0({μl}l<m0,x)+b+ΣWTm0({μl}l<m0,x)−WTm0({μl}l≤m0,x)WTm0({μl}l≤m0,x)

ΣWBbTm0+1({μl}l≤m0)≥−b

We got that ΣWBbTm0+1({μl}l≤m0,x)≥−b for all x∈Xm0, and therefore

ΣWminBbTm0+1({μl}l≤m0)=minx∈Xm0ΣWBbTm0+1({μl}l≤m0,x)≥−b

Finally, consider m>m0+1.

ΣWBbTm({μl}l<m,x)=ΣWBbTm−1({μl}l<m−1,x)+WBbTm−1({μl}l<m,x)

Now we are in the first case in the definition of WBbTm−1, therefore the second term vanishes.

ΣWBbTm({μl}l<m,x)=ΣWBbTm−1({μl}l<m−1,x)

By induction on m, we conclude:

ΣWminBbTm({μl}l<m)≥ΣWminBbTm−1({μl}l<m−1)≥−b

#Proposition A.2

If X,Y are compact Polish spaces and f:X×Y→R is continuous, then F:X→C(Y) defined by F(x)(y):=f(x,y) is continuous.

#Proof of Proposition A.2

We already proved an equivalent proposition: see "Proposition A.1" here.

#Proof of Proposition 3

The definition of Bb implies that

|BbTkn({μm}m≤n,x)|≤|Tkn({μm}m≤n,x)|

Observe that

Z(k):=∞∑b=1ζ(k,b)≤∞∑b=1ζ(k,b)b<∞

As a result, the definition of Tζn is pointwise absolutely convergent:

∞∑b=1ζ(k,b)|BbTkn({μm}m≤n,x)|≤Z(k)|Tkn({μm}m≤n,x)|

Moreover, this series converges uniformly absolutely in μn and x:

∞∑b=b0ζ(k,b)|BbTkn({μm}m≤n,x)|≤maxν∈P(X)y∈X|Tkn({μm}m<n,ν,y)|∞∑b=b0ζ(k,b)→b0→∞0

By the uniform limit theorem, the series defines a continuous function of μn and x. By Proposition A.2, it follows that Tζ is a trader.

Now, let's examine ΣWTζ:

ΣWTζn({μm}m≤n,x)=ΣWn∑k=0∞∑b=1ζ(k,b)BbTkn({μm}m≤n,x)

Since convergence is uniform absolute, the sum commutes with ΣW.

ΣWTζn({μm}m≤n,x)=n∑k=0∞∑b=1ζ(k,b)ΣWBbTkn({μm}m≤n,x)

By Proposition 2:

ΣWTζn({μm}m≤n,x)≥−n∑k=0∞∑b=1ζ(k,b)b=−ζb

#Proof of Proposition 4

We construct μ∗n recursively in n. Define τn∈T(Xn) by

τn(ν,x):=Tζn({im∗μ∗m}m<n,in∗ν,x)

(pushforward is obviously continuous in the weak* topology)

Now construct μ∗n by applying Proposition 1 to τn.

#Proof of Theorem

We have

WTζn({im∗μ∗m}m≤n,x)=Tζn({im∗μ∗m}m≤n,x)−Ey∼μ∗n[Tζn({im∗μ∗m}m≤n,y)]

By definition of μ∗

WTζn({im∗μ∗m}m≤n,x)=Tζn({im∗μ∗m}m≤n,x)−maxy∈XnTζn({im∗μ∗m}m≤n,y)

∀x∈Xn:WTζn({im∗μ∗m}m≤n,x)≤0

∀x∈Xn:ΣWTζn+1({im∗μ∗m}m≤n,x)≤0

Fix k∈N and assume infW(Tk,i∗μ∗)>−b for some b>0 (otherwise Tk is dominated). Define ξ:N×N→[0,1] by

ξ(j,c):={ζ(j,c) when (j,c)≠(k,b)0 when (j,c)=(k,b)

We get Tζn=Tξn+[[n≥k]]ζ(k,b)BbTk and therefore

∀x∈Xn:ΣWTξn+1({im∗μ∗m}m≤n,x)+[[n≥k]]ζ(k,b)ΣWBbTkn+1({im∗μ∗m}m≤n,x)≤0

By Proposition 2 and the definition of b, we can remove Bb in the second term.

∀x∈Xn:ΣWTξn+1({im∗μ∗m}m≤n,x)+[[n≥k]]ζ(k,b)ΣWTkn+1({im∗μ∗m}m≤n,x)≤0

∀x∈Xn:[[n≥k]]ζ(k,b)ΣWTkn+1({im∗μ∗m}m≤n,x)≤−ΣWTξn+1({im∗μ∗m}m≤n,x)

By Proposition 3, the right hand side is bounded from above by bξ, therefore

∀x∈Xn:[[n≥k]]ζ(k,b)ΣWTkn+1({im∗μ∗m}m≤n,x)≤bξ

supW(Tk,i∗μ∗)≤max(bξζ(k,b),maxn<kx∈XnΣWTkn+1({im∗μ∗m}m≤n,x))