Ω 4

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

The previous proofs are here.

Proposition 5: If , then the condition "there is a  where, " is equivalent to "there is a compact  s.t. "

Proof sketch: One direction is immediate from the Compactness Lemma. For showing that just a bound on the  values suffices to be contained in a compact set, instead of a bound on the  and  values to invoke the Compactness Lemma, we use a proof by contradiction where we can get a bound on the  values of the minimal points from just a bound on the  values.

Proof: In one direction, assume there's a compact  s.t. , and yet there's no upper-bounding  on the  values. This is impossible by the Compactness Lemma, since .

In the other direction, assume there's a  bound on  for the minimal points. Fix some arbitrary  for the rest of the proof. Now, we will show that all minimal points  have , and  letting us invoke the Compactness Lemma to get that everything is in a suitable compact set . The first bound is obvious. Since  came from a minimal point, it must have  as an upper bound.

For the other one, by contradiction, let's assume that there's a minimal point  where . Then, we can write  as:

The first component,  is our fixed minimal point of interest. The second component is an sa-measure, because , due to the  upper bound on the  value of minimal points. The third component is also a nonzero sa-measure, because  is nonnegative (it came from a minimal point), and by assumption, . Hang on, we wrote a minimal point  as another minimal point , plus two sa-measures (one of which is nonzero), so  can't be minimal, and we have a contradiction.

Therefore, all  have . Now that we have bounds on  and  for minimal points, we can invoke the Compactness Lemma to conclude that everything is in a compact set.

Proposition 6:  only occurs when there's only one minimal point of the form .

Proof: Unpacking the expectations, and in light of Proposition 3,

and

So, take a minimal a-measure  that minimizes . One must exist because we have  and  bounds, so by the Compactness Lemma, we can restrict our attention to an actual compact set, and continuous functions from a compact set to  have a minimum, so there's an actual minimizing minimal point.

must be 0, because otherwise  which contradicts . Further, since , said  must be the lowest  possible amongst minimal points.

So, we have a minimal point of the form  where  is the lowest possible  amongst the minimal points. Any other distinct minimal point must be of the form , where . This other minimal point can be written as , where the latter component is an sa-measure, so it's not minimal. Thus, there's only one minimal a-measure and it's of the form .

Proposition 7: Renormalizing a bounded inframeasure produces a bounded infradistribution, if renormalization doesn't fail.

Proof sketch: Our first order of business is showing that our renormalization process doesn't map anything outside the cone of sa-measures. A variant of this argument establishes that the preimage of a minimal point in  must be a minimal point in , which quickly establishes positive-minimals and bounded-minimals for . Then, we verify the other conditions of a bounded infradistribution. Nonemptiness, closure, and convexity are very easy, upper-closure is shown by adding appropriately-scaled sa-measures such that, after renormalization, they hit whatever sa-measure you want. Then, finally, we just have to verify that our renormalization procedure is the right one to use, that it makes  and .

Proof: First up, we need to show that after renormalization, nothing gets mapped outside the cone of sa-measures. Observe that the renormalization process is injective. If two points are distinct, after a scale-and-shift, they'll still be distinct.

Let B be our original set and  be our renormalized set. Take a point in , given by . Undoing the renormalization, we get .

By decomposition into a minimal point and something else via Theorem 2, we get that

where . Renormalizing back, we get that

, obviously, because  is the minimal  value amongst the minimal points. So, the first component is an a-measure, the second component is an sa-measure, so adding them is an sa-measure, and then we scale by a nonnegative constant, so  is an sa-measure as well.

This general line of argument also establishes positive-minimals and bounded-minimals, as we'll now show. If the  isn't 0, then we just wrote  as

And the first component lies in , but the latter component is nonzero, witnessing that  isn't minimal. So, if  is minimal in , then , so it must be the image of a single minimal point  by injectivity. Ie, the preimage of a minimal point in  is a minimal point in .

Scale-and-shift maps a-measures to a-measures, showing positive-minimals, and the positive scale constant of  just scales up the  upper bound on the  values of the minimal points in , showing bounded-minimals.

For the remaining conditions, nonemptiness, closure, and convexity are trivial. We're taking a nonempty closed convex set and doing a scale-and-shift so it's nonempty closed convex.

Time for upper-completeness. Letting  be our original set and  be our renormalized set, take a point  in . By injectivity,  has a single preimage point . Undoing the renormalization by multiplying by  (our addition of  is paired with  to undo the renormalization on that one), consider  This lies in  by upper-completeness, and renormalizing it back produces , which is in , so  is upper-complete.

That just leaves showing that after renormalizing, we're normalized.

For the other part,

And we're done.

Lemma 6:  is a continuous linear operator.

Proof sketch: First show linearity, then continuity, for the operator that just maps a signed measure through , using some equation-crunching and characterizations of continuity. Then, since  is just the pair of that and the identity function, it's trivial to show that it's linear and continuous.

We'll use  to refer to the function  defined by , where  is a measurable subset of  and . Ie, this specifies what the measure  is in terms of telling you what value it assigns to all measurable subsets of .

We'll use  to refer to the function  given by .

Our first order of business is establishing the linearity of . Observe that, for all measurable , and  being real numbers, and  being signed measures over ,

So,  and we have linearity of .

Now for continuity of . Let  limit to . The sequence  converging to  in our metric on  is equivalent to:

So, if  fails to converge to , then there is some continuous function  that witnesses the failure of convergence. But, because  is a continuous function , then , and also , so:

The key step in the middle is that  limits to , so  limits to , by our characterization of continuity. Thus, we get a contradiction, our  that witnesses the failure of convergence actually does converge. Therefore,  limits to  if  limits to , so  is continuous.

To finish up, continuity for  comes from the product of two continuous functions being continuous ( which we showed already, and  because duh), and linearity comes from:

Proposition 8: If  and  is a continuous function , then

Proposition 9:  is a (bounded) inframeasure if  is, and it doesn't require upper completion if  is surjective.

Proof sketch: Nonemptiness is obvious, and showing that it maps sa-measures to sa-measures is also pretty easy. Closure takes a rather long argument that the image of any closed subset of sa-measures over , through , is closed, which is fairly tedious. We may or may not invoke upper completion afterwards, but if we do, we can just appeal to the lemma that the upper completion of a closed set is closed. Convexity is immediate from linearity of .

For upper completion, we can just go "we took the upper completion" if  isn't surjective, but we also need to show that we don't need to take the upper completion if  is surjective, which requires crafting a measurable inverse function to g via the Kuratowski-Ryll-Nardzewski selection theorem, in order to craft suitable preimage points.

Then we can use LF-Duality to characterize the induced  function, along with Proposition 8, which lets us get positive-minimals, bounded-minimals, and normalization fairly easily, wrapping up the proof.

Proof: Nonemptiness is obvious. For showing that it takes sa-measures to sa-measures, take an , and map it through to get  is an sa-measure, so . Now, we can use Lemma 5 to get:

So the  term is indeed big enough that the image of  is an sa-measure.

For closure, fix a sequence of  limiting to some , with preimage points . Due to convergence of  there must be some  bound on the  preserves those values, so  is an upper bound on the . Since the  are sa-measures,  is a lower bound on the  values. Since  converges to  converges to , so there's a  upper bound on the  values. Further,

So, for all n, , so we have an upper bound on the  and  values. Now we can invoke the Compactness Lemma to conclude that there's a convergent subsequence of the , with a limit point , which must be in  since  is closed. By continuity of  from Lemma 6,  must equal , witnessing that . So,  is closed. Now, if we take upper completion afterwards, we can just invoke Lemma 2 to conclude that the upper completion of a closed set of sa-measures is closed.

Also,  is linear from Lemma 6, so it maps convex sets to convex sets getting convexity.

Now for upper completion. Upper completion is immediate if  isn't surjective, because we had to take the upper completion there. Showing we don't need upper completion if  is surjective is trickier. We must show that  is a surjection from  to .

First, we'll show that  where  is an open subset of  is a measurable subset of . In metrizable spaces (of which  is one), every open set is a  set, ie, it can be written as a countable union of closed sets. Because our space is compact, all those closed sets are compact. And the continuous image of a compact set is a compact set, ie closed. Therefore,  is a countable union of closed sets, ie, measurable.

is a Polish space (all compact metric spaces are Polish), it has the Borel -algebra, and we'll use the function . Note that  is closed and nonempty for all  due to  being a continuous surjection. Further, the set  equals  for all open sets . In one direction, if the point  is in the first set, then there's some point  where . In the other direction, if a point  is in , then there's some point  where  so  is nonempty.

Thus,  is weakly measurable, because for all open sets  of  and  is measurable. Now, by the Kuratowski-Ryll-Nardzewski Measurable Selection Theorem, we get a measurable function  from  to  where  so , and  is an injection.

So, we can push any sa-measure of interest  through  (which preserves the amount of negative measure due to being an injection), to get an sa-measure that, when pushed through  recovers  exactly. Thus, if , and you want to show , just consider

So, since  due to upper-completeness, then  And we have shown upper-completeness of  if  is a surjection.

We should specify something about using LF-Duality here. If you look back through the proof of Theorem 5 carefully, the only conditions you really need for isomorphism are (on the set side)  being closed, convex, and upper complete (in order to use Proposition 2 to rewrite  appropriately for the subsequent arguments, we have these properties), and (on the functional side),  being concave (free),  if  (by proof of Theorem 4, comes from upper completeness), and continuous over  (showable by Proposition 8 that , and the latter being continuous since  is an infradistribution)

It's a bit of a pain to run through this argument over and over again, so we just need to remember that if you can show closure, convexity, upper completeness, and the expectations to be continuous, that's enough to invoke LF-Duality and clean up the minimal point conditions. We did that, so we can invoke LF-Duality now.

Time for normalization. From Proposition 8, the  function we get from  is uniquely characterized as: . So,

and normalization is taken care of.

For bounded-minimals/weak-bounded-minimals, since  is the LF-dual of , we can appeal to Theorem 5 and just check whether  is Lipschitz/uniformly continuous. if , then  according to the sup metric on  and , respectively, which (depending on whether we're dealing with Lipschitzness or uniform continuity), implies that , or  for uniform continuity. So, we get:  (or  for uniform continuity), thus establishing that  and  being sufficiently close means that  doesn't change much, which, by Theorem 5, implies bounded-minimals/weak-bounded-minimals in .

For positive-minimals it's another Theorem 5 argument. If , then , so:  And we have monotonicity for , which, by Theorem 5, translates into positive-minimals on .

Lemma 7: If , then for all decompositions of  into

This is easy. Decompose  into . To derive a contradiction, assume there exists a nonminimal  that decomposes into  where . Then,

Thus, we have decomposed our minimal point into another point which is also present in , and a nonzero sa-measure because there's a nonzero  so our original "minimal point" is nonminimal. Therefore, all decompositions of a minimal point in the mixture set must have every component part being minimal as well.

Proposition 10:

Done.

Proposition 11: A mixture of infradistributions is an infradistribution. If it's a mixture of bounded infradistributions with Lipschitz constants on their associated  functions of , and , then the mixture is a bounded infradistribution.

Proof sketch: Nonemptiness, convexity, upper completion, and normalization are pretty easy to show. Closure is a nightmare.

The proof sketch of Closure is: Take a sequence  limiting to . Since each approximating point is a mixture of points from the , we can shatter each of these  into countably many  . This defines a sequence in each  (not necessarily convergent). Then, we take some bounds on the  and manage to translate them into (rather weak) i-dependent bounds on the  sequence. This lets us invoke the Compactness Lemma and view everything as wandering around in a compact set, regardless of . Then, we take the product of these compact sets to view everything as a single sequence in the product of compact sets, which is compact by Tychonoff's theorem. This is only a countable product of compact metric spaces, so we don't need full axiom of choice. Anyways, we isolate a convergent subsequence in there, which makes a convergent subsequence in each of the . And then, we can ask "what happens when we mix the limit points in the  according to ?" Well, what we can do is just take a partial sum of the mixture of limit points, like the i from 0 to 1 zillion. We can establish that  gets arbitrarily close to the upper completion of a partial sum of the mixture of limit points, so  lies above all the partial sums of our limit points. We show that the partial sums don't have multiple limits, then, we just do one more invocation of Lemma 3 to conclude that the mixture of limit points lies below . Finally, we appeal to upper completion to conclude that  is in our mixed set of interest. Whew!

Once those first 4 are out of the way, we can then invoke Theorem 5 to translate to the  view, and mop up the remaining minimal-point conditions.

First, nonemptiness. By Theorem 5, we can go "hm, the  are monotone on , and  everywhere else, and , so the affine functional  lies above the graph of ". This translates to the point  being present in all the . Then, we can just go: , so we have a point in our  set.

For normalization, appeal to Proposition 10 and normalization for all the  and .

Convexity is another easy one. Take a . They shatter into . Then, we can just go:

and then, by convexity of the , so we wrote  as a mixture of points in .

Upper completion is another easy one, because, if , then you can go

And  by upper completion.

That leaves the nightmare of closure. Fix a sequence  limiting to . You can think of the  as . We can shatter the  into , where  can be thought of as

Now, since  converge to something, there must be an upper bound on the  and  terms of the sequence, call those  and . Now, for all n and all  so, for all n and i, .

Also, for all n and  and reshuffling, we get  which then makes  . Further, due to  being  a sa-measure, , so for all n and i, .

Ok, so taking stock of what we've shown so far, it's that for all i, the sequence  is roaming about within  And, by the Compactness Lemma, this set is compact, since it's got bounds (weak bounds, but bounds nonetheless). Defining

where , we can view everything as one single sequence  wandering around in the product of compact sets. By Tychonoff's theorem (we've only got a countable product of compact metric spaces, so we don't need full axiom of choice, dependent choice suffices), we can fix a convergent subsequence of this, and the projections of this subsequence to every  converge.

Ok, so we've got a subsequence of n where, regardless of i,  converge to some  (by closure of ). How does that help us? We don't even know if mixing these limit points converges to something or runs off to infinity. Well... fix any j you like, we'll just look at the partial sum of the first j components. Also fix any  you please. On our subsequence of interest, the  converge to , and in all i, the  converge to . So, let n be large enough (and in our subsequence) that , and , we can always find such an n.

Now,  is a well-defined point (because it's a finite sum of points plus a convergent sequence as witnessed by the well-definedness of  which breaks down as ) It also lies in the upper completion of the single point . We'll show that this point is close to . Since we're working in a space with a norm,

This will come in handy in the later equations.

So,  is less than  away from the upper completion of the point , which is a closed set (Minkowski sum of a closed and compact set is closed).  can be shrank to 0 with increasing n, so  has distance 0 from the upper completion of said partial sum, and thus lies above the partial sum!

Abbreviating  as , we get that all the  lie in , and are all sa-measures. Thus, if the sequence  converges to a unique point, then said limit point is , and all the , so  would lie in . Further, by Lemma 3, , since that set is compact, so  lies above , and would lie in  by upper-completeness.

So, all that's left to wrap up our closure argument is showing that the sequence  has a single limit point. Since it's wandering around in  which is compact by Lemma 3, there are convergent subsequences. All we have to show now is that all convergent subsequences must have the same limit point.

Assume this is false, and there's two distinct limit points of the sequence , call them  and . Because it's impossible for two points to both be above another (in the minimal-point/adding-points sense), without both points being identical, either , or vice-versa. Without loss of generality, assume . Since the latter is a closed set,  must be  away for some . Fix some j from the subsequence that  is a limit point of, where . There must be some strictly greater  from the subsequence that  is a limit point of.

Further, the  are nonzero. Also, no  can be the 0 point, because , and if , then , which is impossible by normalization. So,  lies strictly below . Also,  lies below , because for all the ,