Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

In this post, given a finite factored set , we will show how to associate each  with a characteristic polynomial, . We will discuss how to factor these characteristic polynomials, and use these characteristic polynomials to build up to the fundamental theorem of finite factored sets, which associates conditional orthogonality with conditional independence in probability distributions.

 

5.1. Characteristic Polynomials

Definition 28. Given a finite factored set , let  denote the ring of polynomials with coefficients in  and variables in 

Definition 29. Given a finite factored set , a , and an , we write  for the evaluation of  at , computed by replacing each  with .

Definition 30. Given a finite factored set , and a polynomial  denotes the set of all variables  that appear in  is called the support of .

Definition 31. Given a finite factored set  and an , let  be given by .   is called the characteristic polynomial of  (in ).

We will be building up to an understanding of how to factor  into irreducibles. For that, we will first need to give some basic notation for manipulating polynomials in .

Definition 32. Given a finite factored set , an , and a , let  be given by 

Definition 33. Given a finite factored set , an , and a , let  be given by 

Definition 34. Given a finite factored set , an , and a , let  be given by 

Proposition 26. Let  be a finite factored set, and let . Then .

Proof. We start by showing that for all 

Let  be arbitrary. By Proposition 3, if , there must be some  such that . Then, note that . If  were also in , then  would be in both  and , contradicting the fact that these two sets are disjoint. Therefore .

Thus  has exactly one element for each element of , so we have that 

Proposition 27. Let  be a finite factored set, and let  be subsets of . Let  be disjoint subsets of . Let , and let . Then .

Proof. For , let . We will start by showing that , given by , is a well-defined function and a bijection.

First, observe that it follows immediately from the definition that for all , if  we have that , and . Combining these, we get that .

For all , there exists some  such that , and some  such that , and this gives us that . Thus,  is well-defined.

To see that  is surjective, observe that for all , there exists an  such that , and there exist  and  such that , and we have .

To see that  is injective, observe that for , for all . Further,  and  are disjoint. Thus, for all  and .

This means that for all  and , if , then  and . However, every monomial in  or  is just equal to the product of all variables in its support. Thus  and . Thus  is injective, and thus a bijection between  and .

Now, we have that

Proposition 28. Let  be a finite factored set, and let  be a nonempty subset of . If  divides , then , for some  and .

Proof. Let  be a finite factored set, and let  be a nonempty subset of . Let  satisfy . We thus must have 

If there were some , then the degree of  in  would be at least 2, contradicting the definition of  and Corollary 1. Thus, .

There can be no combining like terms, then, in the product . The monomial terms in  are in bijective correspondence to the pairs of monomial terms in  and monomial terms in .

In particular, this means that since all the coefficients in  are equal to , all the coefficients in  must be equal to some , and all of the coefficients in  must be equal to .

Further, for all , if  is nonempty,  must be empty, since otherwise  would contain a term with two factors in , which clearly never happens according to the definition of 

Since  is nonempty, for each  there must be some . Thus at least one of  and  must be nonempty, so exactly one of  and  must be nonempty.

Let  be the set of all  such that  is nonempty. 

For every , every term of  has exactly one factor in . Thus, every term in  has exactly one factor in . These cover all variables in the support of , so each term in  must have total degree .

For each  divides a term in . Since  has no common support with  must also divide a term in . Thus  must be a term in . Conversely, every term in  divides a term in , and thus must be in . Thus every term in  is of the form  for some . Thus 

 

5.2. Factoring Characteristic Polynomials

We will now show how to factor characteristic polynomials into irreducibles. 

Definition 35. Given a finite factored set , and a nonempty subset , let  denote the set of all  such that:

  1.  is nonempty,
  2. , and
  3. there is no nonempty strict subset  such that .

Proposition 29. Let  be a finite factored set, and let  be a nonempty subset of . Then .

Proof. Let  be a finite factored set, and let  be a nonempty subset of . It suffices to show that the sets in  are pairwise disjoint and cover 

We start by showing that the set of all  satisfying  is closed under intersection. Indeed, if  and , then .

Next, observe that . Thus, for all , we can consider . Since  is an intersection of a finite nonempty collection of sets  satisfying , we have that . Further, , so  is nonempty.

Assume for the purpose of contradiction that there is some nonempty strict subset  such that . If , then we have a contradiction by the definition of . If , then note that , so , and  is a nonempty strict subset of  that contains , contradicting the definition of .

Thus  for all , and since , this means that the sets in  cover .

Next, we need to show that the sets in  are pairwise disjoint. Let  be arbitrary distinct elements. We have that , and  is a subset of  and , and thus a strict subset of at least one of them. Thus  is empty.

Thus 

The following two propositions constitute a factorization of  into irreducibles.

Proposition 30. Let  be a finite factored set, and let  be a nonempty subset of . Then .

Proof. Let  be a finite factored set, and let  be a nonempty subset of . Let , and let . For , let .

We will show by induction on  that  for all .

If , the result is trivial, as .

For , observe that  and  are disjoint, and that , thus by Proposition 27, we have . Thus, by induction, we get .

In the case where , this gives that 

Proposition 31. Let  be a finite factored set, and let  be a nonempty subset of . Then  is irreducible for all .

Proof. Let  be a finite factored set, let  be a nonempty subset of , and let .

Assume for the purpose of contradiction that , and that both  and  have nonempty support. By Proposition 28, we have that , for some , and 

We will first need to show that  and  are nonempty and disjoint. They must be nonempty, because  and  have nonempty support. Assume for the purpose of contradiction that . Let  be an element of , and note that for , we have . Thus  must be degree at least 2 in , which contradicts the fact that every variable clearly has degree at most 1 in .

Next, we need to show that . We already know that

Let  be an element of . Given an arbitrary , we have that  if and only if  if and only if  for some  if and only if 

We now have that  and  are disjoint and that . Thus, by Proposition 27, we have that . Thus , so .

Let  be arbitrary, and let . Note that , so there is some  such that . Thus  for all . However, we also have that  for all , so . Since , so . Since  and  were arbitrary elements of , we have that . Since  is a nonempty strict subset of , this contradicts the fact that .

Thus,  is irreducible for all 

 

5.3. Characteristic Polynomials and Orthogonality

We can now give an alternate characterization of conditional orthogonality in terms of divisibility of characteristic polynomials.

Lemma 3. Let  be a finite factored set, and let  be partitions of . The following are equivalent.

  1.  .
  2.  divides  for all , and .
  3.   for all , and .

Proof. Clearly condition 3 implies condition 2. We will first show that condition 1 implies condition 3, and then show that condition 2 implies condition 1.

Let , and let  satisfy . Consider an arbitrary , and . We want to show that 

Let . Clearly . We thus have that , so . We also have that , so . These two together give that .

Since , we have that . Thus, by Proposition 27, we have that . Similarly, since , we have that .

Since , and , we have . We also have that

Thus .

By Proposition 27, this gives that .

Finally, since , we have that .

Thus,  and  are both equal to .

Thus, condition 1 implies condition 3. It remains to show that condition 2 implies condition 1.

Fix , and , and let  divide  for all , and . Assume for the purpose of contradiction that it is not the case that . Thus, there exists some  such that . Let  and  satisfy .

Let  be such that  and , and let . Thus,  is an irreducible factor of .

Either  divides  for all  or  divides  for all , since otherwise there would exist an  and a  such that  divides neither  nor , but does divide their product, contradicting the fact that  is irreducible, and thus prime.

Assume without loss of generality that  divides  for all . Fix an . Let us first restrict attention to the case where  is nonempty. 

Let . By Proposition 28,  and  for some  and . We will show that , and .

Let  be an element of . Then for all  if and only if  if and only if  if and only if . Thus 

For all , we have  and , so , so . Similarly, for all , so , so . Thus .

Since  and  both have all coefficients equal to , we have . Thus, 

Similarly, since all the coefficients of  are  and all the coefficients of  are , all the coefficients of  are , so . Thus, .

We thus have that .

In the case where  is empty, we also have , since both sides are .

By Proposition 27, . Thus, , so .

Since  for all , we have that . However, this contradicts the fact that , and .

Thus, condition 2 implies condition 1. 

 

5.4. Probability Distributions on Finite Factored Sets

The primary purpose of all this discussion of characteristic polynomials has been to build up to thinking about the relationship between orthogonality and probabilistic independence. We will now discuss probability distributions on finite factored sets.

Recall the definition of a probability distribution.

Definition 36. Given a finite set , a probability distribution on  is a function  such that 

  1.   for all ,
  2.  ,
  3.  , and
  4.  whenever  satisfy .

A probability distribution on a finite factored set  is a probability distribution on its underlying set that also satisfies another condition, which represents the probability distribution coming from a product of distributions on the underlying factors.

Definition 37. Given a finite factored set , a probability distribution on  is a probability distribution  on  such that for all , we have .

Proposition 32. Given a finite factored set , a probability distribution on  is a probability distribution  on  if and only if  for all .

Proof. If  for all , in particular this means that  for all .

Conversely, if  for all , then for all 

 

5.5. The Fundamental Theorem of Finite Factored Sets

We are now ready to state and prove the fundatmental theorem of finite factored sets.

Theorem 3. Let  be a finite factored set, and let  be partitions of . Then  if and only if for all probability distributions  on  and all , and , we have .

Proof. We already have by Lemma 3 that if , then for all , and . Thus for any probability distribution  on , we have

Conversely, assume that for all probability distributions  on , and all , and , we have .

If  is empty, then  is the unique partition of , and we have . Thus, we can restrict our attention to the case where  is nonempty. 

Fix an arbitrary , and . Let . We will first show that  for all .

Given an arbitrary , we can define  by , and we will show that  is a distribution on .

 is well-defined because  is a nonempty sum of products of positive real numbers, and thus positive. Further, since  is a sum of products of positive real numbers,  for all . Since , we also have . Clearly . Finally, for all  with , we have

Therefore  is a distribution on . We still need to show that  is a distribution on .

Observe that for all  and , since , we have that , and since , we have that . Thus, we have that

Thus, for all 

Thus  is a distribution on .

It follows that P. We therefore have that

Thus,  is a polynomial that is zero on an open subset of inputs, so  is the zero polynomial. Thus , so . Since , and  were arbitrary, by Lemma 3, we have 

 

In the next two posts, we will introduce temporal inference using finite factored sets, and discuss future potential research directions.

New Comment
2 comments, sorted by Click to highlight new comments since: Today at 9:50 PM

In the proof of Lemma 3, it should be 

"Finally, since , we have that .

Thus,  and  are both equal to .

instead.

Fixed, Thanks.