Interpreting a dimensionality reduction of a collection of matrices as two positive semidefinite block diagonal matrices
Here are some empirical observations, made between August 14, 2023 and August 19, 2023, that characterize the interpretability of my own matrix dimensionality reduction algorithm. These phenomena do not occur on all inputs (sometimes they occur only partially), and it would be nice if there were a more complete mathematical theory, with proofs, that explains them. Given a (possibly directed and possibly weighted) graph with $n$ nodes represented as a collection of $n\times n$ matrices $A_1,\dots,A_r$, we will observe that a dimensionality reduction $(X_1,\dots,X_r)$ of $(A_1,\dots,A_r)$, where each $X_i$ is a $d\times d$ matrix (I call this dimensionality reduction an LSRDR), is in many cases the optimal solution to a combinatorial problem for the graph. In this case, we have a complete interpretation of what the dimensionality reduction algorithm is doing.

For this post, let $K$ denote either the field of real numbers or the field of complex numbers (everything also works well when $K$ is the division ring of quaternions).

Notation: $\rho(A)=\lim_{n\to\infty}\|A^n\|^{1/n}=\max\{|\lambda| : \lambda \text{ is an eigenvalue of } A\}$ is the spectral radius of the matrix $A$. $A^T$ denotes the transpose of $A$, while $A^*$ denotes the adjoint of $A$. We say that a tuple of matrices $(A_1,\dots,A_r)$ is jointly similar to $(B_1,\dots,B_r)$ if there is an invertible matrix $C$ with $B_j=CA_jC^{-1}$ for $1\le j\le r$. $A\otimes B$ denotes the tensor product of $A$ with $B$.

Let $1\le d<n$. Suppose that $A_1,\dots,A_r$ are $n\times n$ matrices with entries in $K$. Then we say that a collection $(X_1,\dots,X_r)$ of $d\times d$ matrices with entries in $K$ is an $L_{2,d}$-spectral radius dimensionality reduction (abbreviated LSRDR) of $A_1,\dots,A_r$ if the following quantity is locally maximized:

$$\frac{\rho(A_1\otimes (X_1^*)^T+\dots+A_r\otimes (X_r^*)^T)}{\rho(X_1\otimes (X_1^*)^T+\dots+X_r\otimes (X_r^*)^T)^{1/2}}.$$

LSRDRs may be computed using a variation of gradient ascent. If $(X_1,\dots,X_r)$ is an LSRDR of $(A_1,\dots,A_r)$, then one will typically be able to find matrices $R,S,P$ and some $\lambda\in K$ where $X_j=RA_jS$ for $1\le j\le r$ and where $SR=\lambda P$ and $P^2=P$. We shall call $P$ an $L_{2,d}$-SRDR projection operator of $(A_1,\dots,A_r)$. In this case, $(X_1\oplus 0_{n-d},\dots,X_r\oplus 0_{n-d})$ will typically be jointly similar to $(\lambda A_1P,\dots,\lambda A_rP)$.
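As a minimal sketch of how one might compute an LSRDR in practice, assuming $K$ is the field of real numbers: the code below evaluates the fitness ratio above exactly as written and locally maximizes it. I substitute a generic quasi-Newton optimizer (scipy's BFGS with numerical gradients) for the variation of gradient ascent mentioned above, since that variation is not specified here; the function names `lsrdr_fitness` and `lsrdr` are my own.

```python
# Sketch of an LSRDR computation for real matrices. Not the author's
# implementation; a generic local optimizer stands in for gradient ascent.
import numpy as np
from scipy.optimize import minimize

def lsrdr_fitness(Xs, As):
    """rho(sum_j A_j (x) conj(X_j)) / rho(sum_j X_j (x) conj(X_j))^(1/2)."""
    rho = lambda M: np.max(np.abs(np.linalg.eigvals(M)))
    num = sum(np.kron(A, np.conj(X)) for A, X in zip(As, Xs))
    den = sum(np.kron(X, np.conj(X)) for X in Xs)
    return rho(num) / np.sqrt(rho(den))

def lsrdr(As, d, n_restarts=5, seed=0):
    """Locally maximize the fitness over tuples of d x d real matrices."""
    rng = np.random.default_rng(seed)
    r = len(As)
    unpack = lambda v: [v[i*d*d:(i+1)*d*d].reshape(d, d) for i in range(r)]
    # Maximize the log of the fitness; the fitness is invariant under
    # rescaling all X_j by a common positive constant.
    loss = lambda v: -np.log(lsrdr_fitness(unpack(v), As))
    best = None
    for _ in range(n_restarts):
        res = minimize(loss, rng.standard_normal(r * d * d), method="BFGS")
        if best is None or res.fun < best.fun:
            best = res
    return unpack(best.x), np.exp(-best.fun)
```

For instance, if `As` is a list of adjacency matrices of a small graph, `lsrdr(As, d)` returns a candidate tuple $(X_1,\dots,X_r)$ together with the locally maximal fitness value; multiple restarts are used because the fitness is only locally maximized.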
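To make the structure $X_j=RA_jS$, $SR=\lambda P$, $P^2=P$ concrete, here is a small numerical illustration. The construction $S=\lambda\cdot R^{+}$ (a scaled pseudoinverse) is my own way of manufacturing such a pair; the post does not say how $R$ and $S$ arise from the optimization. The check confirms that the traces of corresponding words in $(X_1,\dots,X_r)$ and $(\lambda A_1P,\dots,\lambda A_rP)$ agree, which is a necessary condition for the joint similarity stated above.

```python
# Hedged numerical check of the R, S, P, lambda structure; the matrices
# here are random stand-ins, not the output of an actual LSRDR run.
import numpy as np

rng = np.random.default_rng(1)
n, d, r, lam = 6, 2, 3, 1.7
As = [rng.standard_normal((n, n)) for _ in range(r)]

R = rng.standard_normal((d, n))      # d x n, full rank almost surely
S = lam * np.linalg.pinv(R)          # then R S = lam * I_d
P = (S @ R) / lam                    # so S R = lam * P with P^2 = P
assert np.allclose(P @ P, P)

Xs = [R @ A @ S for A in As]         # X_j = R A_j S

# Joint similarity forces tr(X_{j1}...X_{jk}) = tr(lam A_{j1}P ... lam A_{jk}P);
# this holds here because lam * A_j P = A_j S R and the trace is cyclic.
for _ in range(5):
    word = rng.integers(0, r, size=4)
    lhs = np.trace(np.linalg.multi_dot([Xs[j] for j in word]))
    rhs = np.trace(np.linalg.multi_dot([lam * As[j] @ P for j in word]))
    assert np.isclose(lhs, rhs)
print("word traces agree")
```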
Yes. When we take a convex combination of finitely many point mass measures, the integral is just a sum. I use sums of finitely many elements for ease of calculation, but to prove theorems in full generality, I should use measures.
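Concretely (my own phrasing of the same point): if $\mu=\sum_{i=1}^{n}\lambda_i\delta_{x_i}$ with $\lambda_i\ge 0$ and $\sum_{i=1}^{n}\lambda_i=1$, then for any function $f$,

$$\int f\,d\mu=\sum_{i=1}^{n}\lambda_i f(x_i),$$

so the finite sums are exactly the special case of integration against finitely supported probability measures.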
The idea of finding an object $A$ along with distinct local optima $G_A(x_1),\dots,G_A(x_n)$ with $n$ maximized looks like an interesting problem to work on. I have not worked on this kind of objective before, but I can certainly try it, as I have a few ideas for how to do so. This might work better for discrete optimization problems, though, since I cannot think of a good way to use gradient updates to produce new local optima. In…