In this post, I shall first describe a new word embedding algorithm that I came up with called a matrix product optimized (MPO) word embedding, and I will prove a theorem that completely interprets this word embedding in the simplest case. While it is probably infeasible to completely characterize a word embedding with a mathematical proof when the corpus is one that arises in practice, this theorem should be a signal that such a word embedding (or a similar word embedding) is interpretable and mathematical in other ways as well. This theorem also illustrates how MPO word embeddings should behave.

Unlike most word embedding algorithms, MPO word embeddings are matrix-valued: they map tokens to matrices instead of simply mapping tokens to vectors. In our case, the matrices are not necessarily real; they may be complex or even quaternionic matrices. MPO word embeddings also differ from other word embedding algorithms in that they are not constructed using neural networks, though we still use gradient ascent.

Why MPO word embeddings?

Since tokens often have many meanings depending on context, it seems better to represent a token in a form that makes it easier to separate the token's individual meanings. While vectors may be good for representing individual meanings of tokens, it is better to represent a polysemantic token as a matrix instead of a vector. If someone gave me the task of interpreting a word embedding, I would be much happier if the word embedding were matrix-valued and neatly organized each of the meanings of a polysemantic token into a matrix than if it were vector-valued with the individual meanings of the token awkwardly smushed together in a single vector.

Spaces of matrices have additional structure that is lacking in vector spaces, and one can use this additional structure to analyze or interpret our word embedding. This additional structure also means that matrix-valued word embeddings should behave more mathematically than vector-valued word embeddings.

MPO word embeddings also satisfy some interesting properties. Let $X_{D}$ denote the fitness level of a random MPO word embedding that we obtain from the data $D$, where $D$ consists of the corpus that we are training our word embedding on along with a couple of hyperparameters. Then the random variable $X_{D}$ will often have very low or even zero entropy. Since $X_{D}$ often has zero or low entropy, trained MPO word embeddings will not contain much (or any) random information that was not already present in the training data. It is easier to interpret a machine learning model when the trained model depends only on the training data and hyperparameters and not on random choices such as the initialization. Quaternionic and complex MPO word embeddings will often become real word embeddings after a change of basis. This means that MPO word embeddings behave quite mathematically, and machine learning models that behave in ways that mathematicians like should be more interpretable and understandable than other machine learning models.

Quaternionic matrices:

In this section, we shall go over the basics of quaternionic matrices. I assume that the reader is already familiar with ideas in linear algebra up through the singular value decomposition. Much of the basic theory of quaternionic matrices is a straightforward generalization of the theory of complex matrices. I hope you are also familiar with the quaternions, but if you are not, I will recall the definition and basic facts about them below. We refer the reader to [1] for facts about quaternions. You may skip this section if you only care about the real and complex cases.

If $A$ is a square real, complex, or quaternionic matrix, then the spectral radius of $A$ is

$$\rho(A)=\lim_{m\rightarrow\infty}\|A^{m}\|^{1/m},$$

where $\|\cdot\|$ is any matrix norm (since we are taking a limit, it does not matter which matrix norm we choose). If $A$ is a real or complex matrix, then

$$\rho(A)=\max\{|\lambda|:\lambda\text{ is an eigenvalue of }A\}.$$

The same fact holds for quaternionic matrices, but we have to be careful about how we define the eigenvalues of quaternionic matrices due to non-commutativity.

The division ring $\mathbb{H}$ of quaternions is the 4-dimensional associative but non-commutative algebra over the field of real numbers spanned by the elements $1,i,j,k$ (so $\mathbb{H}=\{a+bi+cj+dk:a,b,c,d\in\mathbb{R}\}$) where quaternionic multiplication is the unique associative (but non-commutative) bilinear operation such that $i^{2}=j^{2}=k^{2}=ijk=-1$. We observe that if $x,y\in\{i,j,k\}$ are distinct, then $xy=-yx$. It is easy to show that $ij=k$, $jk=i$, and $ki=j$.

Recall that the conjugate and the absolute value of a quaternion are defined by

$$\overline{a+bi+cj+dk}=a-bi-cj-dk\quad\text{and}\quad|a+bi+cj+dk|=\sqrt{a^{2}+b^{2}+c^{2}+d^{2}}$$

whenever $a,b,c,d\in\mathbb{R}$. If $x,y\in\mathbb{H}$, then $\overline{xy}=\overline{y}\cdot\overline{x}$ and $|xy|=|x|\cdot|y|$. Observe that the field of complex numbers $\mathbb{C}=\{a+bi:a,b\in\mathbb{R}\}$ is a subring of $\mathbb{H}$.
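To make the quaternion arithmetic above concrete, here is a minimal sketch (my own helper functions, not code from the post) that represents a quaternion $a+bi+cj+dk$ as a length-4 array and numerically checks the identities $\overline{xy}=\overline{y}\cdot\overline{x}$ and $|xy|=|x|\cdot|y|$.

```python
import numpy as np

def qmul(x, y):
    """Hamilton product of quaternions given as arrays [a, b, c, d] for a + bi + cj + dk."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return np.array([
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + b1*a2 + c1*d2 - d1*c2,
        a1*c2 - b1*d2 + c1*a2 + d1*b2,
        a1*d2 + b1*c2 - c1*b2 + d1*a2,
    ])

def qconj(x):
    """Conjugate: a + bi + cj + dk -> a - bi - cj - dk."""
    return np.array([x[0], -x[1], -x[2], -x[3]])

def qabs(x):
    """Absolute value sqrt(a^2 + b^2 + c^2 + d^2)."""
    return float(np.linalg.norm(x))

x, y = np.random.randn(4), np.random.randn(4)
assert np.allclose(qconj(qmul(x, y)), qmul(qconj(y), qconj(x)))  # conjugate of xy equals conj(y) conj(x)
assert np.isclose(qabs(qmul(x, y)), qabs(x) * qabs(y))           # |xy| = |x| |y|
```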

If $A=(a_{r,s})_{r,s}$ is a quaternionic matrix, then define the adjoint of $A$ to be $A^{*}=(\overline{a_{s,r}})_{r,s}$. While the adjoint of a quaternionic matrix is well-behaved, the transpose of a quaternionic matrix is not very well behaved since we typically have $(AB)^{*}=B^{*}A^{*}$ but $(AB)^{T}\neq B^{T}A^{T}$ for quaternionic matrices $A,B$.

We shall now associate $n\times n$ quaternionic matrices with $2n\times 2n$ complex matrices.

Suppose that $A,B$ are $n\times n$ complex matrices. Then the associated complex matrix $\Gamma(A+Bj)$ of the quaternionic matrix $A+Bj$ is the complex matrix

$$\Gamma(A+Bj)=\begin{pmatrix}A & B\\ -\overline{B} & \overline{A}\end{pmatrix}.$$

We observe that $\Gamma(X+Y)=\Gamma(X)+\Gamma(Y)$, $\Gamma(XY)=\Gamma(X)\Gamma(Y)$, and $\Gamma(X^{*})=\Gamma(X)^{*}$ whenever $X,Y$ are quaternionic matrices and the operations are defined.
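The following sketch (helper names are mine) stores a quaternionic matrix $A=A_{1}+A_{2}j$ as the pair of complex matrices $(A_{1},A_{2})$, builds the associated complex matrix $\Gamma(A)$ described above, and numerically checks that $\Gamma(XY)=\Gamma(X)\Gamma(Y)$ and $\Gamma(X^{*})=\Gamma(X)^{*}$.

```python
import numpy as np

def gamma(A1, A2):
    """Associated complex matrix of the quaternionic matrix A = A1 + A2*j."""
    return np.block([[A1, A2], [-A2.conj(), A1.conj()]])

def qmatmul(A, B):
    """Product of quaternionic matrices given as pairs (A1, A2) and (B1, B2)."""
    A1, A2 = A
    B1, B2 = B
    return (A1 @ B1 - A2 @ B2.conj(), A1 @ B2 + A2 @ B1.conj())

def qadjoint(A):
    """Adjoint of A = A1 + A2*j, namely A1^* + (-A2^T)*j."""
    A1, A2 = A
    return (A1.conj().T, -A2.T)

def rand_qmat(n):
    return (np.random.randn(n, n) + 1j * np.random.randn(n, n),
            np.random.randn(n, n) + 1j * np.random.randn(n, n))

X, Y = rand_qmat(3), rand_qmat(3)
assert np.allclose(gamma(*qmatmul(X, Y)), gamma(*X) @ gamma(*Y))  # Gamma(XY) = Gamma(X) Gamma(Y)
assert np.allclose(gamma(*qadjoint(X)), gamma(*X).conj().T)       # Gamma(X^*) = Gamma(X)^*
```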

An eigenvalue of a quaternionic matrix $A$ is a quaternion $\lambda$ such that $Av=v\lambda$ for some non-zero quaternionic column vector $v$ (since $\mathbb{H}$ is non-commutative, we put the scalar on the right).

Observation: If $\lambda$ is an eigenvalue of a quaternionic matrix $A$ corresponding to the eigenvector $v$ and $\mu$ is an eigenvalue of a quaternionic matrix $B$ corresponding to the same eigenvector $v$, then $ABv=A(v\mu)=(Av)\mu=v\lambda\mu$, so $v$ is an eigenvector of the quaternionic matrix $AB$ with eigenvalue $\lambda\mu$.

Observation: If $\lambda$ is an eigenvalue of a product of quaternionic matrices $AB$ with eigenvector $v$, then $BA(Bv)=B(ABv)=(Bv)\lambda$, so if $\lambda\neq 0$ (so that $Bv\neq 0$), then $\lambda$ is also an eigenvalue of $BA$. In particular, $AB$ and $BA$ have the same non-zero eigenvalues.

Observation: If $A$ is a quaternionic matrix and $\lambda$ is an eigenvalue of $A$, then whenever $q$ is a non-zero quaternion, the value $q^{-1}\lambda q$ is also an eigenvalue of $A$.

Proof: If $\lambda$ is an eigenvalue of $A$, then there is some non-zero quaternionic vector $v$ with $Av=v\lambda$. Therefore, set $w=vq$. Then $Aw=Avq=v\lambda q=vq(q^{-1}\lambda q)=w(q^{-1}\lambda q)$, so $q^{-1}\lambda q$ is an eigenvalue of the quaternionic matrix $A$.

Let $S^{3}$ denote the group of all quaternions $q$ with $|q|=1$. Let $SU(2)$ denote the group of all $2\times 2$ complex unitary matrices with determinant $1$. Let $SO(3)$ denote the group of all $3\times 3$ real orthogonal matrices with determinant $1$. Then the mapping $q\mapsto\Gamma(q)$ is an isomorphism from $S^{3}$ to $SU(2)$. Let $V$ be the vector subspace of $\mathbb{H}$ consisting of all elements of the form $bi+cj+dk$ where $b,c,d$ are real numbers. Then $V$ is an inner product space where the inner product is the standard dot product operation on $V\cong\mathbb{R}^{3}$ (where $\langle x,y\rangle=\operatorname{Re}(x\overline{y})$ for $x,y\in V$). Then we can identify $SO(3)$ with the group of orientation-preserving linear transformations of $V$ that preserve this inner product. We may now define a group homomorphism $\phi:S^{3}\rightarrow SO(3)$ by letting $\phi(q)(x)=qxq^{-1}$ for $x\in V$. Then the homomorphism $\phi$ is a surjective homomorphism with kernel $\{1,-1\}$. In fact, $SO(3)$ is a connected Lie group with $\pi_{1}(SO(3))\cong\mathbb{Z}/2\mathbb{Z}$, and the group $S^{3}\cong SU(2)$ is simply connected; $S^{3}$ is the universal cover of the Lie group $SO(3)$. From these facts about quaternions, we may make a few observations:

Observation: If $x,y$ are quaternions, then there is some quaternion $q$ with $|q|=1$ and $y=qxq^{-1}$ if and only if $\operatorname{Re}(x)=\operatorname{Re}(y)$ and $|x|=|y|$.

Observation: Suppose that $A$ is a square quaternionic matrix and $\lambda,\mu$ are quaternions with $\operatorname{Re}(\lambda)=\operatorname{Re}(\mu)$ and $|\lambda|=|\mu|$. Then $\lambda$ is an eigenvalue of $A$ if and only if $\mu$ is an eigenvalue of $A$.

Observation: Let $A$ be a square quaternionic matrix. Suppose that $v$ is a quaternionic vector, and $v_{1},v_{2}$ are complex vectors with $v=v_{1}+v_{2}j$. Suppose furthermore that $\lambda$ is a complex number. Then $Av=v\lambda$ if and only if $\Gamma(A)\begin{pmatrix}v_{1}\\ -\overline{v_{2}}\end{pmatrix}=\begin{pmatrix}v_{1}\\ -\overline{v_{2}}\end{pmatrix}\lambda$. In particular, a complex number $\lambda$ is an eigenvalue of $A$ if and only if $\lambda$ is an eigenvalue of $\Gamma(A)$.

Observation: Let $A$ be a square quaternionic matrix. Then the following quantities are equal:

  1. $\lim_{m\rightarrow\infty}\|A^{m}\|^{1/m}$ (for any matrix norm $\|\cdot\|$).
  2. $\max\{|\lambda|:\lambda\text{ is an eigenvalue of }A\}$.
  3. The spectral radius of the complex matrix $\Gamma(A)$.

We shall therefore define the spectral radius of a square quaternionic matrix $A$ to be $\rho(A)=\max\{|\lambda|:\lambda\text{ is an eigenvalue of }A\}$ (or any of the quantities in the above observation).
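As a sanity check on this definition, here is a small sketch (the pair representation and helper names follow the earlier sketch and are mine; they are redefined so the block is self-contained) comparing the eigenvalue-based spectral radius of a random quaternionic matrix, computed through $\Gamma(A)$, with the limit $\lim_{m\to\infty}\|A^{m}\|^{1/m}$.

```python
import numpy as np

def gamma(A1, A2):
    return np.block([[A1, A2], [-A2.conj(), A1.conj()]])

def qmatmul(A, B):
    A1, A2 = A
    B1, B2 = B
    return (A1 @ B1 - A2 @ B2.conj(), A1 @ B2 + A2 @ B1.conj())

def qfrob(A):
    """Frobenius norm of A = A1 + A2*j."""
    return np.sqrt(np.linalg.norm(A[0])**2 + np.linalg.norm(A[1])**2)

n = 3
A = (np.random.randn(n, n) + 1j * np.random.randn(n, n),
     np.random.randn(n, n) + 1j * np.random.randn(n, n))

rho_eig = np.max(np.abs(np.linalg.eigvals(gamma(*A))))  # max |lambda| over eigenvalues of Gamma(A)

# Estimate lim ||A^m||^(1/m) by taking a large power of A, rescaling at each step
# to avoid overflow and tracking the accumulated norm on a log scale.
P, m, log_norm = A, 1, 0.0
for _ in range(300):
    P = qmatmul(P, A)
    m += 1
    s = qfrob(P)
    P = (P[0] / s, P[1] / s)
    log_norm += np.log(s)
rho_lim = np.exp(log_norm / m)

print(rho_eig, rho_lim)  # the two estimates should be close
```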

If $A$ is a quaternionic matrix, then we say that $A$ is Hermitian, normal, or unitary respectively if $\Gamma(A)$ is Hermitian, normal, or unitary. If $A$ is an $n\times n$ quaternionic matrix, then $A$ is Hermitian iff $A=A^{*}$, $A$ is normal iff $AA^{*}=A^{*}A$, and $A$ is unitary iff $AA^{*}=A^{*}A=I_{n}$.

If $u=(u_{1},\dots,u_{n})^{T},v=(v_{1},\dots,v_{n})^{T}$ are quaternionic column vectors, then define the quaternionic inner product $\langle u,v\rangle$ of $u,v$ by $\langle u,v\rangle=u^{*}v=\sum_{r=1}^{n}\overline{u_{r}}v_{r}$, and define the norm $\|v\|=\sqrt{\langle v,v\rangle}=\sqrt{\sum_{r=1}^{n}|v_{r}|^{2}}$. We observe that the quaternionic inner product is real bilinear and conjugate symmetric ($\langle u,v\rangle=\overline{\langle v,u\rangle}$). The quaternionic inner product preserves right scalar multiplication ($\langle u,v\lambda\rangle=\langle u,v\rangle\lambda$ and $\langle u\lambda,v\rangle=\overline{\lambda}\langle u,v\rangle$ for $\lambda\in\mathbb{H}$). We define the real inner product of two quaternionic vectors by setting $\langle u,v\rangle_{\mathbb{R}}=\operatorname{Re}\langle u,v\rangle$. We may recover the quaternionic inner product from the real-valued inner product since $\langle u,v\rangle=\langle u,v\rangle_{\mathbb{R}}-\langle u,vi\rangle_{\mathbb{R}}\,i-\langle u,vj\rangle_{\mathbb{R}}\,j-\langle u,vk\rangle_{\mathbb{R}}\,k$ for quaternionic vectors $u,v$.

Observation: If $u,v$ are quaternionic vectors, then $\langle u\lambda,v\lambda\rangle=\overline{\lambda}\langle u,v\rangle\lambda$ and $\|v\lambda\|=\|v\|\cdot|\lambda|$ for $\lambda\in\mathbb{H}$.

Observation: Suppose $u,v$ are quaternionic vectors. Then $|\langle u,v\rangle|\leq\|u\|\cdot\|v\|$.

Observation: Suppose $u,v$ are non-zero quaternionic vectors. If $|\langle u,v\rangle|=\|u\|\cdot\|v\|$, then $v=u\lambda$ for some quaternion $\lambda$.

Counterexample: In general, if $\lambda$ is a non-zero quaternion, and $v$ is a non-zero quaternionic vector, then $\langle v,\lambda v\rangle\neq\lambda\langle v,v\rangle$; the quaternionic inner product does not interact well with left scalar multiplication.

Observation: Let $A$ be an $m\times n$ quaternionic matrix, and let $B$ be an $n\times m$ quaternionic matrix. Then the following are equivalent:

  1. $\langle Au,v\rangle=\langle u,Bv\rangle$ for all quaternionic vectors $u\in\mathbb{H}^{n},v\in\mathbb{H}^{m}$.
  2. $\langle Au,v\rangle_{\mathbb{R}}=\langle u,Bv\rangle_{\mathbb{R}}$ for all quaternionic vectors $u\in\mathbb{H}^{n},v\in\mathbb{H}^{m}$.

Proposition: Let $A$ be a quaternionic matrix. Then the following are equivalent:

  1. $\langle Au,Av\rangle=\langle u,v\rangle$ for all quaternionic vectors $u,v$.
  2. $\langle Au,Av\rangle_{\mathbb{R}}=\langle u,v\rangle_{\mathbb{R}}$ for all quaternionic vectors $u,v$.
  3. $\|Au\|=\|u\|$ for all quaternionic vectors $u$.
  4. The mapping $u\mapsto Au$ is an isometry.

If $A$ is a square quaternionic matrix, then the above statements are all equivalent to the following:

  5. $A$ is unitary.

Theorem: (quaternionic singular value decomposition) If $A$ is an $m\times n$ quaternionic matrix, then there exist quaternionic unitary matrices $U,V$ where $A=U\Sigma V^{*}$ and $\Sigma$ is an $m\times n$ diagonal matrix with non-negative, non-increasing real diagonal entries.

In the above theorem, the diagonal entries of $\Sigma$ are called the singular values of $A$, and they are denoted by $\sigma_{1}(A)\geq\sigma_{2}(A)\geq\dots\geq\sigma_{\min(m,n)}(A)$.

If $A$ is a real, complex, or quaternionic matrix, and $1\leq p\leq\infty$, then define the Schatten $p$-norm of $A$ to be $\|A\|_{p}=\big(\sum_{k}\sigma_{k}(A)^{p}\big)^{1/p}$ for $1\leq p<\infty$, and define $\|A\|_{\infty}=\max_{k}\sigma_{k}(A)=\sigma_{1}(A)$ for $p=\infty$.

Observation: If $A$ is a quaternionic matrix with singular values $\sigma_{1}\geq\dots\geq\sigma_{r}$, then $\Gamma(A)$ has singular values $\sigma_{1},\sigma_{1},\sigma_{2},\sigma_{2},\dots,\sigma_{r},\sigma_{r}$. In particular, $\|\Gamma(A)\|_{\infty}=\|A\|_{\infty}$ and $\|\Gamma(A)\|_{2}=\sqrt{2}\,\|A\|_{2}$, so $\|\Gamma(A)\|_{p}=2^{1/p}\|A\|_{p}$ whenever $1\leq p<\infty$.

Observation: If $A$ is a square quaternionic matrix, then $\rho(A)\leq\|A\|_{\infty}\leq\|A\|_{p}$ for all $1\leq p\leq\infty$.
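Here is a small numerical sketch of the two observations above (with my own helper names): the singular values of a quaternionic matrix can be read off from the SVD of $\Gamma(A)$, where each singular value appears twice, and the resulting Schatten norms dominate the spectral radius.

```python
import numpy as np

def gamma(A1, A2):
    return np.block([[A1, A2], [-A2.conj(), A1.conj()]])

n = 4
A = (np.random.randn(n, n) + 1j * np.random.randn(n, n),
     np.random.randn(n, n) + 1j * np.random.randn(n, n))

sv_gamma = np.linalg.svd(gamma(*A), compute_uv=False)   # 2n singular values, in equal pairs
sigma = sv_gamma[::2]                                    # keep one copy from each pair

def schatten(sigma, p):
    """Schatten p-norm computed from the singular values."""
    return float(np.max(sigma)) if np.isinf(p) else float(np.sum(sigma**p) ** (1.0 / p))

rho = np.max(np.abs(np.linalg.eigvals(gamma(*A))))       # spectral radius of A
for p in [1, 2, 4, np.inf]:
    assert rho <= schatten(sigma, np.inf) + 1e-9                 # rho(A) <= ||A||_infty
    assert schatten(sigma, np.inf) <= schatten(sigma, p) + 1e-9  # ||A||_infty <= ||A||_p
```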

If $A=(a_{r,s})_{r,s}$ is an $n\times n$ quaternionic matrix, then define the quaternionic trace of $A$ as

$$\operatorname{Tr}(A)=\sum_{r=1}^{n}a_{r,r}.$$

In general, $\operatorname{Tr}(AB)\neq\operatorname{Tr}(BA)$, so the notion of the quaternionic trace is deficient.

If $A$ is an $n\times n$ quaternionic matrix, then define the real trace of $A$ as $\operatorname{Tr}_{\mathbb{R}}(A)=\operatorname{Re}(\operatorname{Tr}(A))$. We observe that $\operatorname{Tr}_{\mathbb{R}}(AB)=\operatorname{Tr}_{\mathbb{R}}(BA)$ and $\operatorname{Tr}_{\mathbb{R}}(A^{*})=\operatorname{Tr}_{\mathbb{R}}(A)$ whenever $A,B$ are quaternionic matrices for which the operations are defined.

Observe that we can recover $\operatorname{Tr}$ from $\operatorname{Tr}_{\mathbb{R}}$ by using the formula $\operatorname{Tr}(A)=\operatorname{Tr}_{\mathbb{R}}(A)-\operatorname{Tr}_{\mathbb{R}}(Ai)\,i-\operatorname{Tr}_{\mathbb{R}}(Aj)\,j-\operatorname{Tr}_{\mathbb{R}}(Ak)\,k$.
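The following sketch (helper names are mine) checks numerically that the real trace is cyclic, $\operatorname{Tr}_{\mathbb{R}}(AB)=\operatorname{Tr}_{\mathbb{R}}(BA)$, while the naive quaternionic trace generally is not.

```python
import numpy as np

def qmatmul(A, B):
    A1, A2 = A
    B1, B2 = B
    return (A1 @ B1 - A2 @ B2.conj(), A1 @ B2 + A2 @ B1.conj())

def qtrace(A):
    """Quaternionic trace of A = A1 + A2*j, returned as the pair (complex part, j part)."""
    return (np.trace(A[0]), np.trace(A[1]))

def real_trace(A):
    """Real trace Tr_R(A) = Re(Tr(A))."""
    return np.trace(A[0]).real

def rand_qmat(n):
    return (np.random.randn(n, n) + 1j * np.random.randn(n, n),
            np.random.randn(n, n) + 1j * np.random.randn(n, n))

A, B = rand_qmat(3), rand_qmat(3)
assert np.isclose(real_trace(qmatmul(A, B)), real_trace(qmatmul(B, A)))  # Tr_R is cyclic
print(qtrace(qmatmul(A, B)))  # the full quaternionic traces of AB and BA ...
print(qtrace(qmatmul(B, A)))  # ... share a real part but generally differ otherwise
```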

If $A,B$ are $m\times n$ quaternionic matrices, then define the real-valued Frobenius inner product $\langle A,B\rangle_{\mathbb{R}}$ as $\langle A,B\rangle_{\mathbb{R}}=\operatorname{Tr}_{\mathbb{R}}(A^{*}B)$.

If $A=A_{1}+A_{2}i+A_{3}j+A_{4}k$ and $B=B_{1}+B_{2}i+B_{3}j+B_{4}k$ where every $A_{t},B_{t}$ is real, then

$$\langle A,B\rangle_{\mathbb{R}}=\sum_{t=1}^{4}\operatorname{Tr}(A_{t}^{T}B_{t}).$$

Define the Frobenius norm $\|A\|_{F}=\sqrt{\langle A,A\rangle_{\mathbb{R}}}$.

We observe that if $U,V$ are unitary, then

$$\langle UAV,UBV\rangle_{\mathbb{R}}=\operatorname{Tr}_{\mathbb{R}}(V^{*}A^{*}U^{*}UBV)=\operatorname{Tr}_{\mathbb{R}}(V^{*}A^{*}BV)=\operatorname{Tr}_{\mathbb{R}}(A^{*}BVV^{*})=\langle A,B\rangle_{\mathbb{R}}.$$

In particular, if $A=U\Sigma V^{*}$ where $U,V$ are unitary and $\Sigma$ is diagonal with diagonal entries $\sigma_{1},\dots,\sigma_{r}$, then $\|A\|_{F}^{2}=\langle A,A\rangle_{\mathbb{R}}=\langle\Sigma,\Sigma\rangle_{\mathbb{R}}=\sigma_{1}^{2}+\dots+\sigma_{r}^{2}$, so the Frobenius norm $\|A\|_{F}$ coincides with the Schatten $2$-norm $\|A\|_{2}$.
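A quick numerical check of the formulas above (the helper names are mine): the real Frobenius inner product $\langle A,B\rangle_{\mathbb{R}}=\operatorname{Tr}_{\mathbb{R}}(A^{*}B)$ agrees with the componentwise formula, and $\|A\|_{F}^{2}$ equals the sum of the squared singular values.

```python
import numpy as np

def gamma(A1, A2):
    return np.block([[A1, A2], [-A2.conj(), A1.conj()]])

def qmatmul(A, B):
    A1, A2 = A
    B1, B2 = B
    return (A1 @ B1 - A2 @ B2.conj(), A1 @ B2 + A2 @ B1.conj())

def qadjoint(A):
    A1, A2 = A
    return (A1.conj().T, -A2.T)

def frob_inner(A, B):
    """Real Frobenius inner product Tr_R(A^* B)."""
    return np.trace(qmatmul(qadjoint(A), B)[0]).real

def components(A):
    """Real matrices X1, X2, X3, X4 with A = X1 + X2 i + X3 j + X4 k."""
    A1, A2 = A
    return [A1.real, A1.imag, A2.real, A2.imag]

def rand_qmat(n):
    return (np.random.randn(n, n) + 1j * np.random.randn(n, n),
            np.random.randn(n, n) + 1j * np.random.randn(n, n))

A, B = rand_qmat(3), rand_qmat(3)
componentwise = sum(np.sum(X * Y) for X, Y in zip(components(A), components(B)))
assert np.isclose(frob_inner(A, B), componentwise)

sigma = np.linalg.svd(gamma(*A), compute_uv=False)[::2]   # singular values of A
assert np.isclose(frob_inner(A, A), np.sum(sigma**2))     # ||A||_F^2 = sum of sigma_k^2
```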

MPO word embeddings

Let $K$ denote either the field of real numbers, the field of complex numbers, or the division ring of quaternions. Let $A$ be a finite set (in our case, $A$ will denote the set of all tokens), and let $d\geq 1$. Let $D_{A,d,K}$ be the set of all functions $f:A\rightarrow M_{d}(K)$ such that $\sum_{a\in A}f(a)^{*}f(a)=\frac{|A|}{d}I_{d}$. Observe that $D_{A,d,K}$ is compact. Given a string of tokens $a_{1}\dots a_{n}$ (the corpus), define a function $F:D_{A,d,K}\rightarrow[0,\infty)$ by letting $F(f)=\rho(f(a_{1})f(a_{2})\cdots f(a_{n}))$. We say that $f$ is an MPO word pre-embedding for the string of tokens $a_{1}\dots a_{n}$ if the quantity $F(f)$ is locally maximized on $D_{A,d,K}$. To maximize $F(f)$, the function $f$ must simultaneously satisfy two properties. Since $\sum_{a\in A}f(a)^{*}f(a)=\frac{|A|}{d}I_{d}$, the matrices $f(a)$ must be spread out throughout all $d$ dimensions. But we also need the matrices $f(a_{i})$ to be compatible with $f(a_{i-1})$ and $f(a_{i+1})$ and, more generally, with $f(a_{j})$ for $j$ near $i$.
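To make the definition concrete, here is a minimal, illustrative sketch in the real case $K=\mathbb{R}$. It assumes the normalization $\sum_{a\in A}f(a)^{T}f(a)=\frac{|A|}{d}I_{d}$ stated above, maximizes $\rho(f(a_{1})\cdots f(a_{n}))$ by naive finite-difference gradient ascent, and projects back onto the constraint set after each step via a polar decomposition. The tokens, corpus, step size, and the optimization procedure itself are hypothetical choices of mine, not the author's training procedure.

```python
import numpy as np

d = 2
tokens = ["the", "cat", "sat", "down"]                    # the finite set A
corpus = ["the", "cat", "sat", "down", "the", "cat"]      # the string of tokens a_1 ... a_n
m = len(tokens)

def project(f):
    """Project onto {f : sum_a f(a)^T f(a) = (|A|/d) I_d} using a polar decomposition."""
    F = np.vstack([f[t] for t in tokens])                 # stack the f(a) into an (|A| d) x d matrix
    U, _, Vt = np.linalg.svd(F, full_matrices=False)
    F = np.sqrt(m / d) * U @ Vt
    return {t: F[i*d:(i+1)*d] for i, t in enumerate(tokens)}

def fitness(f):
    """rho(f(a_1) f(a_2) ... f(a_n))."""
    P = np.eye(d)
    for t in corpus:
        P = P @ f[t]
    return np.max(np.abs(np.linalg.eigvals(P)))

f = project({t: np.random.randn(d, d) for t in tokens})
eps, lr = 1e-6, 0.1
for _ in range(2000):                                     # crude finite-difference ascent
    base = fitness(f)
    grad = {}
    for t in tokens:
        g = np.zeros((d, d))
        for i in range(d):
            for j in range(d):
                fp = {s: f[s].copy() for s in tokens}
                fp[t][i, j] += eps
                g[i, j] = (fitness(fp) - base) / eps
        grad[t] = g
    f = project({t: f[t] + lr * grad[t] for t in tokens})

print("fitness:", fitness(f))
```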

We say that a collection of vectors $u_{1},\dots,u_{n}\in K^{d}$ is a tight frame if $\sum_{s=1}^{n}u_{s}u_{s}^{*}=cI_{d}$ for some (necessarily positive) constant $c$. We say that a tight frame $u_{1},\dots,u_{n}$ is an equal norm tight frame if $\|u_{1}\|=\dots=\|u_{n}\|$.

Theorem: Suppose that $n\geq d$. Let $K$ denote either the field of real numbers or complex numbers. Let $S$ denote the unit sphere in $K^{d}$. Then the local minimizers of the frame potential defined by $\operatorname{FP}(u_{1},\dots,u_{n})=\sum_{s=1}^{n}\sum_{t=1}^{n}|\langle u_{s},u_{t}\rangle|^{2}$ for $u_{1},\dots,u_{n}\in S$ are global minimizers which are tight frames for $K^{d}$. In particular, there exists an equal norm tight frame $u_{1},\dots,u_{n}$ with $\|u_{1}\|=\dots=\|u_{n}\|=1$ whenever $n\geq d$.

See [2, Thm 6.9] for a proof.
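Here is a small sketch of the result just cited in the real case: projected gradient descent on the frame potential over unit vectors, followed by a check that the frame operator $\sum_{s}u_{s}u_{s}^{T}$ is close to $\frac{n}{d}I_{d}$. The dimensions, step size, and iteration count are arbitrary choices of mine.

```python
import numpy as np

d, n = 3, 7
U = np.random.randn(n, d)
U /= np.linalg.norm(U, axis=1, keepdims=True)      # n points on the unit sphere in R^d

lr = 0.005
for _ in range(20000):
    G = U @ U.T                                    # Gram matrix of the <u_s, u_t>
    grad = 4 * G @ U                               # Euclidean gradient of sum_{s,t} <u_s, u_t>^2
    U -= lr * grad
    U /= np.linalg.norm(U, axis=1, keepdims=True)  # project back onto the unit sphere

frame_operator = U.T @ U                           # sum_s u_s u_s^T
print(np.round(frame_operator, 3))                 # should be close to (n/d) * I_d
```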

Theorem: Suppose that $n\geq d$ and that the tokens $a_{1},\dots,a_{n}$ are distinct, and let $A=\{a_{1},\dots,a_{n}\}$. Suppose that $K$ is the field of real numbers, the field of complex numbers, or the division ring of quaternions. Let $f\in D_{A,d,K}$. Then $\rho(f(a_{1})\cdots f(a_{n}))\leq 1$, and $\rho(f(a_{1})\cdots f(a_{n}))=1$ if and only if there is an equal norm tight frame $u_{1},\dots,u_{n}\in K^{d}$ with $\|u_{1}\|=\dots=\|u_{n}\|=1$ and $\lambda_{1},\dots,\lambda_{n}\in K$ with $|\lambda_{1}|=\dots=|\lambda_{n}|=1$ where $f(a_{s})=u_{s}\lambda_{s}u_{s+1}^{*}$ for all $s$ (where addition in the subscripts is taken modulo $n$).

Proof: In order to not repeat ourselves, our proof shall apply to all three cases $K=\mathbb{R},\mathbb{C},\mathbb{H}$, but for accessibility purposes, the proof shall be understandable to those who are only familiar with real and/or complex matrices.

Suppose that $u_{1},\dots,u_{n}$ is an equal norm tight frame with $\|u_{1}\|=\dots=\|u_{n}\|=1$, and $\lambda_{1},\dots,\lambda_{n}$ are elements of $K$ with $|\lambda_{s}|=1$ and $f(a_{s})=u_{s}\lambda_{s}u_{s+1}^{*}$ for all $s$. Since $\sum_{s=1}^{n}u_{s}u_{s}^{*}=cI_{d}$ for some constant $c$ and $\sum_{s=1}^{n}\|u_{s}\|^{2}=n$, taking real traces gives $c=\frac{n}{d}$. Therefore,

$$\sum_{s=1}^{n}f(a_{s})^{*}f(a_{s})=\sum_{s=1}^{n}u_{s+1}\overline{\lambda_{s}}u_{s}^{*}u_{s}\lambda_{s}u_{s+1}^{*}=\sum_{s=1}^{n}u_{s+1}u_{s+1}^{*}=\sum_{s=1}^{n}u_{s}u_{s}^{*}=\frac{n}{d}I_{d}.$$

Therefore, $f\in D_{A,d,K}$.

Now,

$$f(a_{1})f(a_{2})\cdots f(a_{n})=u_{1}\lambda_{1}u_{2}^{*}u_{2}\lambda_{2}u_{3}^{*}\cdots u_{n}\lambda_{n}u_{1}^{*}=u_{1}\lambda_{1}\lambda_{2}\cdots\lambda_{n}u_{1}^{*},$$

which is a rank-one matrix with $(f(a_{1})\cdots f(a_{n}))u_{1}=u_{1}(\lambda_{1}\cdots\lambda_{n})$. Therefore, we have

$$\rho(f(a_{1})\cdots f(a_{n}))=|\lambda_{1}\cdots\lambda_{n}|\cdot\|u_{1}\|^{2}=1.$$

Suppose now that $f\in D_{A,d,K}$. By the arithmetic-geometric mean inequality, we have

$$\rho(f(a_{1})\cdots f(a_{n}))\leq\|f(a_{1})\cdots f(a_{n})\|_{\infty}\leq\prod_{s=1}^{n}\|f(a_{s})\|_{\infty}\leq\prod_{s=1}^{n}\|f(a_{s})\|_{2}\leq\Big(\frac{1}{n}\sum_{s=1}^{n}\|f(a_{s})\|_{2}^{2}\Big)^{n/2}=1,$$

since the Schatten $\infty$-norm is submultiplicative and $\sum_{s=1}^{n}\|f(a_{s})\|_{2}^{2}=\operatorname{Tr}_{\mathbb{R}}\big(\sum_{s=1}^{n}f(a_{s})^{*}f(a_{s})\big)=\operatorname{Tr}_{\mathbb{R}}\big(\frac{n}{d}I_{d}\big)=n$.

Suppose now that $\rho(f(a_{1})\cdots f(a_{n}))=1$, so that every inequality in the above chain is an equality. We observe that $\|f(a_{s})\|_{\infty}=\|f(a_{s})\|_{2}$ precisely when $f(a_{s})$ has rank at most $1$. From the arithmetic-geometric mean inequality, we have $\prod_{s=1}^{n}\|f(a_{s})\|_{2}=\big(\frac{1}{n}\sum_{s=1}^{n}\|f(a_{s})\|_{2}^{2}\big)^{n/2}$ precisely when $\|f(a_{1})\|_{2}=\dots=\|f(a_{n})\|_{2}=1$. Therefore, each $f(a_{s})$ has rank at most $1$ and $\sigma_{1}(f(a_{s}))=\|f(a_{s})\|_{2}=1$ for all $s$, so we can set $f(a_{s})=u_{s}v_{s}^{*}$ where $\|u_{s}\|=\|v_{s}\|=1$ for all $s$.

Observe that $f(a_{1})\cdots f(a_{n})=u_{1}(v_{1}^{*}u_{2})(v_{2}^{*}u_{3})\cdots(v_{n-1}^{*}u_{n})v_{n}^{*}$. Therefore, since the spectral radius of a rank-one matrix $xy^{*}$ is $|y^{*}x|$ (as $(xy^{*})x=x(y^{*}x)$), we have

$$1=\rho(f(a_{1})\cdots f(a_{n}))=|v_{n}^{*}u_{1}|\cdot|v_{1}^{*}u_{2}|\cdots|v_{n-1}^{*}u_{n}|=\prod_{s=1}^{n}|\langle v_{s},u_{s+1}\rangle|,$$

so, since $|\langle v_{s},u_{s+1}\rangle|\leq\|v_{s}\|\cdot\|u_{s+1}\|=1$ by the Cauchy-Schwarz inequality,

$$|\langle v_{s},u_{s+1}\rangle|=\|v_{s}\|\cdot\|u_{s+1}\|$$

for all $s$ (where addition in the subscripts is taken modulo $n$).

Therefore, there are elements $\lambda_{1},\dots,\lambda_{n}\in K$ with $|\lambda_{s}|=1$ and where for all $s$ we have $u_{s+1}=v_{s}\lambda_{s}$ and hence $f(a_{s})=u_{s}v_{s}^{*}=u_{s}\lambda_{s}u_{s+1}^{*}$.

We observe that

$$\frac{n}{d}I_{d}=\sum_{s=1}^{n}f(a_{s})^{*}f(a_{s})=\sum_{s=1}^{n}u_{s+1}\overline{\lambda_{s}}u_{s}^{*}u_{s}\lambda_{s}u_{s+1}^{*}=\sum_{s=1}^{n}u_{s+1}u_{s+1}^{*}=\sum_{s=1}^{n}u_{s}u_{s}^{*}.$$

Therefore, $u_{1},\dots,u_{n}$ is an equal norm tight frame. $\square$
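The equality case of the theorem can be checked numerically. The sketch below (complex case, with a particular unit-norm tight frame of my choosing) builds $f(a_{s})=u_{s}\lambda_{s}u_{s+1}^{*}$ from equally spaced unit vectors in $\mathbb{C}^{2}$ and verifies both the constraint $\sum_{s}f(a_{s})^{*}f(a_{s})=\frac{n}{d}I_{d}$ and the maximal fitness $\rho(f(a_{1})\cdots f(a_{n}))=1$, under the normalization used above.

```python
import numpy as np

d, n = 2, 5
angles = np.pi * np.arange(n) / n                         # equally spaced directions
U = np.stack([np.cos(angles), np.sin(angles)], axis=1).astype(complex)  # rows u_s form a unit-norm tight frame for C^2

lam = np.exp(2j * np.pi * np.random.rand(n))              # unit scalars lambda_s
f = [lam[s] * np.outer(U[s], U[(s + 1) % n].conj()) for s in range(n)]  # f(a_s) = u_s lambda_s u_{s+1}^*

constraint = sum(M.conj().T @ M for M in f)
assert np.allclose(constraint, (n / d) * np.eye(d))       # f satisfies the constraint

P = np.eye(d, dtype=complex)
for M in f:
    P = P @ M
print(np.max(np.abs(np.linalg.eigvals(P))))               # spectral radius of the product: 1.0
```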

Some thoughts:

It seems easier to prove theorems about MPO word pre-embeddings than it is to prove theorems about other machine learning models simply because MPO word pre-embeddings are more mathematical in nature. Of course, neural networks with ReLU activation are mathematical too, so we can prove theorems about them, but MPO word pre-embeddings are more like the objects that mathematicians like to investigate. And it seems easier to mathematically prove theorems that interpret MPO word pre-embeddings than it is to prove theorems that interpret other machine learning models. On the other hand, we are making a tradeoff here. MPO word pre-embeddings behave more mathematically, but word embeddings are simply the first layer in natural language processing.

Why use the spectral radius?

The matrix $f(a_{1})\cdots f(a_{n})$ is in general approximately a rank-$1$ matrix. This means that $f(a_{1})\cdots f(a_{n})\approx\sigma uv^{*}$ where $\sigma=\|f(a_{1})\cdots f(a_{n})\|_{\infty}$ and $u,v$ are suitably chosen unit vectors. One may be tempted to define the fitness of $f$ as $\|f(a_{1})\cdots f(a_{n})\|_{2}$ or $\|f(a_{1})\cdots f(a_{n})\|_{\infty}$. But mathematical objects are better behaved when taking limits. Instead of considering the string $a_{1}\dots a_{n}$ as our corpus, we may consider the similar corpus consisting of $N$ copies of $a_{1}\dots a_{n}$ concatenated together as $N\rightarrow\infty$, and in this case, the fitness of $f$ would be $\|(f(a_{1})\cdots f(a_{n}))^{N}\|_{2}^{1/N}$ or $\|(f(a_{1})\cdots f(a_{n}))^{N}\|_{\infty}^{1/N}$, which both converge to $\rho(f(a_{1})\cdots f(a_{n}))$ as $N\rightarrow\infty$. The use of the spectral radius simplifies and improves the behavior of the machine learning model for the same reason that integrals such as $\int_{0}^{\infty}f(x)\,dx$ behave better than sums such as $\sum_{k=0}^{m}f(k)$.
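A tiny numerical illustration of the limit used in this argument: for a random real matrix $M$, both $\|M^{N}\|_{2}^{1/N}$ and $\|M^{N}\|_{\infty}^{1/N}$ approach the spectral radius $\rho(M)$ as $N$ grows.

```python
import numpy as np

M = np.random.randn(4, 4)
rho = np.max(np.abs(np.linalg.eigvals(M)))

for N in [1, 5, 20, 80]:
    P = np.linalg.matrix_power(M, N)
    frobenius = np.linalg.norm(P, 'fro') ** (1.0 / N)   # ||M^N||_2^(1/N)
    operator = np.linalg.norm(P, 2) ** (1.0 / N)        # ||M^N||_infty^(1/N)
    print(N, frobenius, operator, rho)
```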

Why complex numbers and quaternions?

While it takes more time to compute MPO word pre-embeddings for the complex numbers and for the quaternions, the complex and quaternionic MPO word pre-embeddings have some advantages over real MPO word pre-embeddings. It is currently unclear as to whether the advantages of complex and quaternionic MPO word pre-embeddings outweigh the cost of the increased computational complexity of training and using complex or quaternionic word pre-embeddings. Further research is needed on this topic.

Complex and quaternionic matrices provide new ways of testing MPO word pre-embeddings which are not available if we simply used real matrices. For example, if $f$ is a complex or quaternionic MPO word pre-embedding, then the existence of a unitary matrix $U$ where $Uf(a)U^{*}$ is a real matrix for every token $a$ should be considered evidence that the MPO word pre-embedding $f$ is a high quality machine learning model, while the non-existence of such a matrix $U$ should be considered evidence against this.

A complex or quaternionic MPO word pre-embedding  will typically have a higher fitness level than a corresponding real MPO word pre-embedding.

Sometimes, when one trains an MPO word pre-embedding twice to obtain two MPO word pre-embeddings $f,g$, the fitness levels are equal ($F(f)=F(g)$), and it is desirable to have equal fitness levels when training the word pre-embedding multiple times. But the probability of obtaining the equality $F(f)=F(g)$ will depend on the choice of $K$. In many cases, we would have

$$P_{\mathbb{R}}(F(f)=F(g))<P_{\mathbb{C}}(F(f)=F(g))<P_{\mathbb{H}}(F(f)=F(g))$$

for the three choices of $K$. Of course, this probability depends on other factors besides the choice of division ring $K$, such as the initialization. For example, if the matrices of the form $f(a)$ are initialized to have random positive entries, then the probability $P(F(f)=F(g))$ would be much greater than if the matrices of the form $f(a)$ are initialized to have random real entries; if each $f(a)$ initially has random positive entries, then I would currently consider the real-valued MPO pre-embeddings to be about as good as the complex-valued MPO pre-embeddings but easier to compute.

The fitness function $F$ is not differentiable everywhere, and the gradient of $F$ has singularities. The singularities of the gradient of $F$ have real codimension $\dim_{\mathbb{R}}(K)$. In the case when $K=\mathbb{R}$, the singularities of the gradient of $F$ disconnect the domain $D_{A,d,K}$, and gradient ascent has difficulty crossing these singularities to reach a good local maximum.

Disadvantages of MPO word pre-embeddings:

  1. Projectivity: If $R$ is a ring, then the center of $R$ is the collection $Z(R)$ of all elements $c\in R$ where $cr=rc$ for all $r\in R$. It is easy to show that $Z(R)$ is always a subring of $R$. We observe that $Z(M_{d}(K))=\{cI_{d}:c\in Z(K)\}$. If $c_{a}\in Z(K)$ with $|c_{a}|=1$ for each $a\in A$, and $g(a)=c_{a}f(a)$ for each $a\in A$, then $f$ is an MPO word pre-embedding if and only if $g$ is an MPO word pre-embedding. This means that after one trains an MPO word pre-embedding $f$, one needs to find the constants $c_{a}$ where the function $g$ defined by $g(a)=c_{a}f(a)$ behaves optimally. I may talk about how we can find these constants in another post.
  2. Locality: An MPO word pre-embedding is good at relating tokens to their immediate surroundings in the following sense. Suppose $f$ is an MPO word pre-embedding for the string $a_{1}\dots a_{n}$. Then the value of $f(a)$ will be determined mostly by all the other values $f(b)$ for $b\in A\setminus\{a\}$ together with the multiset of all short substrings $a_{i-r}\dots a_{i+r}$ of the corpus such that $a_{i}=a$. In other words, $f$ can see the immediate surroundings of each occurrence of the token $a$, but $f$ will not be able to see much more than this.
  3. Difficulty utilizing all dimensions without noise: Sometimes MPO word pre-embeddings behave poorly because it will be difficult to maximize the spectral radius while we still have $\sum_{a\in A}f(a)^{*}f(a)=\frac{|A|}{d}I_{d}$. To ameliorate this problem, we can relax this requirement on $f$ to the weaker condition $\sum_{a\in A}\|f(a)\|_{2}^{2}=c$ for some constant $c>0$. But in this case, the MPO word pre-embedding will not use all of the dimensions in $K^{d}$ evenly.

Conclusion:

Spectral methods have already been quite valuable in machine learning for a good reason. I will continue to make more posts about using the spectral radius (and similar objects) to construct machine learning models. 

Another possible advantage of quaternions (added 9/8/2023):

It seems like an increase in the value of $d$ provides diminishing returns in the performance of MPO word embeddings, since when $d$ is large, MPO word embeddings have difficulty utilizing all $d$ dimensions equally. MPO word embeddings therefore can only give us some of the information about a token. However, it seems like increasing the index $\dim_{\mathbb{R}}(K)$ (which is $1,2,4$ for $K=\mathbb{R},\mathbb{C},\mathbb{H}$ respectively) increases the number of parameters of an MPO word embedding without the word embedding encountering substantially more difficulty in utilizing all $d$ dimensions. Therefore, MPO word embeddings should perform better simply by increasing the index $\dim_{\mathbb{R}}(K)$. This means that complex MPO word embeddings should perform better than real MPO word embeddings, and quaternionic MPO word embeddings should perform better than complex MPO word embeddings. On the other hand, we still need to perform experiments to determine whether complex MPO word embeddings really are that much better than real MPO word embeddings and whether quaternionic MPO word embeddings are better than both.

References:

  1. Fuzhen Zhang. Quaternions and matrices of quaternions. Linear Algebra and its Applications, 251 (1997), 21-57.
  2. Shayne F. D. Waldron. An Introduction to Finite Tight Frames. July 26, 2017.


 

Comments:

this does look like it might be interesting, but I think you might need to show - possibly visually - why this works. what does an embedding of a 5-word dataset look like on a chart? how does one interpret it on that chart? why do these expressions map to that chart, etc? that would allow introducing the math you're using to anyone who doesn't know it or is rusty, while reducing effort to follow it for those who don't know it. If you're only intending to communicate to those who already get the prereqs, which is a thing people do, then, well, post more posts like this and I'm sure someone who has the particular math background (quaternions?) will run into them eventually. it looks like the math is all here, just too dense for my current level of motivation, so maybe you just need the right eyes.

I personally can't evaluate your idea within the time I would allot to reading this post because you use a lot of expressions I'm not immediately familiar with and I don't see a way to shortcut through them in order to draw the conclusions about improved interpretability you're implying. but it does seem conceivable that it could be pretty cool. I can imagine why having explicitly disambiguated word senses would be useful, if you could get your embedding to be sturdy about them.

I appreciate your input.  I plan on making more posts like this one with a similar level of technical depth. Since I included a proof with this post, this post contained a bit more mathematics than usual. With that being said, others have stated that I should be aware of the mathematical prerequisites for posts like this, so I will keep the mathematical prerequisites in mind.

Here are some more technical thoughts about this.

  1. We would all agree that machine learning interpretability is quite a difficult problem; I believe that solving it requires not only better interpretability tools but also machine learning models that are themselves more inherently interpretable. MPO word embeddings and similar constructions carry a little bit (but not too much) of up-front difficulty since one needs to get used to different notions. For example, if we use neural networks with ReLU activation (or something like that), then one has less difficulty up front, but when it comes time to interpret such a network, the difficulty increases, since neural networks with ReLU activation do not seem to have the right interpretability properties, so I hesitate to interpret neural networks. And even if we do decide to interpret neural networks, the interpretability tools that we use may have a more complicated design than the networks themselves.
  2. There are some good reasons why complex numbers and quaternions have relatively little importance in machine learning. And these reasons do not apply to constructions like MPO word embeddings.
  3. Since equal norm tight frames are local minimizers of the frame potential, it would help to have a good understanding of the frame potential. For simplicity, it is a good idea to only look at the real case. The frame potential is a potential for a force between a collection of particles on the sphere $S^{d-1}$ where particles are repelled from each other (and from each other's antipodal point) and where the force tries to make all the particles orthogonal to each other. If $n\leq d$, then it is possible to make all of the particles orthogonal to each other, and in this case, when we minimize this potential, the equal norm tight frames will simply be orthonormal bases. In the case when $n>d$, we cannot make all of the particles orthogonal to each other, but we can try to get as close as possible. Observe that unlike the Newtonian and logarithmic potentials, the frame potential does not have a singularity when two particles overlap. I will leave it to you to take the gradient (at least in the real case) of the frame potential to see exactly what this force does to the particles.
  4. Training an MPO word embedding with the complex numbers or quaternions is actually easier in the sense that for real MPO word embeddings, one needs to use a proper initialization, but with complex and quaternionic MPO word embeddings, an improper initialization will only result in minor deficiencies in the MPO word embedding. This means that the quaternions and complex numbers are easier to work with for MPO word embeddings than the real numbers. In hindsight, the solution to the problem of real MPO word embeddings is obvious, but at the time, I thought that I had to use complex or quaternionic matrices.
  5. I like the idea of making animations, but even in the real case where things are easy to visualize, the equal norm tight frames are non-unique and they may involve many dimensions. The non-uniqueness will make it impossible to interpret the equal norm tight frames; for the same reason, it is hard to interpret what is happening with neural networks, since if you retrain a neural network with a different initialization or learning rate, you will end up with a different trained network, but MPO word embeddings have much stronger uniqueness properties that make them easier to interpret. I have made plenty of machine learning training animations and have posted these animations on YouTube and TikTok, but it seems like in most cases, the animation still needs to be accompanied by technical details; with just an animation, the viewers can see that something is happening with the machine learning model, but they need both the animation and the technical details to interpret what exactly is happening. I am afraid that most viewers just stick with the animations without going into so many technical details. I therefore try to make the animations more satisfying than informative most of the time.

Is the Latex compiling here?

I made the Latex compile by adding a space. Let me know if there are any problems.
