Adrian Xu’s MSc thesis contains an idea called "radial trace plots" which could be explored further. We care a lot about local geometry but in high dimensions it's not clear that there are meaningful ways to visualize it (the pictures one usually sees of loss landscapes are imo dumb / highly misleading). I think this isn't what you had in mind by "n dimensional data visualizations" but people are very visual, and communicating degeneracy by some actually meaningful visualizations might be something to look into if you’re interested in learning SLT. Daniel Murfet suggested this when I asked him for shovel ready devinterp projects, but I ended up not working on it.
Could you send a link to the "radial trace plots" work? I can't seem to find anything.
I mostly agree with you about the loss landscape issue. Those plots only show loss as a function over a 2D subspace, which is extremely limiting if you're trying to make sense of a space with thousands or millions of dimensions.
in high dimensions it's not clear that there are meaningful ways to visualize it
I think the question of whether it is possible to meaningfully visualize it is of prime importance to my work. My intuition is of course that it is possible. Very generally, my goal is to develop a "visual language" that serves as a memory aid, along with a set of tools for controlling "viewport rotation". Then, in the same way that a 3D object can be understood as 3D by rotating it in your hands, an m-dimensional subspace (m < n) with properties of interest can be found in a region of a distribution, and the m-dimensional distribution can be understood as m-dimensional by rotating it and recalling its rotation dynamics with the help of interaction and visual hints.
But this seems like a difficult thing to communicate just with words. As I mention in this post, it is inspired by the work of Mingwei Li, particularly Grand Tour and UMAP Tour. If you haven't glanced at those that might give you a much better intuition of the direction I'm thinking, but I have many additional ideas I need to document at some point, especially relating to a "visual language".
I had not previously heard about SLT, but it looks like it is very relevant to my interests, so thank you very much for mentioning it. I will definitely work it into my study plans somewhere.
This is the main overview page for my project "ndisp". I hope to keep this page up to date with a brief introduction, external links, and a more in depth description of details and future plans.
Introduction
This is a project to build interactive visualization tools and use them for developing Mechanistic Interpretability (MI) techniques and describing the insights gained with those techniques.
In MI there are two main subjects of analysis: the weights of the network, and the activations produced when an input is processed into an output[1]. Both can be understood as points living in vector spaces, but this creates a problem: those spaces are high dimensional, and humans are not good at understanding spaces with more than 3 dimensions.
The standard answer to this problem is to do analysis that doesn't rely on directly thinking in high dimensional spaces. I dislike this answer, both because of my natural inclination to reason geometrically and because, as Anscombe's quartet exemplifies, analytical understanding can miss important insights that become apparent with a higher bandwidth view of the distribution. So the approach I wish to focus on is to bite the bullet and extend the human capacity for understanding and interacting with higher dimensional objects and distributions[2].
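To make the Anscombe's quartet point concrete, here is a minimal sketch that computes a few summary statistics for the four datasets. The data values are the standard published quartet; the helper names (`mean`, `pearson`, `summarize`) are my own and only for illustration.

```python
from math import sqrt

# Anscombe's quartet: four small (x, y) datasets.
# Datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    return {"mean_x": mean(xs), "mean_y": mean(ys), "r": pearson(xs, ys)}


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, s in SUMMARIES.items():
        print(f"{name:>3}: mean_x={s['mean_x']:.2f} "
              f"mean_y={s['mean_y']:.2f} r={s['r']:.3f}")
```

The four summaries agree to about two decimal places, yet scatter plots of the four datasets look nothing alike (a noisy line, a curve, a line with one outlier, and a vertical cluster with one leverage point). The summary statistics alone cannot tell them apart.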
I think there are many domains where this would be beneficial, especially if woven into domain specific applications, but my main focus is on MI.
This project has been inspired by the work of Mingwei Li, particularly Grand Tour and UMAP Tour, as well as my own thinking. I first extended the Grand Tour application as a student project for a data visualization class and then continued working on it as a directed studies with George Tzanetakis and then as an honours project with Teseo Schneider.
I plan to continue the project by developing and releasing standalone modules and a user-friendly web app, and by publishing papers describing the tool and the mechanistic interpretability results found using it.
Hyperlinks
Details
Background
I have built on four research papers, each of which I will briefly introduce.
Reinforcement Learning
But first, it would be a good idea to mention Reinforcement Learning. If you aren’t familiar, it is the branch of computer science concerned with Agents that learn to take Actions in an Environment to maximize some reward function. The standard textbook is “Reinforcement Learning: An Introduction” by Richard Sutton and Andrew Barto.
Reinforcement learning is of interest to me because of its relationship to unsupervised methods, and because it is concerned with agentic models: models that take observations as inputs and output actions. Since the goal we want the policy to achieve is specified in the loss function, which is not present at evaluation time, the weights of the network must encode either some representation of the goal or some strategy that leads to the goal without actually knowing what the goal is.
I think understanding how this kind of thing is encoded into the network weights, and how it gets to that state through the training process, is quite compelling.
Procgen
Procgen introduced a set of procedurally generated games with a common interface to provide the RL research community a standard benchmark and test platform for the development of RL techniques. The “maze” environment is particularly relevant. This environment presents a graphical picture of a maze with a mouse and a cheese. The player on each turn must decide whether the mouse should move up, down, left, or right, in order to get to the cheese.
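To illustrate the shape of the task, here is a minimal sketch of a toy maze with the same four actions and a reward for reaching the cheese. It is a hypothetical stand-in and not the Procgen API: the class `ToyMaze`, the `step` signature, and the string-grid layout are all assumptions made for illustration, and the real environment presents observations as images rather than coordinates.

```python
from dataclasses import dataclass

# Hypothetical toy stand-in for a maze task; this is NOT the Procgen API.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


@dataclass
class ToyMaze:
    grid: list      # list of equal-length strings; '#' is a wall, '.' is free
    mouse: tuple    # (row, col) of the mouse
    cheese: tuple   # (row, col) of the cheese

    def step(self, action):
        """Move the mouse if the target cell is free; return (obs, reward, done)."""
        dr, dc = ACTIONS[action]
        r, c = self.mouse[0] + dr, self.mouse[1] + dc
        inside = 0 <= r < len(self.grid) and 0 <= c < len(self.grid[0])
        if inside and self.grid[r][c] != "#":
            self.mouse = (r, c)
        done = self.mouse == self.cheese
        reward = 1.0 if done else 0.0
        return self.mouse, reward, done


def run_episode(env, policy, max_steps=50):
    """Agent-environment loop: the policy maps an observation to an action."""
    total = 0.0
    for _ in range(max_steps):
        _, reward, done = env.step(policy(env.mouse))
        total += reward
        if done:
            break
    return total


if __name__ == "__main__":
    env = ToyMaze(grid=["..#", "...", "#.."], mouse=(0, 0), cheese=(2, 2))
    script = iter(["down", "right", "down", "right"])
    print(run_episode(env, lambda obs: next(script)))
```

Note that the reward only appears inside `step`; the policy never sees it directly at evaluation time, which is the situation described in the Reinforcement Learning section above.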
Goal Mis-gen
“Goal Misgeneralization in Deep Reinforcement Learning” made modifications to Procgen environments and trained policy networks for the sake of distinguishing between goal misgeneralization and capability misgeneralization.
Four definitions are given. A policy is said to be “in-distribution” if it is encountering a situation sufficiently similar to its training distribution that it is able to perform well. When the policy encounters a novel situation, the result is said to be a “capability misgeneralization” if the policy fails to show the same skill in its interaction with the environment. This is distinct from “goal misgeneralization”, where the policy still shows its prior skill in goal-directed action, but does not use it to pursue the intended goal and so does not achieve a high reward. Finally, a “robust” agent is one which is able to successfully generalize both the capabilities and goals that it learned in the training environment to the novel environment.
Understanding and Controlling
Grand Tour
In “Visualizing Neural Networks with the Grand Tour”, Mingwei Li introduces an interactive n-d data visualization technique I have come to greatly respect.
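For readers who have not seen a tour-style visualization, here is a minimal sketch of the general idea: keep an orthonormal basis of the n-dimensional space, rotate it a little each frame, and plot the data's coordinates along the first two basis vectors. It composes plane (Givens) rotations with fixed, randomly chosen speeds, in the spirit of the classic torus method. This is an assumption-laden illustration and not Li's implementation; the function names and the choice of speeds are my own.

```python
import math
import random


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def rotate_in_plane(basis, i, j, theta):
    """Rotate every basis vector by theta in the (i, j) coordinate plane."""
    c, s = math.cos(theta), math.sin(theta)
    rotated = []
    for v in basis:
        w = list(v)
        w[i] = c * v[i] - s * v[j]
        w[j] = s * v[i] + c * v[j]
        rotated.append(w)
    return rotated


def project(points, basis):
    """2D view: coordinates of each point along the first two basis vectors."""
    e0, e1 = basis[0], basis[1]
    return [(dot(p, e0), dot(p, e1)) for p in points]


def tour_frames(points, n_frames, step=0.05, seed=0):
    """Yield (2D projection, basis) pairs for a slowly rotating view."""
    rng = random.Random(seed)
    n = len(points[0])
    planes = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Hypothetical choice: one fixed angular speed per coordinate plane.
    speeds = {plane: rng.uniform(0.5, 1.5) * step for plane in planes}
    basis = identity(n)
    for _ in range(n_frames):
        for (i, j), speed in speeds.items():
            basis = rotate_in_plane(basis, i, j, speed)
        yield project(points, basis), basis


if __name__ == "__main__":
    # Corners of a 4D hypercube as example data.
    cube = [[float((k >> d) & 1) for d in range(4)] for k in range(16)]
    for frame, _ in tour_frames(cube, n_frames=3):
        print([(round(x, 2), round(y, 2)) for x, y in frame[:4]])
```

Because each step is a rotation, the view never stretches or shears the data; only the viewing direction changes, which is what makes the motion itself informative.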
My Contributions
Having introduced the work I am building on, and some examples of questions I find motivating, I will now introduce my own work that I am building on in this project.
UVic Seng310: Human Computer Interaction
In Seng310: Human Computer Interaction, my group (Walker Jones, Julian Write, and I) isolated a visualization from the Grand Tour into a standalone application and tested untrained users' experience with it before and after slight UI improvements.
UVic Csc490: Directed Studies
Over the summer term, I worked with Triston Grayson on a precursor to this project as a Directed Studies course under the supervision of George Tzanetakis. During that time I completed the following work:
UVic Csc499: Honors Project
For this honours project I continued the work I began in the summer term. I:
Rgb-proj
Hyperbrush
Discussion
These preliminary results are too subjective to strongly support any specific conclusion, other than indicating promising research directions. I will nevertheless offer some of my thoughts on the results.
Future Directions
I have suggestions for future work falling into three categories: work exploring the maze solving policy network, work further developing the activation-space exploration tools I have introduced here, and my long term research agenda.
MIRL
The maze solving policy network could be further explored
NDSP & Friends
I would like to develop a general, in-browser, interactive data-visualization application with NDSP, PixCol, and other tools that could be applied to general n-dimensional datasets, as well as better supporting neural network exploration. Brushing and linking multiple representations of the activation space paired with rich interactivity may allow users to gain insight into activation-space and other n-dimensional data-spaces.
I would like to make the visualizations modular, allowing them to be used from the standalone web app, or imported and used from within Jupyter or Google Colab notebooks.
Math & Agent Foundations
For the responsible deployment of autonomous decision making systems, we need a better understanding of the ML models that underlie these systems. I believe we can develop a strong mathematical understanding of these strange computational artifacts and the sociotechnical systems that come to surround them, but it will require a great deal of effort from many skilled scientists.
My long term research agenda has main focuses:
Acknowledgements
Thank you to all the researchers who have inspired, corresponded, or collaborated with me. This includes, but is not limited to: Mingwei Li, Alex Turner, Ulisse Mini, Carlos Scheidegger, and Triston Grayston. Also thank you to Teseo Schneider and George Tzanetakis for supervising my CSC499 and CSC490 projects at the University of Victoria.
Analysis of the network architecture and the distribution of inputs, outputs and activations also seem worth mentioning.
I want to clearly state that I value analytical approaches. Indeed, only through logic can strong proofs be constructed. But to build the intuition required to approach proper logic, visualization and interaction are invaluable.