Good luck!
I would say that properly learning new maths takes a long time and it might not be worth trying to seriously study areas that aren’t clearly related to the kind of research you want to do (like category theory).
Like, being a maths undergrad is a full-time job, and maths undergrads typically learn the equivalent of a few slim textbooks' worth of content every few months.
Probably you work more hours and are more driven than your average maths undergrad, but then again you’ll be studying alone and trying to do other kinds of work too. Unless you’re exceptionally driven (or skip all the exercises), it will be enough of an achievement to study a couple of textbooks a year. So spend them wisely!
Thank you!
I am graduating with a math minor, so I like to believe I am aware of how painfully slowly you can move through a textbook with full understanding. I fully agree with you about spending my math points wisely, and thanks for the reminder. I do tend to get overly ambitious. If you have a background in math (and AIA), or can point me to others who might be willing to have a Zoom call or just a text exchange about how to better focus my math studies, I would be very grateful.
Having said that, I do enjoy the study of math intrinsically, so some of the math I look at may be purely for my own enjoyment and I'm ok with that, but it would be good if when I am learning math it can be both enjoyable AND helpful for my future work on AIA. : )
I'm certainly no expert on self-studying maths. I've generally found it easy to pick up a conceptual understanding from skimming textbooks, and for some subjects (e.g. statistics, Bayesian probability, maybe logic) I think that's where most of the value lies. I've never had the drive or made the time to work through a lot of exercises on my own, and I'd guess that for subjects like linear algebra being able to actually work through problems is probably the important part.
So if you have a subject where both (i) it's not clearly relevant, and (ii) getting a useful understanding requires working through a lot of exercises, then I'd probably hold off.
SSJ #2 -- OIS, Neel's MI Guide, Etc...
SSJ #3 -- First worklog, OIS, VK LTA
(Three years ago,) I quit my job as a technologist to get a BSc in Computer Science (with a Math Minor) because I want to work on AI Alignment (AIA) and Mechanistic Interpretability (MI). This summer I am taking the final class in my program, so I want to use a Self Study Journal (SSJ) to improve my AIA-relevant skills. I hope to get peer and mentor engagement to help me become a valuable researcher, and to network to find funding opportunities or paid fellowships. My convocation is in November. My goal is to have found a role by that time.
I want feedback both for the value of other people's insight and to help me stay motivated through extra accountability, so please lower your inhibition to commenting here. If you would normally think "I don't have anything valuable to contribute" or "it would take too long to write up my thoughts", please leave a comment saying "Good Luck" instead. Thanks : )
I am planning a rough, overarching outline and then making more concrete plans for sprints of work, each of which will last one or two weeks. After each sprint I will publish its results and the plans for the next sprint.
My overarching outline is divided into 5 categories:
I have a few original ideas that I’m not aware of other people working on. I’d like to write up the ideas to help me practice the development and communication of original ideas, as well as to explore whether any of these ideas have merit that I can communicate to others. A good outcome would be any of:
The following is a bullet point list of the articles I’m currently interested in writing. I don’t think they will be fully legible here, but if you are curious, please leave a comment asking about them.
I have been collecting topics I want to get a better understanding of for a long time, but now that my school curriculum is lighter, I will have time to actually dive into these topics. There’s too much here to write up the “what” and “why” of each item, but as I am working through them I will try to provide a summary of my understandings and opinions which I hope will be valuable both for expanding focus on the topics and for checking my own understanding.
I enjoy math, so I know I’m going beyond what is necessary for MI, but I also think having a rigorous definition of what you are talking about is very valuable in many contexts, so for those reasons, I want to learn some new math topics and to review and practice some old ones. The topics I’m interested in are:
I think I may start out by going through “Topoi: The Categorial Analysis of Logic” by Robert Goldblatt and “Linear Algebra Problem Book” by Paul R. Halmos.
The category theory book is because of my interest in logic and proof, and because I find the idea that category theory can help one understand the connections between various branches of math very satisfying. The linear algebra is because I want to have good intuitions about the high-dimensional vector spaces where neural net parameters and activations live.
In the pursuit of becoming an AIA and MI researcher it is important to actually research some AI models. I have worked with convolutional VAE and RL models, but have never worked with transformers. I need to get familiar with them and I also want to get some experience using cloud resources to work with larger models.
I think I’ll start out doing some mucking around which I may or may not write up in much detail before trying to choose some minor MI experiments to try. I will probably also want to combine these efforts with NDSP as I make progress on making a more general tool.
I’m very inspired by Mingwei Li’s work, especially Toward Comparing DNNs with UMAP Tour. I would like to build tools for working with and understanding data distributions in high dimensional spaces. I have two major goals with this project.
(1) Develop some easy-to-use tools. This could be a library, like matplotlib, that can be used from within Jupyter notebooks, or it may look more like a web-based data analysis tool. Ideally it would be both.
(2) Make high dimensional structures more intuitively understandable. The first aspect of this is developing a visual language for displaying these structures and the second aspect is making tutorials to help people generalize from simple objects such as hyper-cubes, simplexes, and hyper-spheres to more complicated scenes that may appear in actual high dimensional data distributions. I might also be interested in writing some simple games like 4d pong, n-d maze, or n-d minesweeper. I think games are a great way for people to build intuitions.
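To make the simplest case concrete, here is a minimal sketch of the kind of thing such a tool might start from: build the vertices and edges of an n-cube and project them onto a random 2D plane. It assumes NumPy, and the function names `hypercube` and `random_projection` are hypothetical helpers made up for this illustration, not part of any existing library or of the planned tool.

```python
import itertools

import numpy as np


def hypercube(n):
    """Return (vertices, edges) of the unit n-cube (hypothetical helper)."""
    # Every vertex is a length-n string of 0s and 1s.
    vertices = np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)
    # Two vertices share an edge when they differ in exactly one coordinate.
    edges = [
        (i, j)
        for i, j in itertools.combinations(range(len(vertices)), 2)
        if np.sum(vertices[i] != vertices[j]) == 1
    ]
    return vertices, edges


def random_projection(n, seed=0):
    """Return an n x 2 matrix with orthonormal columns (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Reduced QR of a random Gaussian matrix gives an orthonormal 2D frame.
    q, _ = np.linalg.qr(rng.standard_normal((n, 2)))
    return q


vertices, edges = hypercube(4)
projection = random_projection(4)
points_2d = vertices @ projection  # one 2D point per vertex, ready to plot

print(len(vertices), len(edges), points_2d.shape)
```

Drawing a line segment between `points_2d[i]` and `points_2d[j]` for each edge `(i, j)` gives one static view of a tesseract; changing the projection smoothly over time would be one way to get something like a tour of the object.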
I think the first tasks here are to write up some documentation of my ideas and to explore TensorFlow.js as a library to use in development.
Wish me luck : )