I do mention that one should probably work with a LISP variant at some point, at a bare minimum. Being able to think in functional programming ways is definitely important
There's functional programming, and then there's functional programming. The term is overloaded almost to the point of uselessness: it mostly just means not-imperative. But beyond that, lisps have little in common with MLs.
Lisp is python done right, and is best suited for the same sorts of domains: those where inputs are messy and everyone is ok with that, runtime errors are easy to surface and cheap to fix, and mostly-right means mostly-as-good.
MLs are theorem provers done wrong done right, and lend themselves to a pretty much orthogonal set of problems: inputs are either well-structured or dangerous enough that they need to have a structure imposed on them, runtime errors can be subtle and/or extremely bad, and mostly-right means wrong.
if you're a logic freak it trains you to be comfortable with the insufficiency of symbols, memes and jokes of mathematicians and physicists disagreeing about notation reveals an underlying controversy about the question "what are symbols actually for?"
I don't think this is an accurate description of the cultural difference between physicists and mathematicians. Tiny respective minorities of die-hard instrumentalists and formalists aside, both fields agree that the symbols are just tools for talking about the relevant objects of study more clearly and concisely than natural language permits. Plenty of published math research is formally incorrect in an extremely strong sense, but no one cares as long as all the errors can be corrected trivially. In fact, an important aspect of what's often called "mathematical maturity" is the ability to make those corrections automatically and mostly unconsciously, instead of either falling into genuine sloppiness or getting hung up on every little "the the".
The real core difference is the obvious one. To zeroth order: physicists study physics, and mathematicians study math. To first order: physicists characterize phenomena which definitely exist, mathematicians characterize structures which may or may not.
The universe definitely exists, it definitely has a structure, and any method which reliably makes correct predictions reflects a genuine aspect of that structure, whatever it might be. Put another way: physicists have an oracle for consistency. Mathematicians don't have that option, because structures are the things they study. That's what makes them mathematicians, and not physicists. They can retreat to higher and higher orders, and study classes of theories of logics for ..., but the regress has to stop somewhere, and the place it stops has to stand on its own, because there's no outside model to bear witness to its consistency.
If all known calculations of the electron mass rely on some nonsensical step like "let d = 4 - epsilon where d is the dimensionality of spacetime", then this just means we haven't found the right structure yet. The electron mass is what it is, and the calculation is accurate or it isn't. But if all known "proofs" of a result rely on a nonsensical lemma, then it is a live possibility that the result is false. Physics would look very different if physicists had to worry about whether or not there was such a thing as mass. Math would look very different if mathematicians had a halting oracle.
I expect to take several more late-undergraduate- to early-graduate-level math courses. Presumably some will turn out to be much more valuable to me than others, and presumably this is possible to predict better-than-randomly in advance.
Yes, but not in a uniform way. The mathematical frontier is so large, and semesters so short, that Professor A's version of, for instance, a grad level "Dynamical Systems" course can have literally no overlap with Professor B's version. Useful advice here is going to have to come from Professors A and B (though not necessarily directly).
what the math of agency ... is like
Underdeveloped. There's some interesting work coming out of the programming language theory / applied category theory region these days (Neil Ghani and David Spivak come to mind), but "the math of agency" is not even an identifiable field yet, let alone one mature enough to show up in curricula.
This might be useful in relatively young fields (e.g. information theory) , or ones where the topic itself thwarts any serious tower-building (e.g. archaeology), but in general I think this is likely to be misleading. Important problems often require substantial background to recognize as important, or even as problems.
A list of applications of sophisticated math to physics which is aimed at laymen is going to end up looking a lot like a list of applications of sophisticated math to astrophysics, even though condensed matter is an order of magnitude larger, and larger for sensible reasons. Understanding topological insulators (which could plausibly lead to, among other things, practical quantum computers) is more important than understanding fast radio bursts (which are almost certainly not alien transmissions). But a "fast radio burst" is literally just a burst of radio waves that appeared and disappeared really fast. A neutron star is a star made of neutrons. An exoplanet is a planet that's exo. A topological insulator is ... an insulator that's topological? Well, A: no, not really, and B: now explain what topological means.
Once people can ask well-posed questions, providing further motivation is easy. It's getting them to that point that's hard.
Here's another approach, using tools that might be more familiar to the modal reader here:
This sort of "forwards on space, backwards on stuff" structure shows up pretty much anywhere you have bundles, but I often find it helpful to focus on the case of polynomial functors - otherwise known as rooted trees, or algebraic data types. These are "polynomials" because they're generated from "constants" and "monomials" by "addition" and "multiplication".
Viewed as trees:
A * x^0
A * x
(F + G) x
(F * G) x
Viewed as data types:
(F x, G x)
The tree view gets messy quick once you start thinking about maps, so we'll just think about these as types from now on. In the type view, a map between two polynomial functors f and g is a function ∀ x . f x -> g x. Well, almost. There are some subtleties here:
∀ x . f x -> g x
f x -> g x
But we're just building intuition, so parametric polymorphism will do. So how do you map one tree to another without knowing anything about the structure of the node labels? In the case where there aren't any labels (equivalently, where the nodes are labelled with values of (), the type with one value), you just have an ordinary function space: f () -> g (). And since this determines the shape of the tree (if you're hesitating here, working out the induction is a good exercise), all that's left is figuring out how to fill in the node labels.
space: f () -> g ()
How do you fill a g shaped container with arbitrary x values, given only an f container filled with x values? Well, we don't know how to produce x values, we don't know how to combine x values, and we don't know to modify x values, so the only thing we can do is take the ones we already have and put them in new slots. So for every slot in our output g () that we need to fill with an x, we pick a slot in each input to space that can produce it, and commit to using the x it contains. This is the backwards part of the map.
There is a deep connection between Chu spaces and polynomial functors by way of lenses, but I don't fully understand it and would rather not risk saying something misleading. There is also a superficial connection by way of the fact that Chu spaces and polynomial functors are both basically collections of labelled points put together in a coherent way, which I hope is enough to transfer this intuition between the two.