I agree that optimization amplifies things. I also agree that a mathematical mindset is important for AI alignment. I don't, however, think that a "mathematical mindset" is the same as a "proof mindset". Rather, I think that the latter is closer to being a "programming mindset" -- or, indeed, a "security mindset" -- and that a "mathematical mindset" is largely missing from AI-alignment discourse at present.
Whereas others see a division between two clusters, of the form,
I, by contrast, see a hierarchical progression that looks something like:
science < programming < physics < mathematics < logic <...
where, in this context, these words have meanings along the following lines:
science: things being made of parts; decomposition
programming: things being made of moving parts; constant-velocity motion; causal networks
physics: things being made of moving spatial parts; accelerated motion, rotation, fluidity; substance
mathematics: models being made of parts; transubstantiation; metaphysics; theorization
logic: concepts being made of parts; time reversal; ontology
Of course I'm not using these words standardly here. One reason for this is that, in this discussion, no one is: we're talking about mindsets, not about sociological disciplines or even clusters of particular ideas or "results".
But the really important reason I'm not following standard usage is that I'm not trying to invoke standard concepts; instead, I'm trying to invent "the right" concepts. Consequently, I can't just use standard language, because standard language implies a model of the world different from the one that I want to use.
It is commonly believed that if you want to introduce a new concept that is similar or related (but--of course--nonidentical) to an old concept, you shouldn't use the same word for the new concept and the old, because that would be "confusing". I wish to explicitly disagree with this belief.
This view presupposes that actively shifting between models of the world is not in our repertory of mental operations. But I specifically want it to be!
In fact, I claim that this, and not proof, is what a "mathematical mindset" is really about.
For mathematics is not about proofs; it is about definitions. The essence of great mathematics is coming up with a powerful definition that results in short proofs.
What makes a definition "powerful" is that it reflects a conceptual upgrade -- as distinct from mere conceptual analysis. We're not just trying to figure out what we mean; we're trying to figure out what we should mean.
A mathematical definition is what the answer to a philosophical problem looks like. An example I particularly like is the definition of a topological space. I don't know for a fact that this is what people "really meant" when they pondered the nature of "space" during all the centuries before Felix Hausdorff came up with this definition; it doesn't matter, because the power of this definition shows that it is what they should have meant.
(And for that reason, I'm comfortable saying that it is what they "meant" -- acknowledging that this is a linguistic fiction, but using it anyway.)
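For readers who want the definition in front of them, here is a sketch of it in its modern open-set form. This is the standard textbook statement, which is a later streamlining rather than Hausdorff's original wording (his formulation was in terms of neighborhoods and included a separation axiom); the symbols $X$, $\tau$, $U_i$, $I$, and $f$ are just the usual notation, not anything taken from the discussion above.

```latex
% Modern open-set definition of a topological space (standard textbook form).
A \emph{topological space} is a pair $(X, \tau)$, where $X$ is a set and
$\tau \subseteq \mathcal{P}(X)$ is a collection of subsets of $X$
(called the \emph{open sets}) such that
\begin{enumerate}
  \item $\emptyset \in \tau$ and $X \in \tau$;
  \item if $U_i \in \tau$ for all $i \in I$, then $\bigcup_{i \in I} U_i \in \tau$
        (closure under arbitrary unions);
  \item if $U, V \in \tau$, then $U \cap V \in \tau$
        (closure under finite intersections).
\end{enumerate}
% Continuity is then defined purely in terms of open sets.
A map $f \colon (X, \tau_X) \to (Y, \tau_Y)$ is \emph{continuous} if
$f^{-1}(V) \in \tau_X$ for every $V \in \tau_Y$.
```

Note that nothing in this statement mentions distance, coordinates, or dimension, even though those are what "space" intuitively suggests; the definition is three closure conditions on a family of subsets, and continuity falls out in one line.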
Notably, mathematical definitions are often redefinitions: they take a term already in use and define it in a new way. And, notably, the new definition often bears scant resemblance to the old, let alone any “intuitive” meaning of the term -- despite presenting itself as a successor. This is not a bug. It is what philosophical progress -- theoretical progress, progress in understanding -- looks like. The relationship between the new and the old definitions is explained not in the definitions themselves, but in the relationship between the theories of which the definitions are part.
Having a “mathematical mindset” means being comfortable with words being redefined. This is because it means being comfortable with models being upgraded -- in particular, with models being related and compared to each other: the activity of theorization.
It occurs to me, now that I think about it, that the term “theorization” is not very often used in the AI-alignment and rationalist communities, compared with what one might expect given the degree of interest in epistemology. My suspicion is that this reflects an insufficiency of comfort with the idea of models (as opposed to things) being made of parts (in particular, being made of parameters), such that they are relatable to and transformable into each other.
This idea is, approximately, what I am calling “mathematical mindset”. It stands in contrast to what others are calling “mathematical mindset”, which has to do with proofs. The relationship, however, is this: both of them reflect an interest in understanding what is going on, as opposed to merely being able to describe what is going on.