I like this perspective! The idea of formalization as suspension of intuition reminds me of the story of the "Gruppenpest" in the development of quantum mechanics. The abstraction of groups (along with representations and matrices) was seen by many physicists as non-physical and unintuitive. But the resulting abstractions, gauge theories and symmetries, turned out to be more fundamental objects than their predecessors.[1][2][3][4]
It also reminds me of a view I've heard many times: that mathematical formalization/modeling is the process of forgetting details about a problem until its essential character is laid bare. I think it's important to emphasize that formalization is only a partial suspension or redirection of intuition (which seems to be what Bachelard is actually implying), since the goal of formalization typically isn't to turn the process into something doable by a mechanical proof checker. Formalization removes the "distractions" that block you from seeing underlying regularities, but you still need to leverage your intuition to get a grasp on those regularities. As you say in the post:
> What this suspension gives us is a place to explore the underlying relationships and properties without the tyranny of immediate experience. Thus delivered from the “obvious”, we can unearth new patterns and structures that in turn alter our intuitions themselves!
[1] https://www.researchgate.net/publication/234207946_From_the_Rise_of_the_Group_Concept_to_the_Stormy_Onset_of_Group_Theory_in_the_New_Quantum_Mechanics_A_saga_of_the_invariant_characterization_of_physical_objects_events_and_theories
[2] https://ncatlab.org/nlab/show/Gruppenpest
[3] https://hsm.stackexchange.com/questions/170/how-did-group-theory-enter-quantum-mechanics
[4] https://www.math.columbia.edu/~woit/wordpress/?p=191
Thanks for the catch!
This reminded me of how GPT-2-small uses a cosine/sine spiral for its learned positional embeddings. I don't think I've seen a mechanistic/dynamical explanation for why this happens (just the post-hoc explanation that attention can use cosine similarity to encode distance in R^n, not an account of why training should converge to it).
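For anyone who wants to poke at this themselves, here's a minimal sketch of what I mean. It assumes the HuggingFace transformers and scikit-learn packages; the model name "gpt2" (the 124M "small" checkpoint), the PCA projection, and the toy frequencies are my own illustrative choices, not anything from the post:

```python
import numpy as np
from sklearn.decomposition import PCA
from transformers import GPT2Model

# Load GPT-2-small and pull out its learned positional embedding table.
model = GPT2Model.from_pretrained("gpt2")
pos_emb = model.wpe.weight.detach().numpy()  # shape (1024, 768)

# Project onto the top two principal components. If the embeddings trace a
# cosine/sine spiral, consecutive positions should sweep around an arc here.
coords = PCA(n_components=2).fit_transform(pos_emb)
for i in range(0, 1024, 128):
    print(f"position {i:4d}: ({coords[i, 0]:+7.2f}, {coords[i, 1]:+7.2f})")

# The post-hoc story: for fixed-frequency cosine/sine pairs, the dot product
# of two positions depends only on their offset, since
# cos(f*a)*cos(f*b) + sin(f*a)*sin(f*b) = cos(f*(a - b)).
freqs = np.array([0.01, 0.1, 1.0])  # illustrative frequencies

def spiral(t):
    return np.concatenate([np.cos(freqs * t), np.sin(freqs * t)])

print(np.dot(spiral(5.0), spiral(2.0)))   # equals the sum of cos(3 * freqs)
print(np.dot(spiral(10.0), spiral(7.0)))  # same value: same offset of 3
```

The dot-product identity explains why a spiral suffices for encoding relative distance, but (as far as I know) not why gradient descent should find it rather than some other position code.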