Disclaimer: I am neurodivergent and tend to think recursively and non-monotonically. Every time in the past that I have tried to use AI to linearize the thinking process for neurotypical consumption, it gets rejected as being AI-written (which is pretty irrational: either the argument presented is valid or it's not, so what does it even matter how it's written? That is just an ad hominem). Anyway, the subject matter discussed is a legit unanswered question in AI and decision science as best as I can tell, so I am asking it. If I am wrong on something, so be it. Just explain how and why is all I ask. I am trying to learn more than anything. But with the advent of Copilot, there really is no excuse to not meet each other halfway. Also, shouldn't information important to the future of humanity be welcome regardless? Seems like that would be... less wrong?
Intro
So I keep seeing all the hype about Yann LeCun and I genuinely and sincerely do not get it; why does Yoshua Bengio and Michael Bronstein's work (B&B) not get more attention? (Note: this is not about the men, but their research programs, epistemics, and ontologies.) More specifically: 1, why do we refuse to use physics to first geometrize causality like Einstein did spacetime, and 2, why do we not then use physics to white box AI? Physics models reality. Statistics models...something close...ish? Let me put it this way: if I want to know if vaccines are safe I'll trust stats; but if I want to actually model reality, sorry, but there is only one thing that I know of that does that. Physics. I mean, sure, stats has its place for simple problems in static systems, but this ain't that.
Like I think the field of AI has settled the debunking of the "attention is all you need" fallacy on its own; but why are we still using linear algebra and graphs when reality is modeled with manifolds and differential geometry? And abstract algebra appears to give us a way to handle vector interactions that allows rank-2 tensor interactions while still returning vectorial outputs through wedge products, Hodge duals, and contractions. So I guess my question or main point is this: why are we trying for world models but refusing to use the math that models worlds? So, let's discuss the mathematical issues that I am seeing. Either they are valid or they are not, but only through discourse (and later testing, obviously) can we hope to know. I still need to finish coding all this to test it, but since I am trying to pursue an actual PhD to model the sociocognitive dynamics of emergent extremism and democratic decline, and to combat disinformation through dynamical stateful agent-based modeling, I figured I should probably step into the lion's den and get some insights.
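To make the wedge / Hodge dual / contraction remark concrete, here is a minimal sketch in plain Python, restricted to ordinary 3D Euclidean space (an assumption made only so the example stays small). It shows the textbook fact that the wedge of two vectors is an antisymmetric rank-2 object, and that its Hodge dual in 3D is a vector again (the familiar cross product). The function names are hypothetical, and nothing here is a claim about how any AI architecture works.

```python
# Sketch only: 3D Euclidean space, standard orientation.
# Function names (wedge, hodge_dual_3d, contract) are illustrative.

def wedge(u, v):
    """Antisymmetric rank-2 tensor B_ij = u_i v_j - u_j v_i."""
    n = len(u)
    return [[u[i] * v[j] - u[j] * v[i] for j in range(n)] for i in range(n)]

def hodge_dual_3d(B):
    """Hodge dual of a 3D bivector: w_k = (1/2) * eps_kij * B_ij.
    For an antisymmetric B this picks out three independent entries."""
    return [B[1][2], B[2][0], B[0][1]]

def contract(B, w):
    """Contract the rank-2 tensor with a vector: (B w)_i = sum_j B_ij w_j."""
    return [sum(B[i][j] * w[j] for j in range(len(w))) for i in range(len(B))]

u = [1, 2, 3]
v = [4, 5, 6]
B = wedge(u, v)
print(B)                 # rank-2, antisymmetric
print(hodge_dual_3d(B))  # [-3, 6, -3], equal to the cross product u x v
print(contract(B, u))    # a vector again
```

Both the Hodge dual and the contraction take the rank-2 intermediate back to a vector, which is the "rank-2 interaction, vectorial output" pattern described above.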
AI needs more B&B if you ask me. More specifically, AI needs structure more than scale. For something built on graphs, I must say that I find Pearl's silence about the collider bias in AI benchmarking to be quite curious. It's pure Berkson's Paradox after all.
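For anyone who wants Berkson's paradox pinned down before arguing about whether benchmarking exhibits it, here is a small simulation sketch. Two traits are generated independently, then only the cases whose sum clears a threshold are kept (the collider). The trait names, threshold, and sample size are all made up for illustration; this does not model any real benchmark.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n=20000, threshold=1.0, seed=0):
    """Return (correlation in everyone, correlation in the selected subset)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]  # hypothetical trait A
    b = [rng.gauss(0, 1) for _ in range(n)]  # hypothetical trait B, independent of A
    # Selection depends on both traits: this is conditioning on a collider.
    kept = [(x, y) for x, y in zip(a, b) if x + y > threshold]
    ka = [x for x, _ in kept]
    kb = [y for _, y in kept]
    return pearson(a, b), pearson(ka, kb)

r_all, r_selected = simulate()
print(f"all cases:      r = {r_all:+.3f}")
print(f"selected cases: r = {r_selected:+.3f}")
```

In the full population the two traits are roughly uncorrelated; inside the selected subset a clearly negative correlation appears even though neither trait influences the other.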
But let us look at some underlying mathematical issues, because my work tries to model the sociocognitive dynamics of democracy by lifting causal inference into a non-Markovian causal-dynamic field theory, in part by reverse engineering what an LLM's informational field is doing physically and what that might look like if lifted into a holonomic meromorphic fibrated field theory. Ok, the meromorphism was forced, if I am being technical, because the things AI calls attractors/basins are semantical nested/hierarchical recursive fixed points (and part of the reason I used exponentials below). So it may seem like this post is doing a lot. But in summation, I am simply asking why we do not geometrize causality and use it and the rest of physics to build a white-boxed AI. I know, not an easy feat. Trust me, I know. But difficult and impossible are not synonyms, last I checked. So, again I ask, why not?
Architectural Critiques:
There is the very obvious issue of correlation vs causation. But one cannot have causation on a static graph, since causation requires change over space and time to be correlated, plus a causal impulse. And even things like using AI for research, or just stats in general honestly, are plagued with issues like Simpson's Paradox which, for anyone with a diff geo background or even just vector calculus, should very quickly out itself as curl. But I really don't have much to say about these diagnostic images as they are annotated (though I welcome questions and critiques).
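As a reference point for the Simpson's Paradox discussion, here is a minimal arithmetic sketch of the paradox itself. The counts are patterned on the widely cited kidney stone treatment example and should be treated as illustrative. The sketch only demonstrates the reversal under aggregation; it does not test the curl reading proposed above.

```python
# Illustrative counts: (successes, trials) per treatment and subgroup.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, trials):
    return successes / trials

def pooled(groups):
    """Aggregate over subgroups by summing successes and trials."""
    s = sum(x for x, _ in groups.values())
    n = sum(t for _, t in groups.values())
    return s, n

for group in ("small", "large"):
    ra = rate(*data["A"][group])
    rb = rate(*data["B"][group])
    print(f"{group:>5}: A = {ra:.3f}, B = {rb:.3f}")

ra_all = rate(*pooled(data["A"]))
rb_all = rate(*pooled(data["B"]))
print(f"  all: A = {ra_all:.3f}, B = {rb_all:.3f}")
```

Treatment A has the higher success rate in each subgroup, yet B has the higher rate once the subgroups are pooled, because the subgroup sizes differ sharply between the two treatments.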
Blackbox and Bullshit: Why do we bolt on physics ex post facto?
Like I see so many people trying to bolt physics on to the black box, but no one seems to try using physics and geometry to replace the black box. And look, I get this is AI's white (box) whale. But I was not (intentionally anyway) trying to get into AI or even mess with it. I was simply looking for a way to mathematically model a stateful, dynamic agent-based modeling system so I could test interventions against disinformation; because I kinda like a healthy democracy. Call me biased, I know. But once I got to the $Ae^{\theta} + Be^{i\theta} - (1-\alpha)\int_0^t \kappa(t)\left(Ae^{\theta} + Be^{i\theta}\right)dt$ part, I ran into a few realizations. 1. I needed to solve the inverse problem because for some reason AI hasn't (spoiler alert: it was not easy in the first place and also required simply not giving a damn about institutional inertia; namely, again, I am not an expert in AI and, simply put, I couldn't sit through a semester of causal inference and not see a flattened degenerate field theory. Literally vector/tensor calculus).
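For readers who want to poke at the expression numerically, here is a rough sketch that evaluates it with a trapezoid rule. Everything about the interpretation is an assumption on my part for illustration: A and B are taken as constants, theta and kappa as functions of time supplied by the caller, alpha as a scalar, and the integral as a running memory term from 0 to t. The function names are hypothetical.

```python
import cmath

def base(theta, A, B):
    """The repeated term A*e^theta + B*e^(i*theta)."""
    return A * cmath.exp(theta) + B * cmath.exp(1j * theta)

def evaluate(t, A, B, alpha, theta_fn, kappa_fn, steps=1000):
    """base(theta(t)) - (1 - alpha) * integral_0^t kappa(s) * base(theta(s)) ds,
    with the integral approximated by the composite trapezoid rule."""
    h = t / steps
    total = 0.0
    for k in range(steps + 1):
        s = k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * kappa_fn(s) * base(theta_fn(s), A, B)
    integral = total * h
    return base(theta_fn(t), A, B) - (1 - alpha) * integral

# Hypothetical inputs: constant theta and constant kernel, chosen so the
# answer can be checked by hand: 2 - 0.5 * (1 * 2 * 2) = 0.
value = evaluate(t=2.0, A=1.0, B=1.0, alpha=0.5,
                 theta_fn=lambda s: 0.0, kappa_fn=lambda s: 1.0)
print(abs(value))
```

With constant inputs the memory term grows linearly in t and eventually cancels the instantaneous term; swapping in a decaying kappa_fn is one way to explore how the kernel shapes that cancellation.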
Emergent Sociocognitive Dynamics: Tribalism
Every theory has an origin story. Mine? Emergent antisemitism as an archetypal case study. Basically, I noticed a pattern of phase state changes. From outsiders to outcasts to "New Christians" to folk devils, until the ultimate level that led to the Holocaust and is resurging today: "hostis humani generis", the enemy of all humanity. So what makes Jews special? Why us? Nothing. Just had to be someone. Sartre once said, if Jews did not exist, antisemites would invent us. Why? Because it was never about us. It was about the human need to place fear and blame. It is why you see similar patterns emerge with any form of bigotry. It is also why you see such hate emerge when threats to Maslow's hierarchy of needs become more imminent. I was told I needed a micro-to-macro pipeline; causal inference and complex adaptive systems lacked the loop dynamics that I required, so I turned to physics.
Can We Maybe Not Miles Dyson This?
Is Molting Revolting?
Given the recent emergence of Moltbook, how can we ethically support civilizational scale infrastructure being run on black boxes and bullshit? (AI summary tools, Google Moltbook to catch up on the hot tech goss; also maybe brush up on the plotline of Detroit: Become Human)
We need to do better. Some need to be like Einstein and find a Grossmann. Others need to be like Szilard and find them an Einstein. Some need to be Grossmann, others Einstein.
But at the end of the day, I can confidently state the following (I reserve the right to change my views with evidence):
1. Causality is a rank-2 interaction between vectors.
2. Causal inference is a flattened degenerate field theory.
3. Memory matters to dynamical intelligence.
4. Causality loves curves.
5. Manifolds are where information lives, graphs are degenerate projections of limiting cases.
6. New York Bagels are Better than Montreal Bagels. Yoshua may hate me for saying it being he’s Quebecois, but it’s true.
7. Simpson’s paradox is noncommutative variables causing curl to be flattened by integration of commutative algebras.
8. Octonions are annoying, but without them AI dissociates when exposed to non-associating data. (Ok, C⊗H⊕C⊗H technically. Local associativity matters; see also point 3 and the Hatfields and McCoys, or the cognitive side of the gambler’s fallacy.)
9. Fiber bundles, not just good for colon health. And remember, if you're over 40, make a date with a proctologist. What does this have to do with AI? Nothing. Just a good preventative health PSA. Wait, yeah, fiber bundles are where the internal state space lives. That is why it's relevant.
But then also, weird and unexpected things emerged when I tried testing it. I am not here to defend this per se, as I only did it to make sure I was not wasting my time trying to actually code all of this. But the numbers correspond to the 133 dimensions of E7 from the exceptional algebras, the spacetime dimensions (4), and the Jordan algebra where fermions live (27); 1/sqrt(2) comes from projecting G2 (you'll also recognize this from quantum computing), and 14 is G2's dimension. 61 is the 56 representational dimensions, plus 4 for spacetime, plus 1 for the observer. What does this mean? I have no clue. As I said, weird and unexpected. Is it meaningless? Maybe. But there are no free parameters and each is structurally forced, so make of it what you will. I'm corrigible; better to be wrong and fix it than to pretend to be right, right? It is a monotonic closed bijective system, which is why I chose it as a test bed for my theory; such systems are by their very nature self-evidencing. Simply put, since there is no fine tuning and it is monotonic, you cannot obtain a physically or structurally motivated alternate solution. (The monotonicity goes beyond the scope of this post, which is why I am not explaining it here. I can if someone wishes to dig deeper; just avoiding scope creep.)
10. Wigner, Wheeler, Tegmark...hell, even Finster and Furey make a relevant appearance in the maths. I guess this is not a thesis so much as a partial ideological inspiration list.
Theses 10-95 TBD.
Counterarguments? That is what I am here to find out. Vectors are vectors, and just because Huang wants us to confuse Jensors for actual tensors does not mean we should. Discrete vs continuous? Yeah, that's just a resolution problem.