" (...) the term technical is a red flag for me, as it is many times used not for the routine business of implementing ideas but for the parts, ideas and all, which are just hard to understand and many times contain the main novelties."
- Saharon Shelah
"A little learning is a dangerous thing;
Drink deep, or taste not the Pierian spring." - Alexander Pope
As a true-born Dutchman I endorse this.
For most of my writing see my shortforms (new shortform, old shortform)
Twitter: @FellowHominid
Personal website: https://sites.google.com/view/afdago/home
Nice post Cole.
I think I'm sympathetic to your overall point. That said, I am less pessimistic than you about whether neural network computation can ever be understood beyond the macroscopic level of 'what does it do'.
The Turing machine paradigm is just one out of many paradigms to understand computation. It would be a mistake to be too pessimistic based on just the failure of the ur-classical TM paradigm.
Computational learning theory's bounds are vacuous for realistic machine learning. I would guess, and I say this as a nonexpert, that this is chiefly due to
(i) a general immaturity of the field of computational complexity, i.e. most of the field is conjectures, it's hard to prove much about time complexity even if we're quite confident the statements are likely true
(ii) computational learning theory grew out of classical learning theory and has not fully incorporated the lessons of singular learning theory. Much of the field is working in the wrong 'worst-case/pessimistic' framework when they should be thinking in terms of Bayesian inference & simplicity/degeneracy bias. Additionally, there is perhaps too much focus on exact discrete bounds when instead one should be thinking in terms of smooth relaxation and geometry of loss landscapes.
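To make "vacuous" concrete, here is a toy calculation plugging modern scales into a schematic VC-style uniform-convergence bound of order $\sqrt{d \log n / n}$. This is a sketch: the exact constants and log factors vary between textbooks, and only the order of magnitude matters.

```python
import math

def vc_style_gap(d, n, delta=0.05):
    # Schematic uniform-convergence bound on the train/test error gap:
    # sqrt((d * log(n) + log(1/delta)) / n).
    # Constants and log factors differ by source; this is illustrative only.
    return math.sqrt((d * math.log(n) + math.log(1.0 / delta)) / n)

# A modern network: ~1e8 parameters (VC dimension at least on that order)
# trained on ~1e6 examples.
print(vc_style_gap(d=1e8, n=1e6))  # ~37, a "bound" on a quantity in [0, 1]
```

Since classification error lives in $[0,1]$, a bound of roughly 37 says nothing at all, which is the sense in which these bounds are vacuous at modern scale.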
That said, I agree with you that the big questions are currently largely open.
https://youtu.be/tgkP0W7OvMc?si=hoa0l2mu5B6aRbpy
Perhaps of interest: at 16:33 the speaker mentions the development of a new type of drone-resistant "turtle" tank.
Highly recommended video on drone development in the Ukraine-Russia war: an interview with a Russian private military drone developer.
Some key takeaways:
Chinese manufacturing and technological prowess is strong, but they produce their military matériel like toys - they don't have the hands-on experience that Russia [and Ukraine] has.
Much appreciated Habryka-san!
You might be interested in my old shortform on the military balance of power between the US and China too. It's a bit dated by now - the importance of drones has become much clearer [I think the evidence suggests we are in a military-technological revolution on par with the introduction of guns] - but you may find it of interest regardless.
According to Ukrainian drone operators, Western drones are often not even regarded as very good: expensive, overengineered, failure-prone, and they haven't kept pace with the rapid innovation of the Ukraine war.
Note also that in PPP terms China is already 50% ahead of the US.
The US military has about 10k drones of all sizes. Ukraine alone builds 2-4 million drones a year, mostly smaller ones. Most of this production involves assembling Chinese-made components; China has something like ninety percent of the global market share for small-drone components.
There is not a single NATO country that is currently building drones at scale.
Circling back to this. I'm interested in your thoughts.
I think the Algorithmic Statistics framework [including the K-structure function] is a good fit for what you want here in 2.
To recall, the central idea is that any object is ultimately just a binary string $x$ that we encode through a two-part code: a code encoding a finite set $S$ of strings with $x \in S$, together with a pointer to $x$ within $S$ (costing $\log |S|$ bits).
For example, $x$ could encode a dataset while $S$ would encode the typical data strings for a given model probability distribution $p$ in a set of hypotheses, for some small threshold. This is a way to talk completely deterministically about a probabilistic model, e.g. an LLM trained in a transformer architecture.
This framework is flexible enough to describe two codes $S_1, S_2$ encoding $x$ with the required properties. One can e.g. easily find simple examples of this using mixtures of Gaussians.
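For reference, the standard definition of the Kolmogorov structure function underlying this two-part-code picture (here $K$ denotes prefix Kolmogorov complexity):

```latex
h_x(\alpha) \;=\; \min_{S \ni x} \bigl\{\, \log |S| \;:\; K(S) \le \alpha \,\bigr\}
% Total two-part code length: K(S) + \log|S| \ge K(x) - O(1).
% S is an (algorithmic) sufficient statistic for x when
% K(S) + \log|S| \approx K(x), i.e. S captures all structure in x.
```

The trade-off traced out by $h_x$ is what lets one compare different models $S$ of the same data string $x$ at different complexity levels $\alpha$.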
I'd be curious what you think!
I was taught the more classical 'ideal' point of view on the structure of rings in school. I'm curious if [and why] you regard the annihilator point of view as possibly more fecund?