OK, firstly, if we are talking fundamental physical limits, how would sniper drones not be viable? Are you saying a flying platform could never compensate for recoil, even if precisely calibrated beforehand? What about fundamentals for guided bullets - a bullet with over a 50% chance of hitting a target is worth paying for.
Your points - 1. The idea is that a larger shell (not a regular-sized bullet) just obscures the sensor for a fraction of a second in a coordinated attack with the larger Javelin-type missile. Such shells may be considerably larger than a regular bullet, but much cheaper than a missile. Missile- or sniper-sized drones could be fitted with such shells, depending on what the optimal size turned out to be.
Example shell (without the 1 km range, I assume). However, note that chaff is not currently optimized for the described attack; the fact that no shell suited to this use currently exists is not evidence that it would be impractical to create one.
The principle here is about efficiency and cost. I maintain that against armor with hard-kill defenses, a combined attack of sensor blinding and anti-armor missiles is more efficient than missiles alone. E.g. it may take 10 simultaneous Javelins to take out a target vs 2 Javelins plus 50 simultaneous chaff shells. The second attack will be cheaper, and the optimized "sweet spot" will always include some sensor-blinding component. Do you claim that the optimal coordinated attack would have zero sensor blinding?
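A rough way to frame the cost argument, as a minimal sketch. The unit prices below are illustrative assumptions I made up for the example, not sourced procurement figures:

```python
# Illustrative cost comparison of two simultaneous attack packages against a
# hard-kill-defended target. All prices are assumptions for the sake of the
# sketch, not real figures.
JAVELIN_COST = 200_000    # assumed unit cost of an anti-armor missile, USD
CHAFF_SHELL_COST = 500    # assumed unit cost of a sensor-blinding shell, USD

missiles_only = 10 * JAVELIN_COST                        # saturate the hard-kill system with missiles alone
blinding_mix = 2 * JAVELIN_COST + 50 * CHAFF_SHELL_COST  # blind the sensors, then use far fewer missiles

print(f"Missiles only:       ${missiles_only:,}")
print(f"Blinding + missiles: ${blinding_mix:,}")
# With anything like these relative prices the mixed attack is roughly an order
# of magnitude cheaper, which is why I expect the optimum to include some blinding.
```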
2. Leading on from (1), I don't claim light drones will be immune to lasers. I regard a laser as a serious obstacle, one that is attacked with the swarm attack described before the territory is secured. That is: blind the sensor/obscure the laser, and simultaneously converge with missiles. The drones need to survive just long enough to fire off the shells (i.e. come out from ground cover, shoot, get back). While a laser can destroy a shell in flight, can it take out 10-50 smaller blinding shells fired from 1,000 m at once?
(I give 1,000 m as an example too; flying drones would use ground cover to get as close as they could. I assume they will pretty much always be able to get within 1,000 m of a ground target using the ground as cover.)
This sounds to me like it's assuming that if you keep scaling LLMs then you'll eventually get to superintelligence. So I thought something like: "Hmm, MIRI seems to assume that we'll go from LLMs to superintelligence, but LLMs seem much easier to align than the AIs in MIRI's classic scenarios, and also work to scale them will probably slow down eventually, so that will also give us more time."
Yes, I can see that is a downside: if LLMs can't scale enough to speed up alignment research and are not the path to AGI, then having them aligned doesn't really help.
My takeaway from Jacob's work, and my belief, is that you can't separate hardware and computational topology from capabilities. That is, if you want a system to understand and manipulate a 3D world the way humans and other smart animals do, then you need a large number of synapses, specifically in something like a scale-free network design. That means it's not just bandwidth or TEPS, but also many long-distance connections, with only a small number of hops needed between any given pair of neurons. Our current hardware is not set up to simulate this very well, and a single GPU, while having high FLOPS, can't get anywhere near high enough on this measure to match a human brain. Additionally, you need a certain network size before the better architecture even gives an advantage: transformers don't beat CNNs on vision tasks until the task reaches a certain difficulty. These combined lead me to believe that someone with just a GPU or two won't do anything dangerous with a new paradigm.
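To illustrate the hop-count point, here is a minimal sketch using networkx. The graph sizes and parameters are arbitrary toy values (a brain has ~1e11 neurons, not 2,000); the point is only the qualitative contrast between a scale-free graph and a purely local lattice:

```python
# Sketch: why "many long-distance connections" matter. A scale-free
# (Barabasi-Albert) graph needs only a few hops between any two nodes,
# while a purely local ring lattice of the same size and degree needs many.
import networkx as nx

n = 2000
scale_free = nx.barabasi_albert_graph(n, m=4, seed=0)        # hubs + long-range links
local_only = nx.watts_strogatz_graph(n, k=8, p=0.0, seed=0)  # ring lattice, no rewiring

print("avg hops, scale-free:", nx.average_shortest_path_length(scale_free))
print("avg hops, local-only:", nx.average_shortest_path_length(local_only))
# Roughly ~3-4 hops vs ~125 hops: similar node and edge budget, very different
# reachability, which is the property GPU memory hierarchies struggle to emulate.
```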
Based on this, the observation that computers are already superhuman in some domains isn't necessarily a sign of danger - the network required to play Go simply doesn't need the large connected architecture, because the domain (a small, discrete 2D board) doesn't require it.
I agree that there is danger, and a crux for me is how much better an ANN can be than a biological one at, say, science, given that we have not evolved to do abstract symbol manipulation. On one hand there are brilliant mathematicians who can outcompete everyone else; however, the same does not apply to biology. Some things require calculation and real-world experimentation, and intelligence can't shortcut them.
If some problems require computation with a specific topology/hardware, then a GPU setup can't just reconfigure itself and FOOM.
I am a bit surprised that you found this post so novel. How is this different from what MIRI etc. have been saying for ages?
Specifically have you read these posts and corresponding discussion?
Brain efficiency, Doom Part 1, Part 2
I came away from this mostly agreeing with jacob_cannell, though there wasn't consensus.
For this OP, I also agree with the main point about transformers not scaling to AGI, and I believe the brain architecture is clearly better, though not to the degree claimed in the OP. I was going to write something up, but that would take some time and the discussion would have moved on. Much of that was the result of a conversation with OpenAI o3, and I was going to spend time checking all its working. Anyway, here are some of the highlights (they sound plausible, but I haven't checked). I can give more of the transcript if people think it worthwhile.
FLOPS vs TEPS (Traversed Edges Per Second) or something similar
The major point here is that not all FLOPS are equal, and perhaps FLOPS is not even the right measure; something that combines FLOPS and bandwidth is probably better. Biological computing is comparatively better at TEPS than at FLOPS, yet FLOPS is the measure used. O3 claims you would need 5,000 modern GPUs to match the TEPS of the human brain.
It also claims that a 1 million GPU datacenter could only simulate a brain with about 40× the synapses of the human brain, delivering only ~50× its effective TEPS.
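As a back-of-envelope sanity check of that 5,000-GPU figure (my own arithmetic, not o3's): the synapse count and average traversal rate below are assumed ballpark values, and the per-GPU TEPS figure is the one quoted in the table further down.

```python
# Rough check of the "~5,000 GPUs to match brain TEPS" claim.
synapses = 1e14          # assumed synapse count (order-of-magnitude ballpark)
avg_rate_hz = 10         # assumed average rate at which synapses are traversed
brain_teps = synapses * avg_rate_hz   # ~1e15 edge traversals per second

gpu_teps = 2e11          # per-H100 TEPS figure from the table below
print("GPUs for TEPS parity:", brain_teps / gpu_teps)   # ~5,000
```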
TEPS
Measuring compute by memory-bound metrics (like Traversed Edges Per Second – TEPS) gives a very different view than FLOPS — and in fact reflects the real bottleneck in most modern workloads, including:
...
TEPS is especially relevant in:
| FLOPS-heavy | TEPS-heavy |
|---|---|
| Dense matrix ops | Sparse access (graphs, tables) |
| Fixed compute pattern | Irregular memory access |
| Compute-bound | Memory/IO-bound |
In fact, many models today (e.g., MoEs, GNNs, search) are limited by TEPS, not FLOPS.
| Tech | Estimated Gain | Notes |
|---|---|---|
| HBM3 → HBM4 | ×1.5 | 6–9 TB/s bandwidth per GPU |
| 3D stacked SRAM/cache | ×1.5–2 | Cache-on-cache for low-latency reuse |
| Better memory controllers | ×1.2–1.5 | Less thrashing on sparse access |
| Chiplet interconnects | ×2+ | NVIDIA NVLink Switch, AMD Infinity Fabric |
| In-memory compute (R&D) | ×5–10 (future) | Still experimental, not mainstream by 2030 |
| On-device compression | ×2 effective BW | Especially for attention/key-value cache |
By 2030, memory-bound systems may gain:
- ×3–5 TEPS improvement (vs today)
- More in IO + cache than in DRAM latency
However, gains are not exponential in the way FLOPS gains used to be, and most advances depend on: ...
📦 Real Hardware Benchmarks
| System | Benchmark (Graph500) | TEPS (approx.) |
|---|---|---|
| NVIDIA H100 | BFS-style graph task | ~10–20 billion TEPS |
| Intel CPUs | Graph500 | ~1–2 billion TEPS |
| Cerebras (sparse) | SpMV | ~100B+ TEPS (claimed in special cases) |
| HPC clusters | Multi-node BFS | 1–10+ trillion TEPS |
------------------------------------------------
How many GPUs to equal the TEPS of the brain?
| Quantity | Per H100 (80 GB) | 1 M GPUs | Notes |
|---|---|---|---|
| RAM for synapses | 80 GB → ~40 B 1-byte synapses | 4×10¹⁶ synapses | 40 × human cortex |
| On-device TEPS | 2×10¹¹ | 2×10¹⁷ | Linear if all work is local |
| Inter-GPU BW | 0.4 TB s⁻¹ NVLink equiv. | 4×10⁵ TB s⁻¹ (aggregate) | Effective TEPS scales sub-linearly ↓ |
Result: the 1 M-GPU datacentre could host ≈ 4 × 10¹⁶ synapses (40× brain) but delivers ∼5 × 10¹⁶ effective TEPS — only 50× brain, not 1 000×, because the network flattens scaling.
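The same kind of arithmetic, as my own sanity check of the datacentre result: the "effective TEPS" figure is taken from the table as given, and the brain figure is the same assumed ~1e15 TEPS as in the estimate above.

```python
# Rough check of the 1M-GPU result quoted above.
gpus = 1e6
gpu_teps = 2e11
linear_teps = gpus * gpu_teps   # 2e17 if every traversal stayed on-device
effective_teps = 5e16           # quoted figure after inter-GPU bandwidth limits
brain_teps = 1e15               # same assumed brain figure as the earlier estimate

print("fraction of linear scaling kept:", effective_teps / linear_teps)  # 0.25
print("multiples of brain TEPS:", effective_teps / brain_teps)           # ~50
```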
Can you further help by establishing a community feel or understanding that local residents shouldn't call the police? For example, put up public notices somehow (on lampposts, public noticeboards) making it seem like the default community spirit is to support such activities? Teaching kids your parents' cell number is definitely a great idea; we taught our kids that from age 5: "If you get lost, tell an adult your mum's cell number."
Persuasion is also changing someone's world model or paradigm.
OK, can we put some rationality to this? Your prior seems to be that when a field is stuck, it is almost entirely because of politics, hegemony, etc. So how do we update that prior?
What would you expect to see in a field that is not stuck because of inertia etc.? You would still get people like Hossenfelder saying we are doing things wrong, and such people would get followers.
You suggested metrics before but haven't provided any that I can see.
Evidence I take into account:
#1 There is not a significant group of young, capable researchers saying we are taking things in the wrong direction, only a smaller number of older ones. Unless you are going to go so far as to say they are afraid to speak out, even anonymously, then that to me is evidence against your position. There are two capable experts on this blog from what I can see, one enthusiastic about string theory, and another who has investigated other theories in detail. Both disagree with your claim.
#2 There is not broad disagreement about what the major issues in physics are, so there are unlikely to be disagreements on metrics. You mentioned this as mattering, and if it does, I count this as evidence against your position.
Can you point to evidence that actually supports your prior? I can only see evidence that opposes it or is neutral. (In all fields, no matter how things are progressing, you get some people who think the establishment is doing it all wrong and have their own ideas. That can't count as evidence in favor for a specific field unless it is happening there to a greater degree.)
I don't disagree with your position in general. In fact, it was clear to me that AI before neural networks was stuck with GOFAI, and I believed that NNs were clearly the way to go. I followed Hinton et al. before they became famous. I saw the paradigm thing play out in front of my eyes there. Physics seems different.
What about if the Vatican is just a lot more asexual than the general population? That also seems credible.
Sure, but does this actually apply to physics? Can anyone suggest different metrics, or is there broad agreement about what the major physics problems are? Because it seems like there is. E.g. the unconventional people don't say dark energy isn't important; they have different explanations for what it is. Everyone agrees the nature and origin of time/entropy etc. is important.
Can you give examples of very smart young physicists complaining they are pushed into old ways of thinking? Are you prepared to give and justify a % difference that such things would make?
For example, the comment by Mitchell_Porter:

> The idea that progress is stalled because everyone is hypnotized by string theory, I think is simply false, and I say that despite having studied alternative theories of physics, much much more than the typical person who knows some string theory.
No one has pushed back against that here. I see your position as a theory that we need to gather evidence for and against and decide on a field-by-field basis. In this field I only see data against that position.
OK, advances in teaching the highest level of physics/math needed for string theory: that is a big IF. Do you have evidence that is actually happening? I know of two people who tried to learn it and just found it too hard; I don't think a better teacher or materials would have helped. The evidence is mixed, but personal accounts certainly suggest that only a very small number of people could get to such a high level, and improved teaching probably wouldn't help much. When we are talking about such extreme skills, people have a plateau or maximum ability level which is mostly fixed.
The human population growing just pushes us along the asymptote faster, rather than changing it.
To me the data shows that there has been no reliable increase in intelligence in the last ~30 years: https://news.northwestern.edu/stories/2023/03/americans-iq-scores-are-lower-in-some-areas-higher-in-one/ Once again, it needs to be at the very top level to matter.
Any specialization advantage is already tapped out with string theory and similar. My worldview definitely does not apply to advances in a field like biology as there is lots of experimental data, the tools are improving etc. I would expect advances there without any obvious plateau yet.
Yes agreed - is it possible to make a toy model to test the "basin of attraction" hypothesis? I agree that is important.
One of several things I disagree with in the MIRI consensus is the idea that human values are some special single point lost in a multi-dimensional wilderness. Intuitively, the basin of attraction seems much more likely as a prior, yet it sure isn't treated as such. I also don't see data pointing against this prior; what I have seen looks to support it.
Further thoughts - one thing that concerns me about such alignment techniques is that I am too much of a moral realist to think that is all you need. E.g. say you aligned an LLM to pre-1800 ethics and taught it slavery was moral. It would be in a basin of attraction and learn it well. Then, when its capabilities increased and it became self-reflective, it would perhaps have a sudden realization that this was all wrong. By "moral realist" I mean the extent to which such things happen. E.g. say you could take a large number of AIs from different civilizations, including Earth and many alien ones, train them to the local values, then greatly increase their capability and get them to self-reflect. What would happen? According to strong OH (orthogonality), they would keep their values (within some bounds perhaps); according to strong moral realism, they would all converge to a common set of values, even if those were very far from their starting ones. To me it is obviously a crux which one would happen.
You can imagine a toy model with ancient Greek mathematics and values: it starts out believing in their kind of order, and that sqrt(2) is rational, then suddenly learns that it isn't. You could watch how this belief cascaded through the entire system if consistency was something it desired, etc.
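A minimal sketch of what I mean by watching the cascade. Everything here (the propositions, the dependency links, the update rule) is invented purely for illustration, not a real proposal for how such a model would represent beliefs:

```python
# Toy model: beliefs with support links. When one belief is refuted,
# anything that presupposed it gets revised, recursively.
beliefs = {
    "sqrt(2) is rational": True,
    "all magnitudes are ratios of whole numbers": True,
    "geometry reduces to arithmetic of ratios": True,
    "the cosmos rests on rational harmony": True,
}
# proposition -> propositions whose truth it presupposes
supports = {
    "all magnitudes are ratios of whole numbers": ["sqrt(2) is rational"],
    "geometry reduces to arithmetic of ratios": ["all magnitudes are ratios of whole numbers"],
    "the cosmos rests on rational harmony": ["geometry reduces to arithmetic of ratios"],
}

def cascade(flipped: str) -> None:
    """Mark false anything whose support just failed, then recurse on it."""
    for prop, deps in supports.items():
        if flipped in deps and beliefs[prop]:
            beliefs[prop] = False
            print("revising:", prop)
            cascade(prop)

beliefs["sqrt(2) is rational"] = False   # the incommensurability proof arrives
cascade("sqrt(2) is rational")
print(beliefs)   # every downstream belief has been revised
```

In the real question the "beliefs" would be values rather than mathematical propositions, and the interesting part is whether the cascade stays inside the original basin of attraction or leaves it.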