The press reported that the US president at one point liked to receive his secret briefings on an iPad. I met the person who ran the contract to take iPads and modify them so they could be used in this role. I don't remember exactly the number he told me, but it was pretty shocking; I can't remember whether it was 10k+, 100k+, or 1M+. The work was interesting, and while there was a profit margin, my understanding is that it wasn't just graft: they actually had to do a lot of hardware and software work.
I pretty strongly agree with this sentiment:
"Our results are evidence that future AI failures may look more like industrial accidents than coherent pursuit of goals that were not trained for. "
I have held this view for years, so maybe it's my bias talking. I think control-theory-based approaches (STAMP, STPA) will be able to mitigate these risks.
I'm just going to focus on the first one to get my opinion down: will automating AI R&D not work for some reason, or will it not lead to vastly superhuman superintelligence within 2 years of "~100% automation" for some reason?
I think there are diminishing returns on 'intelligence'. While something with a testable IQ that maxes out any available test may well come along in the next few years, possibly by surprise, the net effect, while transformative, will not be an intelligence explosion that destroys the planet and the human race with its brilliance.
I think there's a bit of a conceit that 'with enough smarts, someone doesn't have to be bossed around by idiots', when in practice even very smart people rarely seem to escape that.
How many people pay for Google's top-tier AI package, and how helpful has it been in terms of amassing resources and professional advancement?
A cursory examination of the vuln and fix for CVE-2025-9231 shows a few things. The bug was introduced by a Huawei engineer, or at least by a git user with a Huawei email address. The "speed optimization" that introduced the bug made up roughly 10% of the code in the affected libcrypto file. The fix was basically to blow away the custom code.
"Midpoint passive key recovery and decryption" is generally considered to be the holy grail of a SIGINT capability. The placement by a Huawei engineer in a speed optimization provides an additional layer of cover; was it an honest mistake, did he do it on behalf of his employer, or was he on someone else's payroll?
If you're very scared of surveillance and are actually serious about avoiding it, you might run ARM hardware (to avoid the Management Engine and other Intel/AMD shenanigans in the processor), and you might opt for FOSS software (auditable by you, and audited by many competent people) with algorithms that were not developed in the US (like SM2). If someone worth attacking did exactly this, this bug would be a plausible way to turn passive observation of their encrypted traffic into high-quality SIGINT.
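For what it's worth, here is a minimal sketch (my own, not an authoritative vulnerability test) of checking whether a machine is even in the affected neighborhood: it just reports the OpenSSL build your Python links against and whether you're on 64-bit ARM, and you then compare the version against the OpenSSL advisory for CVE-2025-9231 yourself.

```python
# Minimal sketch: report the linked OpenSSL build and CPU architecture so you
# can compare them against the OpenSSL advisory for CVE-2025-9231 yourself.
# This does NOT prove a system is vulnerable or safe.
import platform
import ssl

print("Linked OpenSSL:", ssl.OPENSSL_VERSION)   # e.g. "OpenSSL 3.x.y ..."
print("Architecture:  ", platform.machine())    # "aarch64"/"arm64" = 64-bit ARM

if platform.machine() in ("aarch64", "arm64"):
    print("64-bit ARM: the optimized SM2 code path discussed above applies here.")
else:
    print("Not 64-bit ARM: the ARM-specific code path discussed above does not apply.")
```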
Interestingly enough, this may also parallel the "cryptographically relevant quantum computer" fears. If someone had been collecting traffic at the midpoint from vulnerable systems but did not know about this bug, they might now be able to leverage it to decrypt stored communications. (If you can pull session keys from passive collection, you can decrypt sessions, though PFS may limit what you get, you won't always recover the whole exchange, and network jitter can make passive timing attacks hard. But if a nation state were behind this, they would have tested it aggressively, and its value would depend on whatever sensors they possess and whatever compute they could throw at the task.)
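To spell out the 'session key means plaintext' step in that parenthetical, here's a toy illustration (my own, using the third-party `cryptography` package; this is bare AEAD, not TLS itself): once a passive collector somehow holds the per-session key, the recorded ciphertext simply decrypts, which is also why PFS only helps to the extent the collector never obtains that per-session key.

```python
# Toy illustration of why recovered session keys defeat recorded traffic:
# symmetric AEAD (the kind of primitive protecting TLS records) decrypts
# trivially once you hold the per-session key. Not TLS itself, just the point.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

session_key = AESGCM.generate_key(bit_length=128)  # what the endpoints derive
nonce = os.urandom(12)

# A legitimate endpoint encrypts a record; a midpoint only sees the ciphertext.
record = AESGCM(session_key).encrypt(nonce, b"the actual message", None)

# A passive collector who later recovers the session key decrypts the capture.
recovered = AESGCM(session_key).decrypt(nonce, record, None)
print(recovered)  # b'the actual message'
```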
We don't know if anyone important used that configuration, we don't know if anyone was listening, and we don't know if that engineer just got overzealous with optimization.
I hope you guys thought about the possibility of burning some nation state's presumably very expensive, long-term operation (and the attendant interest you will generate) before you posted, and chose to go forward anyway.
I think I've talked here about Nancy Leveson's STAMP, STPA, and CAST frameworks for using control theory to prevent industrial accidents. I think they're relevant to AI safety: you don't necessarily need to overspecify every little thing the system does, you just need to carefully specify the unwanted outcomes and the states of the system where those outcomes become possible due to things outside the system's control.
E.g.: 'my thing can't be allowed to get hit by lightning, so if the system ever reaches the state "outside during a thunderstorm", we treat that as something the system should have been engineered to prevent'.
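To make that concrete, here's a toy sketch of the idea (my own illustration, not Leveson's notation): hazards are written down as states, and a controller blocks any action whose predicted next state lands in a hazard, without enumerating everything the system is allowed to do.

```python
# Toy sketch of STPA-style hazard constraints (illustrative only): specify
# hazardous states, then block any control action whose predicted next state
# falls into one, instead of enumerating all allowed behavior.

HAZARDOUS_STATES = {
    ("outside", "thunderstorm"),  # the lightning example above
}

def predict_next_state(state, action):
    """Stand-in for a real process model (domain-specific in practice)."""
    location, weather = state
    if action == "go_outside":
        location = "outside"
    elif action == "go_inside":
        location = "inside"
    return (location, weather)

def action_is_safe(state, action):
    """Unsafe-control-action check: does this action lead into a hazard?"""
    return predict_next_state(state, action) not in HAZARDOUS_STATES

state = ("inside", "thunderstorm")
for action in ("go_outside", "go_inside"):
    print(action, "->", "allowed" if action_is_safe(state, action) else "blocked")
```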
I think my comment has aged well: https://arxiv.org/abs/2512.09643
"The number of objects in orbit is rapidly increasing, primarily driven by the launch of megaconstellations, an approach to satellite constellation design that involves large numbers of satellites paired with their rapid launch and disposal. While satellites provide many benefits to society, their use comes with challenges, including the growth of space debris, collisions, ground casualty risks, optical and radio-spectrum pollution, and the alteration of Earth's upper atmosphere through rocket emissions and reentry ablation. There is substantial potential for current or planned actions in orbit to cause serious degradation of the orbital environment or lead to catastrophic outcomes, highlighting the urgent need to find better ways to quantify stress on the orbital environment. Here we propose a new metric, the CRASH Clock, that measures such stress in terms of the time it takes for a catastrophic collision to occur if there are no collision avoidance manoeuvres or there is a severe loss in situational awareness. Our calculations show the CRASH Clock is currently 2.8 days, which suggests there is now little time to recover from a wide-spread disruptive event, such as a solar storm. This is in stark contrast to the pre-megaconstellation era: in 2018, the CRASH Clock was 121 days."
I was referencing a previous post I made about harms; I think it's good to quantify danger in logs (ones, tens, hundreds, thousands): https://www.lesswrong.com/posts/Ek7M3xGAoXDdQkPZQ/terrorism-tylenol-and-dangerous-information#a58t3m6bsxDZTL8DG Three logs means 'a person who implemented this could kill 1-9x10^3 people'. I don't think the current censorship approach will work for issues like this, because it's something the censors are likely unaware of, and therefore the rules are not tuned to detect the problem. The models seem to have crossed a threshold where they can actually generate a new idea.
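As a trivial worked example of that convention (my reading of the linked post): the number of 'logs' is just the order of magnitude, floor(log10(n)), so an estimate of 7,000 casualties sits at three logs, the thousands band.

```python
import math

def logs(casualties: int) -> int:
    """Order-of-magnitude 'logs' for a casualty estimate:
    7 -> 0 (ones), 70 -> 1 (tens), 700 -> 2 (hundreds), 7000 -> 3 (thousands)."""
    return math.floor(math.log10(casualties)) if casualties > 0 else 0

for n in (7, 70, 700, 7_000):
    print(n, "casualties ->", logs(n), "logs")
```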
Thanks for sending this around!
Something changed with the most recent generation of models. I have a few 'evil tech' examples that are openly published, though their implications have not been. So when I get a new model, I throw some papers in and ask 'what are the implications of this paper for X issue', and the newest generation is happy to explain the misuse case. This is particularly dangerous because, in some cases, a bad actor making use of this 'evil tech' would be doing things that 'the good guys' do not understand to be possible. I do think I could hit three logs with an implementation of one of the schemes; until now, the models were not smart enough to explain it.
If anyone reading this works at a major lab (preferably Google), you might want to talk to me.
Claude assisted?
Many people (myself included) have had the experience of doing manual labor next to an industrial machine that could move the dirt but sits idle, because their hands and backs are cheaper than gasoline.