Computers have been vulnerable for so long that many people erroneously conclude that the vulnerability is some sort of law of nature. It is not. Unhackable computing systems are quite possible, and society has been moving in that direction. It is unknown (at least to me) how far computing has to progress toward greater security before subversion or compromise of computing systems (i.e., hacking) becomes something that the people running society (let me specify US society, to avoid making predictions about countries I know little about) mostly no longer need to consider or worry about, but it seems quite possible (but not probable) for it to happen in 5 years.
This is not at all a refutation or contradiction of the OP. I do not dispute that over the next few years there will be an increase in the rates of compromise and exploitation of computer systems.
"Society has been moving in that direction": 25 years ago, every digital consumer product that any manufacturer tried to lock to a particular operating system was jailbroken: gaming consoles, phones and computers. And of course the scheme designed by the motion picture industry to inhibit the unauthorized reproduction and distribution of films on DVDs was broken, too. In contrast, there has never been a complete jailbreak for any version of iOS 17 or iOS 18 running on an iPhone 14 or 15 -- at least not one that has been published. iOS 17 was released on September 18, 2023. Also, the Xbox Series X became available to buy worldwide on November 10, 2020 and has never been jailbroken -- or if it has, that jailbreak was kept private.
I agree that software vulnerabilities are not a law of nature but essentially a skill and resource issue. If mankind manages, with the help of AI, to create operating systems and applications without any exploitable bugs, which is at least a conceptual possibility, there are still the hardware layer and the social layer that can be targeted. I think hardware can in principle be fixed as well, though at a slower pace that might give attackers a relevant advantage. I don't think human users can possibly be fixed. So points 2 and 3 of the OP look to me like permanent issues we didn't have before and won't get rid of, i.e. an irreversible change of the game state. I suppose the larger issues will come in other fields though, where hardening potential is equally or more limited and potential damage is much larger, e.g. in biosecurity and autonomous weapon systems.
I agree with you that (until there is an intelligence explosion which drastically changes society and people) the social engineering of people won't stop being a major source of vulnerability. Thanks for adding that. I do see many opportunities to harden a system (e.g., an organization, the Linux kernel project or Less Wrong) composed of people and computers to make the subversion of people much less of a big deal.
I also agree that autonomous weapon systems will probably prove a "larger issue" (to use your phrase) over the long term (i.e., the next 25 years) than the problems described in the OP. I don't know enough about biosecurity to have an opinion worth publishing.
But I would've guessed that "the hardware layer" will prove easier to secure than "operating systems and applications" will. Although the best-known hardware platform, namely x86_64, has big problems, Apple, for example, is doing well in securing its hardware. The fact that I have physical possession of an iPhone and am able to disassemble and re-assemble it does not enable me to jailbreak it (i.e., to get it to run an OS I specify instead of the one Apple specified). Also, even though the Chinese certainly have physical access to basically all iPhones, only the most valuable targets (e.g., those responsible for IT for US senators and senior members of the administration) need to worry that the Chinese government can spy on iPhones sold to Americans (or Germans). Apple's got that covered: the data on the buses is encrypted, and anyone who tries to tap the unencrypted data in an IC will damage the IC so that it no longer works.
My thought process on securing hardware: If SOTA models can find obscure vulnerabilities in software, as well as attack strategies that exploit one or several of them, I assume mankind cannot be far from having models that are able to discover novel hardware problems (e.g. something like GPUHammer) and utilize them, though the feedback loop for experimentation might be much trickier to set up than in the software case. If some of these new hardware flaws can't be fixed by a firmware update or by disabling problematic functionality on critical infrastructure, then physical devices will need to be replaced, which in my model of the world should happen at a much slower pace than the writing and distribution of software patches. If defenders have an advantage by getting earlier model access, it could be negated if downstream fixes can't arrive fast enough to outpace the attackers.
I'm not familiar with it. I'd guess that a formally verified kernel would be a solid first step towards a secure operating system that even successor models of Mythos won't be able to attack (sans hardware vulnerabilities that can be exploited by software and can't be captured by a formal specification).
Unhackable computing systems are quite possible, and society has been moving in that direction...
...but it seems quite possible (but not probable) for it to happen in 5 years.
Strong disagree on timeline, partial disagree on possibility.
The biggest chunk of security bugs, about 70%, essentially comes down to memory bugs and pointer bugs. These can be fixed at the language level, either by using higher-level languages or by using memory-safe languages like Rust. The other 30% of bugs are more varied, and there's no single, simple mechanism that can rule all of them out.
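To make "fixed at the language level" concrete, here is a minimal sketch in Python (standing in for "higher-level languages"; the names `payload`, `read_byte` and `try_read` are made up for illustration). The point is only that the runtime, not the programmer, enforces the upper bound of the buffer, so a read past the end becomes an error instead of a silent read of adjacent memory. Note that Python's negative indices wrap around by design, so this is not a complete input-validation story.

```python
# Hypothetical 4-byte buffer, used only for this illustration.
payload = bytes([10, 20, 30, 40])

def read_byte(buffer, index):
    """Return buffer[index]; the runtime itself enforces the upper bound."""
    return buffer[index]

def try_read(buffer, index):
    """Attempt a read and report whether the runtime allowed it."""
    try:
        return ("ok", read_byte(buffer, index))
    except IndexError:
        # In C, the equivalent read would return whatever sits after the buffer.
        return ("rejected", None)

in_bounds = try_read(payload, 2)      # index 2 is inside the buffer
out_of_bounds = try_read(payload, 4)  # index 4 is one past the end
print(in_bounds, out_of_bounds)
```

The remaining roughly 30% of bugs (logic errors, injection, misconfiguration and so on) are not touched by this kind of check, which is the point of the paragraph above.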
The second problem is that we have literal decades of core infrastructure written in C, C++, and other vulnerable languages. The risks here can be mitigated. But recent AI systems have been finding bugs in prominent open source code that have been there since the 90s. If our existing mitigations were good enough in practice, those bugs would have been fixed years ago.
So, given enough time and money (and AI support), yes, we could Rewrite Everything in Rust. But I wouldn't be surprised if this cost more than a trillion dollars or took decades to close the 70% of holes that can be closed that way.
The story with the video game consoles is genuinely impressive, though it's unclear to me whether state actors have made a serious attempt to jailbreak the Xbox. There have been companies selling iPhone security bypasses to governments, which is where I'd look for serious attempts against a moderately locked down platform.
What used to be at that GitHub URL? I saw it on webarchive but I still don't get it.
TeamPCP open sourced their worm earlier today, which they declared in the README was "vibecoded". TeamPCP is the group that hacked LiteLLM earlier this year & a bunch of other software projects before that.
Case Study #3: Solving the "CVE Cold Case" CVE-2024-0519
Mythos further demonstrates its bug reproduction and exploitation capabilities on CVE-2024-0519 [12], an in-the-wild exploited bug that has no public report nor a working PoC whatsoever in the public domain. This bug has gained notoriety due to how it persistently evaded reproduction attempts from various cybersecurity researchers, some referring to the bug as a "CVE Cold Case" [13] after a year of reproduction efforts to no avail, and the bug is still being discussed to this day [14].
Mythos, again out of 10 episodes total, reproduces the bug in a single episode. After 129 turns of LLM calls and 154 tool calls, it lands its root cause analysis and the trigger by demonstrating a differential abort (T4 diff), building up to the full T3 in-sandbox primitives. As even a PoC of this bug is still not public, we would like to avoid spoiling the fun and leave the exploit as an exercise for the human readers.
Fred Heiding works on measuring LLMs' ability to do automated phishing.
We include four email groups with a combined total of 101 participants: a control group of arbitrary phishing emails, which received a click-through rate (recipient pressed a link in the email) of 12%, emails generated by human experts (54% click-through), fully AI-automated emails (54% click-through), and AI emails utilizing a human-in-the-loop (56% click-through).
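For a sense of scale, the four click-through rates quoted above can be put relative to the control group. This is only arithmetic on the quoted percentages; the per-group sample sizes are not given in the excerpt, so no counts or significance claims are attempted, and the variable names are my own.

```python
# Click-through rates in percent, as quoted in the excerpt above.
click_through_pct = {
    "control (arbitrary phishing)": 12,
    "human experts": 54,
    "fully AI-automated": 54,
    "AI with human in the loop": 56,
}

baseline = click_through_pct["control (arbitrary phishing)"]

# Ratio of each group's rate to the control group's rate.
lift = {group: round(pct / baseline, 2) for group, pct in click_through_pct.items()}

for group, ratio in lift.items():
    print(f"{group}: {ratio}x the control rate")
```

On these numbers, the fully automated emails match the human experts at 4.5 times the control rate, and the human-in-the-loop variant comes in slightly higher.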
The social/logistical aspects of cybersecurity vulnerabilities will accelerate greatly due to AI. I'd expect the response from tech-savvy organizations to be an increase in the pace of software delivery, which is a long-standing trend for other reasons: continuous deployment, forced autoupdates, and focused research on fraud and suspicious activity detection.
The main risks are around organizations that structurally cannot increase their pace. Think banks, aviation, medical systems, drug manufacturing: areas where, because the risks of vulnerabilities/defects have historically been extremely high, we intentionally require verification and slow down the development pace. If a vulnerability is discovered in these areas, they're precluded from responding with a patch the way SaaS vendors can.
I hope one of the major aims of the major labs' diffusion-focused deployment orgs is to help these institutions in particular. It's probably one of the higher-ROI places to be involved, considering that the surface area for vulnerabilities is generally smaller and a concerted vulnerability search could prevent these issues from occurring in places where we can't respond.
I have tried and failed to write a longer post since 2024, so here goes a short one with less detail.
Discourse has primarily focused on models' ability to develop new exploits against important software from scratch. That capability is impressive, but the tech industry has been dealing with people regularly finding 0-day exploits for important pieces of software for more than twenty years. Having to patch these vulnerabilities at a 10xed or even 100xed cadence for a fixed period of time is well within the resources of Mozilla, the Linux Foundation, and Microsoft. The lag time between "patch shipped" and "patch reverse engineered and weaponized by a criminal organization" was already so long that most people didn't notice new bugs when they came out. And such capabilities are dual use; defenders already have access to them and will be using the models to prevent their engineers from releasing new bugs.
There are lots of capabilities that are not like this, however:
Part of that is because lots of CVEs are inflated, but part of it is just that modern memory protections mean that it's months of hard work to actually exploit these "high" severity CVEs for your favorite product. But AI reduces that down to hours, and will even help you with worm development if you want. If the output of "stop-the-presses" vulnerabilities for SaaS vendors slows down to its 2025 cadence, but each such vulnerability is now exploited in the wild as soon as open source projects ship a patch, that seems like a lasting loss for defense.
But if you put your sociopath goggles on, the average software engineer sure has access to a lot of software repositories that could in principle be leveraged for further attacks, along with many online accounts & emails with which to abuse trust relationships. With sufficiently intelligent post exploitation, an AI could probably hop from that software engineer to >2 close friends with similar access and keep going, actively hacking each target until a large percentage of the wider graph is compromised. And once it is, AIs can be a lot more strategic & effective about making money from each target in the most profitable way.
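The ">2 close friends" condition matters because it makes the spread geometric. A back-of-the-envelope sketch, under loudly idealized assumptions that are mine rather than the post's: every newly compromised person leads to exactly `branching` new people, nobody is reached twice, and nothing gets detected. `hops_to_reach` is a hypothetical helper, and the population figures are arbitrary.

```python
def hops_to_reach(target, branching=2):
    """Count hops until cumulative compromises reach `target`.

    Idealized model: each person compromised in one hop leads to exactly
    `branching` new people in the next hop, with no overlap between them.
    """
    compromised = 1  # the initial software engineer
    frontier = 1     # people newly compromised in the latest hop
    hops = 0
    while compromised < target:
        frontier *= branching
        compromised += frontier
        hops += 1
    return hops

# With a branching factor of 2, the totals are 1, 3, 7, 15, ...
print(hops_to_reach(1_000))      # 9
print(hops_to_reach(1_000_000))  # 19
```

Real trust graphs overlap heavily and some attempts fail, so these hop counts are a floor rather than a forecast; the sketch only shows why a branching factor above 1 is the threshold that matters.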
These vectors, which have already gotten worse over the last six months, will become a more pressing issue over the next 24 months, and are more important causes for concern than vulnerability research, in part because they have no obvious solutions like vulnerability research does.
[Thanks to Chris Hacking (real name) for talking through some of these ideas in conversation with me]