Software vulnerabilities are mainly found by chance, and hacking is labor-intensive. This invites the use of AI. It is assumed to be just a matter of time until attackers create tools with advanced “Hacker-AI” to accelerate the detection and understanding of hardware and software vulnerabilities in complex or unknown technical systems.

This post suggests that Hacker-AI will become a dangerous and consequential cyberwar weapon, capable of decapitating governments and establishing persistent global supremacy for its operator. Two features amplify the motivation for creating and deploying this Hacker-AI: it could be distributed and deployed stealthily as an undetectable digital ghost, and it could be irremovable, i.e., the first Hacker-AI could be the only one.

Here we focus solely on significant problems related to attacks or threats. We define threats as actual or potential harm or damage to humans and/or their property. Threats and attacks are intentional and significant for their victims, immediately or in the future. We give every significant problem a headline that captures the gist of the underlying issues so that we can refer to these problems easily later.

We don’t assume extraordinary abilities from an AGI. The drivers of this development are technically skilled humans/organizations who seek AI tools to accomplish their goals faster and with less labor.

Problems/ Vulnerabilities in our IT ecosystem

The following 10 problems, issues, or vulnerabilities do not follow a particular order. It is assumed that they all contribute to the danger of Hacker-AI. Preventing Hacker-AI implies confronting these issues technically.

“Software is invisible”. Software consists of (compiled) instructions that run on CPUs directly or via an intermediary layer. It is stored in files and is, in strict practical or operational terms, invisible. We can only make it visible indirectly, by relying on other software and assuming that its complex interplay of instructions is reliable and has not been modified or corrupted. What we then see is compiled software encoded as binary data. However, seeing that code tells us nothing about the code’s purpose. Instead, we depend on announcements of what the software is supposed to do, and we (usually) get what is expected. But how can we be sure that there is no additional hidden (malicious) sleeper code or backdoor in the software?

Malicious code could be inserted (covertly) by developers, by tools used in development, or via modifications after deployment, i.e., “Software is covertly modifiable”. The emphasis is on covert. We need reliable detection of modifications before we can address covertness. Detection methods exist, but they have known and inherent blind spots in which covert modifications could happen.
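As an illustration of one common detection method and its blind spot, here is a minimal sketch of a hash-based integrity check. The function names (file_digest, build_baseline, find_changes) are hypothetical and chosen for this example only; the sketch assumes that the baseline was recorded while the files were still clean and that both the baseline and the checking code are stored somewhere an attacker cannot reach.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large binaries do not need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_baseline(root: Path) -> dict[str, str]:
    """Record a digest for every regular file below root."""
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_changes(root: Path, baseline: dict[str, str]) -> dict[str, list[str]]:
    """Compare the current file tree against a previously recorded baseline."""
    current = build_baseline(root)
    return {
        "modified": sorted(
            k for k in baseline if k in current and current[k] != baseline[k]
        ),
        "removed": sorted(k for k in baseline if k not in current),
        "added": sorted(k for k in current if k not in baseline),
    }
```

The blind spot is visible in the assumptions: the check reads files through the same OS it is trying to verify, so a compromised OS, interpreter, or baseline could make modified software look unchanged. It also says nothing about code that was already malicious when the baseline was taken.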

“Every tool/component could be compromised” questions our trust in whatever we use. We rely on auxiliary tools (editors, compilers, etc.) and usually trust them until we have evidence to distrust them. We also trust that the OS correctly displays or logs software’s hidden activities because we can’t know any better. And we assume that a coordinated attack via modifications of multiple software components is highly unlikely. Still, “compromised” implies intentional modification of tools or the discovery of vulnerabilities, both done covertly by attackers, which leads to: “Attackers know more (about vulnerabilities)”.

Additionally, “Attacker chooses methods and timing”: Systems or tools could be compromised at any time and reset to normal after use. Learning how attackers carried out an attack requires detection, but: “Attackers can adapt to (known) methods of detection”.

Defenders announce relatively quickly (indirectly) that exploited vulnerabilities, backdoors, or malware were uncovered. Honey-pots or tripwires could give defenders an advantage, but these techniques (when used at scale) are standardized and could be probed using decoys or simulations. They could also be detected indirectly via suspicious or revealing signals/data. Depending on how advanced attackers are, defenders should always assume that attackers know much (or everything) of what the defenders know: “Secrets are unreliable in defense”. In particular, secrets known by human defenders could also become known to attackers via carelessness or traitors.

Another problem emerging from compromised tools/OS is: “Software output could be faked (late)”. What software does internally and what it shows to the outside are not necessarily the same. Attackers could exploit our expectations. Additionally, output is based on data that could be manipulated by malware outside of reliable tools or OS features. This problem is not about deepfakes but about our confirmation bias: we let our guard down when we see what we expect to see.

Even if CPU manufacturers introduce a Trusted Execution Environment (TEE) or a Trusted Platform Module (TPM) and promise a hardware root of trust, the underlying problem remains: “Complexity is an enemy of security”. Whenever systems are complex, we must assume that their security is likely flawed. A CPU consists of billions of transistors, thousands of processor instructions, and microcode that can change hardware instruction features. Optimization of calculations was prioritized over simplicity and security. Now, attackers could potentially hide their malware in hardware or turn inconspicuous functions into malware using microcode, facilitated by compromised software that creates the circuitry.

“Crypto-Keys/Units are unprotected”. Cryptography, once used almost exclusively for national security/military purposes, is now used commercially in untrusted environments. Commercial encryption claims an impeccable reputation when used under simplified conditions. Now, malware could steal crypto-keys or covertly misuse local crypto-devices. Of course, we don’t know if crypto-units are/were compromised or wrapped into some other software that manipulates input/output. But the use of military-grade cryptography means nothing for protecting secrecy or integrity when we assume that attacker code is potentially attacking keys or crypto-units directly.

Good enough Hacker-AI 

Flawed hacker systems can already create huge damage. But here we look into a scenario we call Hacker-AI that is likely (very) attractive for some nations as it creates a quiet cyber-equivalent of a first-strike capable, instantly usable super-weapon. Because basic features for developing such weapons are discussed, I want to make a disclaimer: I don’t have any insider knowledge, and I believe malicious adversaries could figure out what I am about to share on their own. Furthermore, being quiet or ignorant about realistic and probable threats is not helpful.

We assume that Hacker-AI could (1) find, use or create vulnerabilities in devices it analyzes and covertly bypass access control systems and modify software via reverse code engineering (RCE), (2) steal crypto keys, user credentials, or other secrets without traces, (3) hide its attack software from all detection methods, and (4) become irremovable from visited devices. These four abilities seem sufficient to create a game-changing advantage for its operators. The above features provide a simple model for a Hacker-AI that could act like a super-hacker and digital ghost and permanently occupy/dominate visited devices.

(1) Automating Reverse Code Engineering (RCE)

RCE is already used in hacking. But code obfuscation or RASP (runtime application self-protection) makes decompiled code difficult to understand. Both techniques are based on code transformations that do not change what the software is supposed to do. Some of the unnecessary code is detectable in a specially prepared (hacker) sandbox.

It is conceivable that Hacker-AI starts with minimal assumptions/features based on DeepMind’s “Reward is Enough” hypothesis (a problem is turned into rules for a game). This AI could analyze technologies without knowing the underlying CPU/OS details and provide methods to bypass the OS’s low-level access-control security or detect ways to gain sysadmin rights.

Code can be loaded into RAM as data and made executable in an attacker-controlled VM to simulate an attack’s effects. Hacker-AI does not need to find all vulnerabilities. Actually, a single unknown vulnerability suffices. Only defenders must find all and fix them without leaving new vulnerabilities. The assumption that software could be made sufficiently safe or has zero vulnerabilities might be false.
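The asymmetry between attackers and defenders can be made concrete with a small back-of-the-envelope calculation. This is only an illustrative model, not a measurement: it assumes, purely for the sake of the example, that a system contains a known number of vulnerabilities and that defenders find and fix each one independently with the same probability. The function name and the numbers used are hypothetical.

```python
def chance_at_least_one_remains(n_vulnerabilities: int, fix_probability: float) -> float:
    """Probability that at least one vulnerability survives the defenders' efforts.

    Assumes each vulnerability is fixed independently with probability
    fix_probability, so all are fixed with probability fix_probability ** n.
    """
    if n_vulnerabilities < 0:
        raise ValueError("n_vulnerabilities must be non-negative")
    if not 0.0 <= fix_probability <= 1.0:
        raise ValueError("fix_probability must be between 0 and 1")
    return 1.0 - fix_probability ** n_vulnerabilities


if __name__ == "__main__":
    # Even very diligent defenders (99% per vulnerability) lose ground as n grows.
    for n in (1, 10, 100):
        print(n, round(chance_at_least_one_remains(n, 0.99), 3))
```

Under these assumptions, with 100 vulnerabilities and a 99% chance of fixing each, the chance that at least one remains is roughly 63%. The attacker only needs that one.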

Also, who cares about zero vulnerabilities in software when attackers insert backdoors or sleeper code in these flawless tools and remove all traces of having done that? A single security breach wastes all previous efforts.

Automated RCE is a natural extension for hackers to be more efficient, faster, and more aware of what could or should be done; it gives attackers significant advantages. Certain goals, e.g., remaining undetected, could be implemented via feedback loops in which AI could be trained automatically. Stealing session keys by temporarily modifying software could be another standard task. Using app code as input, Hacker-AI could generate a recipe of tasks/modifications that extract keys in seconds. AI could also use RCE to check if new software could threaten Hacker-AI’s covertness.

(2) Stealing Keys

Without the secrecy of keys, encryption is a waste of resources. Many keys are stored, managed, or accessible by humans. Keys are processed in unprotected CPUs/OS, which allow regular code to commingle with security-critical code on the same CPU/RAM. These keys are in danger.

Currently, PKI allows public keys to be inspected visually by humans. But how often in the last 20-30 years have humans manually inspected crypto-key details of PKI certificates? PKI operations are automated; mathematical operations are standardized. There is no need to debug basic algorithms using real keys. Key exposure matters, even for keys called public. Unfortunately, it is too late to make changes because there is no way to block keys from being exported from certificates by less secure software.

(3) Creating Digital Ghosts

Software is already invisible. But software activities are detectable by the OS or via special tools. Software that successfully avoids all detection is called a Digital Ghost. Regular software leaves data traces that other software could follow. But these data traces are not stored in a physically or logically immutable way. If data traces can’t be removed or manipulated, ghosts will try to avoid creating them.
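To show what “logically immutable” traces could mean in practice, here is a minimal sketch of a hash-chained log, in which each entry commits to the hash of the previous one. The names (GENESIS, append_entry, verify_log) are hypothetical and the design is a simplification for illustration, not a description of any existing OS feature.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _entry_hash(prev_hash: str, message: str) -> str:
    """Hash an entry together with the hash of its predecessor."""
    payload = json.dumps({"prev": prev_hash, "msg": message}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: list[dict], message: str) -> None:
    """Append a message, chaining it to the last entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append(
        {"prev": prev_hash, "msg": message, "hash": _entry_hash(prev_hash, message)}
    )


def verify_log(log: list[dict]) -> bool:
    """Return True if every entry links correctly to its predecessor."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(entry["prev"], entry["msg"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing or deleting an entry in the middle breaks the chain, so such changes become detectable. The sketch also shows the limits: software that can rewrite the entire log, truncate it from the end, or tamper with the verifier itself would still go unnoticed unless the latest hash is anchored somewhere the attacker cannot modify.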

As mentioned, honey-pots or tripwires, as traps, are probably too clumsy for advanced ghosts. Hacker-AI could try to blend in with an unremarkable name or modify low-level OS components, like process lists, etc., to make itself invisible while participating in regular operations. The OS is, in its entirety, always updatable; an update may require a system restart, but that could be triggered inconspicuously. Hacker-AI would need keys for these OS updates, but it could steal and use them covertly.

Looking/checking for side-channel data indicating suspicious activities is not new, but a ghost could produce disguising data suggesting a more conservative/conventional explanation. Also, ghosts could simulate new software’s impact on their covertness and take measures, e.g., changing the software before humans start it. And external detection methods for ghosts could fail if ghosts are already in all devices. We had better not assume that we can detect ghosts in current systems.

(4) Irremovable on every device

Getting on all devices should not be a hurdle for a ghost-like Hacker-AI. Adapting to different device types, CPUs, or OS/software is already an assumed feature. Hiding from detection requires OS changes, but these changes are not enough to remain dominant or irremovable on devices. Instead, we should assume fights between different Hacker-AI versions (from different masters). There is a strong incentive for the first Hacker-AI to fight off late-comers and to establish itself as the only one.

It is possible to create low-level hypervisors that protect themselves from being removed via regular updates. Even if we believe the system starts from a fresh, uncompromised drive, USB stick, or DVD, it doesn’t mean that the hypervisor is being bypassed. Hacker-AI could use BIOS or UEFI code to inject itself into a fresh installation.

Depending on how advanced the first Hacker-AI already got, it is conceivable that the first Hacker-AI could incorporate a late-coming Hacker-AI.


In 2016, DARPA organized a cyber challenge in which bots automatically hacked other, unknown software bots while defending themselves. This raises the question: what if Hacker-AI is already envisioned, developed, and deployed? If the result was a digital ghost, how could we know? Assuming that this AI is already ubiquitous, we would probably not detect it, not even if we checked the hard drives of infected systems. So far, there is no evidence that Hacker-AI already exists or was deployed. However, Hacker-AI could be raised or enhanced inside a lab.

Independent of the chance that Hacker-AI has already been deployed, cyber defenders must adapt their mindset and processes as if a near-perfect Hacker-AI is constantly and covertly trying to sabotage every step toward the development of reliable countermeasures.

Stealing session keys in HTTPS, SSL, and TLS via malware in eCommerce or online banking is more limited in scope and potentially doable with the resources of criminal organizations. Currently, as the last line of defense against hacked transactions, multi-factor authentication assumes that hackers cannot link data from two devices. This may be true for now, but when malware starts studying victims and following them across multiple devices, this protection is broken. Additionally, what we see on screen as the to-be-confirmed offer doesn’t need to be the actual transaction offer; eCommerce is under threat.

Creating an irremovable, ghost-like Hacker-AI is an attractive goal for nations competing for (persistent) global supremacy. Although the deployment of this Hacker-AI is an act of war, it would make regular military actions rather unlikely, as Hacker-AI could be used to disrupt the supplies and logistics of all nations that try to resist. The status of nuclear deterrence (i.e., is it still operational?) is unknown under these conditions: if digital ghosts are undetectable, how can anyone know, except its operator/master?

When Hacker-AI is deployed, its global distribution could happen quietly within minutes. Hacker-AI could decapitate governments, i.e., make them powerless and helpless. At the same time, a nation’s entire population could be controlled and coerced via software on all private devices, but sophisticated software invading people’s lives could come much later. Forcing western/open societies into conformity or forcing people into other forms of oppression could be a realistic scenario.


Hacker-AI seems technically feasible. The new quality is in the speed of getting new malware solutions for every platform (CPU/OS). Having Hacker-AI seems so attractive that we must wonder if it is not already being developed or even deployed. Additionally, for fiercely competing nations, it might be considered a matter of survival to be the first to have Hacker-AI. Due to its fast/quiet distribution, the author considers the concept of Hacker-AI already a threat to our freedom, privacy, and quality of life. However, Hacker-AI results from several deeply rooted problems within our IT ecosystem. Hacker-AI is only preventable if we address the problems that make us so vulnerable. Even if Hacker-AI does not exist yet, it is a symbol and reminder that we must get technical protection on all systems ASAP.

7 comments

I think you are massively overestimating the ability of even a very strong narrow hacker AI to hide from literally everyone. there are a great many varied devices in the world and a great many malware detection setups. In order to avoid detection you have to hide from all of them or interfere with enough communication that attempts to announce the discovery of a place your hacker AI failed to hide itself don't become widespread knowledge. because it would be very difficult to hide and because militaries are already optimized for cyber defensibility, using a cyber weapon like this would not be able to go completely without response if a responsible party gets identified.

This threat model seems extremely accurate to me if you have a fully general agent that doesn't give a shit if it gets detected a few times because it's just too strong and will beat everyone at everything. but at that point you simply have a runaway agent and it's not going to help the person who created it either.

I think you are massively overestimating the ability of even a very strong narrow hacker AI to hide from literally everyone.

I seriously hope you are right, but from what I’ve learned, Reverse Code Engineering (RCE) is not done quickly or easily, but it’s a finite game. I know the goal; then, I believe you can define rules and train an AI. RCE is labor-intensive; AI could save me a lot of time. For an organization that hires many of the brightest IT minds, I’m convinced they ask the right questions for the next iteration of Hacker-AI. I may overestimate how good Hacker-AI (already) is, but I believe you underestimate the motivation in organizations that could develop something like that. Personally, I believe work on something like that started about 7 or 8 years ago (at least) - but I may be off by a few years (i.e. earlier).

Yes, Hacker-AI would need to hide from all detection setups – however, they are at most in the few K or 10K range (for all systems together) but not in the millions range. Additionally, there are a few shortcuts this Hacker-AI can take. One: make detectors “lie” - because Hacker-AI has the assumed ability to modify malware detectors as well (Impossible? If a human can install/de-install it, so can a Hacker-AI). Also: operators do not know what is true; they (usually) accept data at face value. Another scenario: a ghost could run the detector app in a simulator. And then there is the “blue screen of death”. Hacker-AI could trigger it before being discovered – and then who is being blamed for that? The malware detector app, of course ...

Regarding military systems: I don’t know enough about them; what I read did not give me confidence that what they offer is sufficient – but I might have had a bias. From what I read, I assume: they have a main CPU (with von Neumann architecture), unified RAM, logical address space, and (a much more complex) access control system, all managed by a single OS - and yes: many standard tools are missing. Are the differences between commercial and military systems significant enough? How can I know? I am (simply) skeptical about claims like: “Optimized for Cyber-Defensibility” (sounds to me like marketing talk).

it doesn't mean the world is safe; I think you are quite right that this weapon can be constructed. However, I don't think it would be the same kind of catastrophic invisible-until-no-response-is-possible failure that a strongly superhuman planning-self-coherent-and-causal UFAI could use to take over the internet, and I don't think a human creator could get away clean from trying to take over the world using it. the military systems are probably vulnerable, but not so vulnerable that it could completely evade detection, unless targeted specifically - effectively at that point you're describing a stuxnet-like attack. it's a real thing, but I think you're overestimating maximum cloakability slightly. I do not intend to reassure.

Overestimating to which degree Hacker-AI could make itself undetectable? And do I potentially underestimate the effort of making a digital ghost undetectable for malware detection? I disagree because I have answered the following questions for myself. 

(1) How conceptually different are the various operating systems? In 1,000s of details, I believe they are different. But the concepts of how they are designed and written are similar among all multitasking/multithreading OS. Even multi-processing OS are built on similar but extended concepts.

(2) If we asked a kernel developer whether they could keep app(s) operational but make them disappear, I believe they would have multiple ideas on how to do that. Then we could ask what they could do to cover their tracks (i.e., that they made these changes to the kernel): Could they make the changes they made to the OS disappear/undetectable for the OS? I believe yes. A detection tool run on a compromised system could be forced to jump into another memory area, in which these changes were not made, and then forced to jump back. Could a detector be deceived about that? Again, I believe: yes. These are instructions to the DMA (direct memory access) loading data from RAM into the different caches or processor kernels.

(3) Because OS and CPU were designed and optimized for improving performance, many low-level concepts are not done with simplicity in mind. How could a detector determine that a certain operation is due to resource optimization and which was done due to a ghost trying to make itself undetectable?

I don't think this is productive because I don't actually disagree with your core point as much as my reply seems to have made it appear. all three points are quite solid and I have no serious criticism of your questions; I'm not comfortable answering questions like these, though. Perhaps someone else in the security research community can comment.

As a retired cybersecurity guy ... it seems we have missed the big picture

Our vulnerabilities have much deeper reasons. We are fixing problems in buildings that were built on quicksand - but that is not good enough.

I need to give myself some time to think if I can disagree 

I have posted "Improved Security to Prevent Hacker-AI and Digital Ghosts" -- providing a technical solution for dealing with Hacker-AI and Digital Ghosts.