This article is part of a series of ~10 posts comprising a 2024 State of the AI Regulatory Landscape Review, conducted by the Governance Recommendations Research Program at Convergence Analysis. Each post will cover a specific domain of AI governance (e.g. incident reporting, safety evals, model registries, etc.). We’ll provide an overview of existing regulations, focusing on the US, EU, and China as the leading governmental bodies currently developing AI legislation. Additionally, we’ll discuss the relevant context behind each domain and conduct a short analysis.

This series is intended to be a primer for policymakers, researchers, and individuals seeking to develop a high-level overview of the current AI governance space. We’ll publish individual posts on our website and release a comprehensive report at the end of this series.

What cybersecurity issues arise from the development of frontier AI models?

One of the primary issues that has caught the attention of regulators is the protection of the intellectual property and sensitive data associated with frontier AI models (termed “dual-use foundation models” in US legislation and “general-purpose AI” (“GPAI”) in EU legislation). 

In particular, legislators are concerned that as frontier AI models grow more capable, unregulated access to the underlying code or capabilities of these models will result in dangerous outcomes. For example, current AI models can readily reproduce information hazards, such as instructions for developing homemade weapons or techniques for committing crimes. As a result, they’re typically trained during a fine-tuning phase to reject such requests. Bypassing the cybersecurity of such models could allow an attacker to strip out this fine-tuning, re-enabling dangerous requests. Other cybersecurity risks include exposing sensitive user data, or leaking proprietary ML architectural decisions to direct competitors & geopolitical adversaries (e.g. Chinese organizations, in the case of the US). 

Currently, the leading frontier AI models meet the following conditions, which are often collectively referred to as “closed-source” development: 

  • Are privately owned by a large AI lab (e.g. OpenAI, Anthropic, or Google)
  • Present an API interface to fine-tuned models designed to reject dangerous or adversarial inputs
  • Do not have publicly shared training data or codebases
  • Do not have publicly shared model weights, which would allow third parties to easily replicate the core functionality of the model
  • Encrypt and protect user data, such as LLM queries and responses
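The API gating in the second bullet can be illustrated with a minimal sketch. All names and the keyword list below are hypothetical; production systems rely on fine-tuning and learned classifiers rather than keyword matching:

```python
# Minimal sketch of API-gated access to a closed-source model.
# The blocked-topic list and function names are hypothetical
# illustrations, not any lab's actual safety stack.

BLOCKED_TOPICS = {"weapon synthesis", "malware development"}

def refuse_if_dangerous(prompt: str):
    """Return a refusal message if the prompt matches a blocked topic,
    else None."""
    lowered = prompt.lower()
    for topic in BLOCKED_TOPICS:
        if topic in lowered:
            return "I can't help with that request."
    return None

def handle_request(prompt: str) -> str:
    """Gate a request before it ever reaches the underlying model."""
    refusal = refuse_if_dangerous(prompt)
    if refusal is not None:
        return refusal
    # In a real deployment the prompt would be forwarded to the
    # fine-tuned model behind the API; here we return a placeholder.
    return f"[model response to: {prompt!r}]"
```

In this sketch, a benign prompt passes through to the (placeholder) model, while a prompt naming a blocked topic is refused before the model is ever invoked; anyone holding the raw weights could simply skip this gate.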

In contrast, open-source AI models typically share some combination of their training data, model code, and completed model weights for public and commercial use. 

Unlike open-source models, whose code and weights are publicly available by design, proprietary or closed-source models rely on stringent measures to safeguard such sensitive information. Preventing the theft or leakage of this information is critically important to the AI labs that develop these models, as it constitutes their competitive advantage and intellectual property. 

What cybersecurity issues are AI labs concerned about? 

Specifically, AI labs are concerned about the following risks: 

  • Leaking private user data would cause a company to violate key international privacy laws such as the GDPR, leading to substantial fines and loss of user trust.
  • Leaking the model weights of a frontier AI model would allow external parties to run the model independently and remove any fine-tuning that protects against adversarial inputs.
  • Leaking the codebase would allow competing labs to learn directly from an organization’s technical decisions, accelerating competition.
  • Leaking the training data would allow competing labs to better train their own models by incorporating that data, again accelerating competition.
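On the first of these risks, one common technical measure under privacy laws such as the GDPR is pseudonymizing user identifiers before they are stored alongside query logs. A minimal sketch; the key handling here is illustrative only, and a real deployment would use a managed secret:

```python
# Minimal sketch of pseudonymizing user identifiers before they are
# stored alongside LLM query logs. The key below is a hypothetical
# placeholder; real systems keep it in a secrets manager.
import hashlib
import hmac

SECRET_KEY = b"server-side-secret"  # illustrative, not a real key

def pseudonymize(user_id: str) -> str:
    """Replace a raw user ID with a keyed hash, so logs can be
    correlated per-user without storing the identity itself."""
    return hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256).hexdigest()

def log_query(user_id: str, query: str) -> dict:
    """Build a log record that never contains the raw identifier."""
    return {"user": pseudonymize(user_id), "query": query}
```

A keyed hash (HMAC) rather than a plain hash means an attacker who obtains the logs cannot re-derive identities by hashing guessed user IDs without also stealing the key.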

With effective security practices, it’s generally accepted that AI labs can prevent these forms of information from being leaked. Major tech corporations already use similar practices to protect their codebases and private user data from breaches. Nevertheless, given the complexity of cybersecurity and the number of potential targets, it is highly likely that a prominent AI lab will suffer a data breach involving a frontier AI model in the near future.

What cybersecurity issues are regulators concerned about? 

Regulators are similarly concerned about effective cybersecurity for the same domains, albeit with different motivations: 

  1. Leaking such data could benefit the R&D of geopolitical adversaries. In particular, the US government is highly invested in limiting the rate of AI development of Chinese organizations - leaking such data would counter these interests. 
  2. Leaking such data could allow third parties to develop unregulated access to potentially dangerous frontier AI models. Currently, governments have well-established methods to control closed-source models run by AI labs: regulating the labs themselves. If access to the source code of these frontier models were more widely distributed, regulators would lose their ability to control the usage and distribution of these models. 

Due to these interests, regulators are generally as invested in the cybersecurity of frontier AI models as the labs themselves are. Their incentives are well aligned in the case of cybersecurity for frontier models. However, in practice regulators have by and large left specific cybersecurity decisions up to independent parties, preferring broad requirements such as a “primary responsibility for information security” or “resilien[ce] against attack from third-parties”. Their enforcement of legislation such as the GDPR has been inconsistent and patchy.

What are current regulatory policies around cybersecurity for AI models?

China

China maintains a complex, detailed, and thorough set of data privacy requirements developed over the past two decades via legislation such as the PRC Cybersecurity Law, the PRC Data Security Law, and the PRC Personal Information Protection Law. Together, they constitute strong protections mandating the confidential treatment and encryption of personal data stored by Chinese corporations. Additionally, the PRC Cybersecurity Law imposes data localization requirements mandating that the user data of Chinese citizens be stored on servers in mainland China, giving the Chinese government more direct means to access and control the usage of this data. All of these laws apply to data collected from users of LLM services in China. 

China’s existing AI-specific regulations largely mirror the data privacy policies laid out in previous legislation, and often refer directly to such legislation for specific requirements. In particular, they extend data privacy requirements to the training data collected by Chinese organizations. However, they do not introduce any specific requirements for the cybersecurity of frontier AI models, such as properly securing model weights or codebases. 

China’s Deep Synthesis Provisions include the following: 

  • Article 7: Requires service providers to implement primary responsibility for information security, such as data security, personal information protection, and technical safeguards.
  • Article 14: Requires service providers to strengthen the management and security of training data, especially personal information included in training data.

China’s Interim Generative AI Measures include the following: 

  • Article 7: Requires service providers to handle training data in accordance with the Cybersecurity Law and Data Security Law when carrying out pre-training and optimization of models.
  • Article 9: Requires that service providers bear responsibility for fulfilling online information security obligations in accordance with the law.
  • Article 11: Requires providers to keep user input information and usage records confidential and not illegally retain or provide such data to others.
  • Article 17: Requires security assessments for AI services with public opinion properties or social mobilization capabilities.

The EU

The EU has a comprehensive data privacy and security law that applies to all organizations operating in the EU or handling the personal data of EU citizens: the General Data Protection Regulation (GDPR). In effect since 2018, it does not contain language specific to AI systems, but provides a strong base of privacy requirements for collecting user data, such as mandatory disclosures, purpose limitations, security, and rights to access one’s personal data.

The EU AI Act includes some cybersecurity requirements for organizations running “high-risk AI systems” or “general purpose AI models with systemic risk”. It generally identifies specific attack vectors that organizations should protect against, but provides little to no specificity about how an organization might protect against these attack vectors or what level of security is required.

Sections discussing cybersecurity for AI models include: 

  • Article 15: High-risk AI systems must be resilient against attempts by unauthorized third parties to exploit system vulnerabilities. Specific vulnerabilities include: 
    • Attacks trying to manipulate the training dataset (‘data poisoning’)
    • Attacks on pre-trained components used in training (‘model poisoning’)
    • Inputs designed to cause the model to make a mistake (‘adversarial examples’ or ‘model evasion’)
    • Confidentiality attacks or model flaws
  • Article 52d: Providers of general-purpose AI models with systemic risk shall:
    • Conduct adversarial testing of the model to identify and mitigate systemic risk
    • Assess and mitigate systemic risks from the development, market introduction, or use of the model
    • Document and report serious cybersecurity incidents
    • Ensure an adequate level of cybersecurity protection

The US

Compared to the EU and China, the US Executive Order on AI places the greatest priority on the cybersecurity of frontier AI models (beyond data privacy requirements), in line with the US’ developing interest in limiting Chinese access to US technologies. It establishes specific cybersecurity reporting requirements for companies developing dual-use foundation models, and tasks various agencies with reports investigating the cybersecurity implications of AI models across a number of domains.

Specific regulatory text in the Executive Order includes: 

  • Section 4.2: This section establishes reporting requirements to the Secretary of Commerce for measures taken to protect the model training process and weights of dual-use foundation models, including:
  1. Companies developing dual-use foundation models must provide information on physical and cybersecurity protections for the model training process, model weights, and the results of any red-team testing for model security
  2. Directs the Secretary of Commerce to define the technical conditions for which models would be subject to the reporting requirements in 4.2(a). Until defined, this applies to any model trained using 
    1. Over 10^26 integer or floating-point operations (FLOP)
    2. Over 10^23 FLOP if using primarily biological sequence data
    3. Any computing cluster with data center networking of over 100 Gbit/s and a theoretical maximum computing capacity of 10^20 FLOP/s for training AI.
  • Section 4.3: This section requires that a report be delivered to the Secretary of Homeland Security within 90 days on potential risks related to the use of AI in critical infrastructure sectors, including ways in which AI may make infrastructure more vulnerable to critical failures, physical attacks, and cyber attacks. 
    • It also requests that the Secretary of the Treasury issue a public report on best practices for financial institutions to manage AI-specific cybersecurity risks.
  • Section 4.6: The Secretary of Commerce shall solicit input for a report evaluating the risks associated with open-sourced model weights of dual-use foundation models, including the fine-tuning of open-source models, potential benefits to innovation and research, and potential mechanisms to manage risks. 
  • Section 7.3: The Secretary of HHS shall develop a plan [that includes the]... incorporation of safety, privacy, and security standards into the software-development lifecycle for protection of personally identifiable information, including measures to address AI-enhanced cybersecurity threats in the health and human services sector.
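The interim compute thresholds in Section 4.2 reduce to a simple numeric test. A sketch of that arithmetic; the function and parameter names are our own, not the Order’s:

```python
# Sketch of the Executive Order's interim Section 4.2 reporting
# thresholds. Function and parameter names are hypothetical.

def eo_reporting_required(training_ops: float,
                          primarily_bio_data: bool = False) -> bool:
    """Threshold check for a single training run: over 10^26 total
    operations generally, or over 10^23 if the model is trained
    primarily on biological sequence data."""
    threshold = 1e23 if primarily_bio_data else 1e26
    return training_ops > threshold

def cluster_reporting_required(network_gbit_s: float,
                               capacity_flop_s: float) -> bool:
    """Threshold check for a computing cluster: data center networking
    over 100 Gbit/s and theoretical capacity over 10^20 FLOP/s."""
    return network_gbit_s > 100 and capacity_flop_s > 1e20
```

For scale, a model trained with roughly 2×10^25 FLOP would fall just under the general threshold, which is why only a handful of the largest training runs are currently captured.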

The US does not have a comprehensive data privacy law similar to the GDPR or the PRC Personal Information Protection Law, nor a comprehensive cybersecurity law similar to the PRC Cybersecurity Law.

Convergence’s Analysis

User data of frontier AI models, and some forms of training data, will continue to fall under the jurisdiction of existing data privacy laws.

  • The mandatory protection of user data (such as encryption) has been well established and legislated over the past decade via legislation such as the GDPR and the PRC Personal Information Protection Law. In practice, these laws have been broadly effective at establishing baseline protections. There’s no clear reason to establish a separate set of regulations solely for the user data of AI models.
  • Training data used for developing AI models can sometimes include private or sensitive user data. As specified in China’s regulations, this data will also be protected under existing legislation, and specific clauses may be included to indicate that requirement.

Cybersecurity requirements beyond user privacy are likely to be targeted at a small group of leading AI labs. 

  • As evidenced by the US Executive Order’s approach to reporting requirements on cybersecurity, the US is primarily concerned about mitigating technological poaching of leading AI models and systemic risks. It has set a reasonably high threshold for reporting, excluding all but the top 3-4 labs at this time.
  • The majority of companies using frontier AI models are likely to pay for access via APIs from leading AI labs, and therefore do not face many of the cybersecurity risks described above. As a result, such legislation is likely to be more targeted at a small group of AI labs and more closely enforced than data privacy laws.

Frontier AI labs already have strong incentives to enforce the protection of their closed-source AI models. It’s unlikely that mandatory legislation will meaningfully impact their cybersecurity efforts.

  • Leading AI labs have significant resources and technical expertise, and a strong vested interest in protecting their IP. As a result, they typically have large teams dedicated to cybersecurity, and tend to operate state-of-the-art security practices. Though governments have clear interests in legislating these requirements, such rules are unlikely to drastically change how frontier AI labs approach cybersecurity.

Governments have historically been poor at enforcing data privacy requirements, and are mostly constrained to requiring reporting or reactively fining organizations after an incident occurs.

  • Practically, government agencies have not had the resources to conduct thorough audits of their cybersecurity requirements. As a result, enforcement of legislation such as the GDPR has been sporadic and inconsistent. We expect similar outcomes for cybersecurity laws around AI models.
  • In addition, legislative requirements around cybersecurity are intentionally vague because of their broad scope. For instance, the GDPR only requires that organizations “shall implement appropriate technical and organisational measures to ensure a level of security appropriate to the risk”. Such wording requires that each organization be considered on a case-by-case basis, and opens the door to protracted legal disputes over fines.
  • When securing model weights, code, and training data of frontier AI models, the types of cybersecurity required can be much more complicated, as each new domain opens up new attack vectors. Governmental agencies likely don’t have the capabilities to thoroughly evaluate the complex cybersecurity practices of frontier AI labs. However, having a significantly reduced number of organizations to track (primarily leading AI labs) may aid enforcement.