Where can you stick a regulatory lever to prevent improper use of GPUs at scale?
Here's one option!
Between the cost of the hardware involved and the chain of custody information, it is unlikely that many small businesses would suffer excessive burdens as a result of the regulation. A company with 16 H100s is far less of a risk than a company with 140,000 H100s; you can probably skip the audit unless you have some reason to suspect the ostensibly small installations are attempting to exploit a loophole.
GPU cloud service providers (CSPs) obscure chain of custody and complicate audits. Getting sufficient regulatory oversight here will likely have the most consumer-visible impacts.
One option would be to require any compute requests above a certain threshold (either in terms of simultaneous compute or integrated compute over time) to meet KYC-style requirements.
Regulators could lean on the CSP to extract information from large-scale customers, and those large-scale customers could be directly required to report or submit to audits as if they owned the hardware.
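A threshold trigger like this could be sketched in a few lines. The numbers and field names below are purely illustrative assumptions, not proposed values; the point is just that both a burst trigger (simultaneous compute) and a rolling-usage trigger (integrated compute over time) are easy for a CSP to evaluate mechanically.

```go
package main

import "fmt"

// Illustrative thresholds only; actual values would be set by regulators.
const (
	maxSimultaneousGPUs = 1000   // simultaneous-compute trigger
	maxGPUHoursPerMonth = 100000 // integrated-compute trigger (rolling 30 days)
)

// Request summarizes what a CSP would track per customer.
type Request struct {
	GPUs            int     // accelerators requested at once
	GPUHoursUsed30d float64 // rolling 30-day usage including this request
}

// needsKYC fires if either the burst or the integrated-usage trigger trips.
func needsKYC(r Request) bool {
	return r.GPUs > maxSimultaneousGPUs || r.GPUHoursUsed30d > maxGPUHoursPerMonth
}

func main() {
	fmt.Println(needsKYC(Request{GPUs: 16, GPUHoursUsed30d: 5000}))   // small lab: false
	fmt.Println(needsKYC(Request{GPUs: 16, GPUHoursUsed30d: 250000})) // slow but large: true
}
```

The integrated-compute trigger matters because a customer could stay under any burst limit indefinitely while still accumulating a frontier-scale training run.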
Given the profit margins for GPU CSPs, I suspect most major AI labs would prefer building out their own infrastructure or making special deals (like OpenAI and Microsoft). The main value of having requirements on CSPs would be to plug an obvious loophole that bad actors might otherwise try to abuse because it's still cheaper than their other options for getting access to tons of compute.
It is unlikely that any small research group would ever actually encounter the regulatory system. That helps keep regulatory overheads low and focuses the effort on the more dangerous massive deployments.
This doesn't do anything against an extremely capable hostile actor in the limit. The hardware and software are in the hands of the enemy, and no amount of obfuscation can protect you forever.
In reality, though:
This also doesn't do anything to stop fully decentralized execution across consumer-grade hardware, but efficiently scaling large training runs over the internet with current approaches seems far more challenging than in the case of, say, Folding@Home.
I don't think it's reasonable to expect the regulatory agencies to hire an army of ML experts, so it's probably going to be mostly based on metrics that can be consumed by a bureaucracy:
Physical audits would presumably collect documentation and evidence (including through access to physical systems) that reported architectures and training schemes are actually being used. The relatively few ML experts involved in the auditing process would likely be fed much of the collected information to try to confirm that things add up. Realistically, an actively malicious corporation could slip some things past the regulators, but that's not really the threat model or goal of the regulation.
The main themes are:
In other words, collect rope with which to later hang a bad actor so as to disincentivize bad actors from bad acting in the first place.
Using signed drivers to brick rogue GPUs seems like an easy lever for regulation to use. This isn't a complete solution, nor does it preclude other options, but it seems relatively easy to implement and helpful for avoiding the most egregious types of driving off a cliff.
It's conceivable that extended regulations could be built on a similar foundation. For example, it could serve as an enforcement mechanism for restricting the amount of compute per entity to directly combat resource races.
The details could use a lot of refinement. I have no background in policy or governance, and I don't know what is legally realistic.
There are a variety of would-be competitors (plenty of smaller companies like Cerebras, but also major players from other markets like AMD and Intel), but they aren't yet taking market share from NVIDIA in the ML space.
Google's in-house TPUs are another option, and other megacorps like Microsoft, Apple, Amazon and friends have the resources to dump on making their own designs. If NVIDIA's dominance becomes a little too extractive, the biggest customers may end up becoming competitors. Fortunately, this would still be a pretty narrow field from a regulatory perspective.
The data center hardware does have other advantages in ML on a per-unit basis and isn't just vastly more expensive, but there is a reason why NVIDIA restricts the use of gaming-class hardware.
I'm pretty certain this is not a novel idea, but I think it deserves some signal boosting.
Regulation that required sending off automatic detailed technical reports of all code being executed and all data shuffled in and out would probably run into a lot of resistance, security concerns, and practical difficulties, with limited upside. I could be wrong (maybe there's some use case for this kind of reporting), but it seems like "regulators physically walk into your offices/data centers and audit the whole system at random times" is both stronger and easier to get agreement on.
NVIDIA has done this before in other fields. They've developed extensions that they knew their competitors didn't handle well, pushed for those extensions to be added to standardized graphics APIs like DirectX, and then strongly incentivized game developers to use those features aggressively. "Wow! Look at those tessellation benchmarks, NVIDIA is sooo much better!" and such.
NVIDIA probably wouldn't be on board with this part!
A hardware protection mechanism that needs to confirm permission to run by periodically dialing home would, even if restricted to large GPU installations, brick any large scientific computing system or NN deployment that needs to be air-gapped (e.g. because it deals with sensitive personal data, or particularly sensitive commercial secrets, or with classified data). Such regulation also provides whoever controls the green light a kill switch against any large GPU application that runs critical infrastructure. Both points would severely damage national security interests. On the other hand, the doom scenarios this is supposed to protect against would, at least as of this writing, probably be viewed by most cybersecurity professionals as an example of poor threat modelling (in this case, assuming the adversary is essentially almighty and that everything they do will succeed on their first try, whereas anything we try will fail because it is our first try).
In summary, I don't think this would (or should) fly, but obviously I might be wrong. For a point of reference, techniques similar in spirit have been seriously proposed to regulate use of cryptography (for instance, via adoption of the Clipper chip), but I think it's fair to say they have not been very successful.
> A hardware protection mechanism that needs to confirm permission to run by periodically dialing home would, even if restricted to large GPU installations, brick any large scientific computing system or NN deployment that needs to be air-gapped (e.g. because it deals with sensitive personal data, or particularly sensitive commercial secrets, or with classified data). Such regulation also provides whoever controls the green light a kill switch against any large GPU application that runs critical infrastructure. Both points would severely damage national security interests.
Yup! Probably don't rely on a completely automated system that only works over the internet for those use cases. There are fairly simple (for bureaucratic definitions of simple) workarounds. The driver doesn't actually need to send a message anywhere; it just needs a token. Airgapped systems can still be given those small cryptographic tokens in a reasonably secure way (if it is possible to use the system in a secure way at all), and for systems where this kind of feature is simply not an option, it's probably worth having a separate regulatory path. I bet NVIDIA would be happy to set up some additional market segmentation at the right price.
The unstated assumption was that the green light would be controlled by US regulatory entities for hardware sold to US entities. Other countries could have their own agencies, and there would need to be international agreements to stop "jailbroken" hardware from being the default, but I'm primarily concerned about companies under the influence of the US government and its allies anyway (for now, at least).
> techniques similar in spirit have been seriously proposed to regulate use of cryptography (for instance, via adoption of the Clipper chip), but I think it's fair to say they have not been very successful.
I think there's a meaningful difference between attempts to regulate cryptography and regulating large machine learning deployments; consumers will never interact with the regulatory infrastructure, and the negative externalities are extremely small compared to compromised or banned cryptography.
The regulation is intended to encourage a stable equilibrium among labs that may willingly follow that regulation for profit-motivated reasons.
Extreme threat modeling doesn't suggest ruling out plans that fail against almighty adversaries, it suggests using security mindset: reduce unnecessary load-bearing assumptions in the story you tell about why your system is secure. The proposal is mostly relying on standard cryptographic assumptions, and doesn't seem likely to do worse in expectation than no regulation.
There is no problem with air gaps. Public key cryptography is a wonderful thing. Let there be a license file: a signed statement of the hardware ID and the duration for which the license is valid. You need the private key to produce a license file, but the public key suffices to verify it. Publish a license server that can verify license files and can be run inside air-gapped networks. Done.
I think another regulatory target, particularly around the distribution of individual GPUs, would be limiting enthusiast-grade hardware (as opposed to enterprise) to something like x GB of memory, where x is mandatorily readjusted every year based on risk assessments.
Something like this may be useful, but I do struggle to come up with workable versions that try to get specific about hardware details. Most options yield Goodhart problems, e.g. shift the architecture a little bit so that real world ML performance per watt/dollar is unaffected, but it falls below the threshold because "it's not one GPU, see!" or whatever else. Throwing enough requirements at it might work, but it seems weaker as a category than "used in a datacenter" given how ML works at the moment.
It could be that we have to bite the bullet and try for this kind of extra restriction anyway if ML architectures shift in such a way that internet-distributed ML becomes competitive, but I'm wary of pushing for it before that point because the restrictions would be far more visible to consumers.
In summary, maybeshrugidunno!