artemg - LessWrong

The provable training part seems achievable using the existing cryptography and without requiring Confidential Computing techniques.

For example, by committing to weights at the snapshot point of time using a Hash function and also generating proof that, indeed, at least amount of T has passed between each snapshot using Verifiable Delay Functions (VDFs), we can already achieve a rate limit at how quick someone can train a model. In addition to that, to prove that we have been consistent and the next weight checkpoint was correctly produced by applying X training operations from the previous checkpoint, we can use zk-SNARKs to generate a succinct proof. As this is essentially a continuous process of proving computations, we can also apply ideas from incrementally verifiable computations (IVC), making the scheme even more efficient. I must note that SNARKs still need to be sufficiently efficient to prove extensive computations, such as training of neural networks such as GPT-4, but there are already examples of using them to prove training of GPT-2. Furthermore, as the whole scheme can be made Zero Knowledge, we can have the prover proactively submit the proofs to the verifier to ensure they are consistently compliant.

To ensure that someone cannot ignore the regulation and do training hidden, we would need to require any capable training hardware to be actively registered with the officials. Requiring an active signature from a government server to enable doing X computations could be a solution. This should be achievable as SGX and similar providers have only incremental counters or alternative solutions. Furthermore, requesting computation integrity is a far less strict property on the SGX hardware than attestation, as it does not depend on crucial extraction attacks and would require actual chip-level alterations. Considering we are already using a lot of similar gadgets around ourselves, supported by ARM TEE, this should be achievable.

What would a compute monitoring plan look like? [Linkpost]

artemg1y10

I agree that the current state of the art does indeed very much limit the throughput of the models. However, considering the current effort in scaling the space and the amount of funding it has been receiving, I would hope that in 5-10 year time this would be a more negligible slow down. After all, it has been shown that several SNARKs and STARKs can achieve linear prover time, so right now it is all about fighting the constants and getting the schemes ever more efficient. And, considering how usually slow the governments are even with urgent matters, 5-10 year period should be just right for that. For the temporary solution we could reserve to SGX and other TEE providers if we have to, and trust that the patching they constantly release would work for now.

After all, this is better than nothing, and having some, even potentially not 100% effective solutions would be good for our cause. As long as we can gradually improve their security in parallel to growing concern over the AI threat.

LESSWRONG
LW

Posts

Wiki Contributions

Comments