Zero-Knowledge Cooperation

[-]MrMind8y60

There's a detail I've not understood: what is it that A passes back to B? The encrypted output or the decrypted output? If it's the decrypted output of the Validator, how does B verify that the signature is correct, since the Validator signed the encrypted output?

[-]philh8y40

If I understand you and FHE correctly: that's what the FHE is doing.

1. A calculates msg_1 = Encrypt(A_source, A_key) and sends it to B.
2. B sends msg_2 = Sign(Validate(msg_1), B_key) to A.
3. A sends msg_3 = Decrypt(msg_2, A_key) to B. By the magic of FHE, msg_3 == Sign(Validate(A_source), B_key), even though A doesn't know B_key and B doesn't know A_source.

[-]bryjnar8y20

Pretty much! Expanding your explanation a little:

1. A computes msg_1 = Encrypt(A_source, A_key) and sends it to B.
2. B wants to run Validate(source) = Sign(Check_trustworthy(source), B_key) on A_source, but can't do that directly because B only has an encrypted version.
2.1. So B runs Validate under FHE on msg_1, producing msg_2 = Encrypt(Validate(A_source), A_key), and sends that to A.
3. A decrypts msg_2, producing msg_3 = Validate(A_source) = Sign(Check_trustworthy(A_source), B_key), and sends that back to B (if it meets the agreed-on format).
4. B now has a claim that A's source is trustworthy, signed by B's key. A can't have that key, so the claim must have been produced by B's program.

Step 2.1 is where the magic happens.

(I should have just put this in the post!)
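The message flow in the steps above can be sketched in Python. This is only a mock, under loud assumptions: `Ciphertext`, `encrypt`, `decrypt` and `fhe_eval` are hypothetical stand-ins that keep the plaintext in the clear and provide no security, and an HMAC stands in for `Sign` (B is the only party who ever verifies the signature, so a keyed MAC is enough for the illustration). The `check_trustworthy` policy is invented purely for the example.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Ciphertext:
    """MOCK ciphertext: stores the plaintext directly. Not secure, not real FHE."""
    key_id: bytes
    hidden: bytes


def encrypt(plaintext: bytes, key: bytes) -> Ciphertext:
    return Ciphertext(hashlib.sha256(key).digest(), plaintext)


def decrypt(ct: Ciphertext, key: bytes) -> bytes:
    if ct.key_id != hashlib.sha256(key).digest():
        raise ValueError("wrong key")
    return ct.hidden


def fhe_eval(program: Callable[[bytes], bytes], ct: Ciphertext) -> Ciphertext:
    """Stand-in for homomorphic evaluation: the result stays under A's key."""
    return Ciphertext(ct.key_id, program(ct.hidden))


def check_trustworthy(source: bytes) -> bytes:
    # Hypothetical policy, for illustration only.
    return b"0" if b"defect" in source else b"1"


def make_validator(b_key: bytes) -> Callable[[bytes], bytes]:
    def validate(source: bytes) -> bytes:
        verdict = check_trustworthy(source)
        tag = hmac.new(b_key, verdict, hashlib.sha256).digest()
        return verdict + tag  # Sign(Check_trustworthy(source), B_key)
    return validate


def well_formed(msg: bytes) -> bool:
    # A's only check: one verdict byte followed by a 32-byte tag.
    return len(msg) == 33 and msg[:1] in (b"0", b"1")


def verify(msg: bytes, b_key: bytes) -> bool:
    verdict, tag = msg[:1], msg[1:]
    expected = hmac.new(b_key, verdict, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected) and verdict == b"1"


def run_protocol(a_source: bytes) -> bool:
    a_key, b_key = secrets.token_bytes(32), secrets.token_bytes(32)
    msg_1 = encrypt(a_source, a_key)                  # step 1 (A)
    msg_2 = fhe_eval(make_validator(b_key), msg_1)    # step 2.1 (B)
    msg_3 = decrypt(msg_2, a_key)                     # step 3 (A)
    if not well_formed(msg_3):
        raise ValueError("A refuses to return malformed output")
    return verify(msg_3, b_key)                       # step 4 (B)
```

In a real scheme, `fhe_eval` is the part that would have to work without ever reading `ct.hidden`; here it simply reads it, which is exactly what the mock cannot demonstrate.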

[-]kvas8y30

I'm not sure I'm completely solid on how FHE works, so perhaps this won't work, but here's an idea of how B can exploit this approach:

1. Let's imagine that Check_trustworthy(A_source) = 1. After step 3 of the parent comment B would know E1 = Encrypt(1, A_key). If Check_trustworthy(A_source) returned 0, B would instead know E0 = Encrypt(0, A_key) and the following steps work similarly. B knows which one it is by looking at msg_3.
2. B has another program: Check_blackmail(X, source) that simulates the behaviour of an agent with the given source code in situation X and returns 1 if it would be blackmailable or 0 if not.
3. B knows Encrypt(A_source, A_key) and they can compute F(X) = Encrypt(Check_blackmail(X, A_source), A_key) for any X using the FHE properties of the encryption scheme.
4. Let's define W(X) = if(F(X) = E1, 1, 0). It's easy to see that W(X) = Check_blackmail(X, A_source), so now B can compute that for any X.
5. Profit?

[-]bryjnar8y30

I think your example won't work, but it depends on the implementation of FHE. If there's a nonce involved (which there really should be), then you'll get different encrypted data for the output of the two programs you run, even though the underlying data is the same.
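To make the nonce point concrete, here is a minimal sketch using a toy one-bit cipher. It is not FHE and not a recommended construction; `encrypt_bit`, `decrypt_bit` and `FIXED_NONCE` are hypothetical names invented for the illustration. With a fixed nonce, two encryptions of the same bit are identical, so comparing ciphertexts leaks the plaintext; with a fresh random nonce they differ, and the comparison tells B nothing.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

BitCiphertext = Tuple[bytes, int]  # (nonce, masked bit)
FIXED_NONCE = b"\x00" * 16         # models a scheme without a nonce


def _pad_bit(key: bytes, nonce: bytes) -> int:
    # One pseudorandom bit derived from the key and the nonce.
    return hmac.new(key, nonce, hashlib.sha256).digest()[0] & 1


def encrypt_bit(bit: int, key: bytes, nonce: Optional[bytes] = None) -> BitCiphertext:
    if nonce is None:
        nonce = secrets.token_bytes(16)  # fresh randomness per encryption
    return nonce, bit ^ _pad_bit(key, nonce)


def decrypt_bit(ct: BitCiphertext, key: bytes) -> int:
    nonce, masked = ct
    return masked ^ _pad_bit(key, nonce)


key = secrets.token_bytes(32)

# Without a nonce: equal plaintexts give equal ciphertexts, so the
# "compare F(X) with E1" trick works.
deterministic_match = (
    encrypt_bit(1, key, FIXED_NONCE) == encrypt_bit(1, key, FIXED_NONCE)
)

# With a nonce: the same plaintext encrypts differently each time
# (barring a negligible-probability nonce collision).
e1 = encrypt_bit(1, key)
f_x = encrypt_bit(1, key)
randomized_match = e1 == f_x
```

Both `e1` and `f_x` still decrypt to 1 under `key`; only the holder of the key can tell that they carry the same bit.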

But you don't actually need to do that. The protocol lets B exfiltrate one bit of data, whatever bit they like. A doesn't get to validate the program that B runs, they can only validate the output. So any program that produces 0 or 1 will satisfy A and they'll even decrypt the output for you.

That does indeed mean that B can find out if A is blackmailable, or something, so exposing your source code is still risky. What would be really cool would be a way to let A also be sure what program has been run on their source by B, but I couldn't think of a way to do this such that both A and B are sure that the program was the one that actually got run.
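The one-bit leak can be sketched as follows. This is a mock with no encryption at all, and every name in it (`a_accepts`, `a_decrypts_for_b`, the two checker functions and their string-matching rules) is hypothetical. The only thing it models is that A's check sees the decrypted output and never the program that produced it.

```python
from typing import Callable


def a_accepts(output: object) -> bool:
    # A can only validate the shape of the output: a single bit.
    return isinstance(output, int) and output in (0, 1)


def check_trustworthy(source: str) -> int:
    # The program A expects B to run (hypothetical policy).
    return int("defect" not in source)


def check_blackmail(source: str) -> int:
    # A different one-bit program that B might run instead.
    return int("give_in_to_threats" in source)


def a_decrypts_for_b(program: Callable[[str], object], a_source: str) -> int:
    # Mock of "B evaluates under FHE, A decrypts": A never sees `program`.
    output = program(a_source)
    if not a_accepts(output):
        raise ValueError("output does not meet the agreed-on format")
    return output
```

Here `a_decrypts_for_b(check_blackmail, a_source)` passes A's check just as readily as `a_decrypts_for_b(check_trustworthy, a_source)` does, while a program that tries to return the whole source is rejected.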

[-]Commander Zander8y40

A similar algorithm appears in Age of Em by Robin Hanson ('spur safes' in Chapter 14). Basically, a trusted third party allows copies of A and B to analyze each other's source code in a sealed environment, then deletes almost everything that is learned.

A and B both copy their source code into a trusted computing environment ('safe'), such as an isolated server or some variety of encrypted VM. The trusted environment instantiates a copy of A (A_fork) and gives it B_source to inspect. Similarly, B_fork is instantiated and allowed to examine A_source. There can be other inputs, such as some contextual information and a contract to discuss. They examine the code for several hours or so, but this is not risky to A or B because all information inside the trusted environment will mandatorily be deleted afterwards. The only outputs from the trusted environment are a secure channel from A_fork to A and one from B_fork to B. These may only ever output an extremely low-resolution one-time report. This can be one of the following 3 values: 'Enter into the contract with the other', 'Do not enter into the contract with the other', or 'Maybe enter the contract'.

This does require a trusted execution environment, of course.

I don't know if this idea is original to Hanson.
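A rough sketch of the safe's interface, as described above, might look like this. It is an assumption-laden illustration rather than anything from the book: `Report`, `Inspector` and `run_safe` are hypothetical names, the forks are passed in as plain callables instead of being instantiated from source, and "deletion" is modelled only by nothing else being returned.

```python
from enum import Enum
from typing import Callable, Tuple


class Report(Enum):
    ENTER = "Enter into the contract with the other"
    DO_NOT_ENTER = "Do not enter into the contract with the other"
    MAYBE = "Maybe enter the contract"


# An inspector takes (other_party_source, contract) and returns a Report.
Inspector = Callable[[str, str], Report]


def run_safe(
    a_source: str,
    b_source: str,
    contract: str,
    a_fork: Inspector,
    b_fork: Inspector,
) -> Tuple[Report, Report]:
    """Run both inspections and release only the two low-resolution reports."""
    report_to_a = a_fork(b_source, contract)  # A_fork examines B_source
    report_to_b = b_fork(a_source, contract)  # B_fork examines A_source
    for report in (report_to_a, report_to_b):
        if not isinstance(report, Report):
            raise TypeError("only one of the three fixed reports may leave the safe")
    # Everything else learned inside the safe is discarded on return.
    return report_to_a, report_to_b
```

Restricting the return type to a three-valued enum is what keeps the channel low-resolution; a real safe would also need to enforce the one-time constraint and the deletion, which this sketch does not attempt.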

[-]bryjnar8y10

I haven't read Age of Em, but something like "spur safes" was an inspiration (I'm sure I've come across the idea before). My version is similar except that

1. It's stripped down.
1.1. B only needs to make a Validator, which could be a copy of themself, but doesn't have to be.
1.2. It only validates A to B, rather than trying to do both simultaneously. You can of course just run it twice in both directions.
2. You don't need a trusted computing environment.

I think that's a pretty big deal, because the trusted computing environment has to be trusted enough to run its end of A/B's secure channels. In order for A/B to trust the output, it would need to e.g. be signed by their private keys, but then the computing environment has access to those keys and can do whatever it wants! The trick with FHE is to let B run a computation using their secret key "inside" the safe without letting anyone else see the key.

Even if you have good security procedures you still want to do this. You should install fail2ban even if you think SSH is secure.
There is, of course, the small matter of proving that the source code you provided is the code you’re actually running, but this is independent of the security problem.
This protocol actually allows B to extract one bit of information of any kind from A’s secret data, so could be used for other purposes too. As far as I know it’s novel.
FHE is prohibitively expensive to perform at the moment, but we’re looking for a possibility proof here. Yes, I’m using a very large hammer to crack this nut.
