Verification and Transparency

DanielFilan

Verification and Transparency

by DanielFilan

danielfilan.blog

2 min read8th Aug 20196 comments

35 Ω 16

Computer ScienceInterpretability (ML & AI)AI

Frontpage

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Epistemic status: I’ve thought about this topic in general for a while, and recently spent half an hour thinking about it in a somewhat focussed way.

Verification and transparency are two kinds of things you can do to or with a software system. Verification is where you use a program to prove whether or not a system of interest has a property of interest. Transparency is where you use tools to make it clear how the software system works. I claim that these are intimately related.

Examples of verification

Proving that an alleged compiler actually implements the desired semantics of a system (for example, this verified implementation of ML).
Proving that a neural network’s classifications of a set of possible inputs are invariant under small perturbations to those inputs (for example, the system described in this paper).

Example of transparency

Sharing the source code of a program, rather than just compiled machine code (as encouraged by the open-source software movement).
Demonstrating the types of inputs that neurons in a neural network are sensitive to (techniques like this are discussed in the fantastic Building Blocks of Interpretability blog post).

How verification and transparency are sort of the same

Apart from aesthetic cases, the purpose of transparency is to make the system transparent to some audience, so that members of that audience can learn about the system, and have that knowledge be intimately and necessarily entangled with the actual facts about the system. In other words, the purpose is to allow the users to verify certain properties of the system. As such, you might wonder why typical transparency methods look different than typical verification methods, which also have as a purpose allowing users to verify certain properties of a system.

How verification and transparency are different

Verification systems typically work by having a user specify a proposition to be verified, and then attempting to prove or disprove it. Transparency systems, on the other hand, provide an artefact that makes it easier to prove or disprove many properties of interest. It’s also the case that engagement with the ‘transparency artefact’ need not take the form of coming up with a proposition and then attempting to prove or disprove it: one may well instead interleave proving steps and specification steps, by looking at the artefact, having interesting lemmas come to mind, verifying those, which then inspire more lemmas, and so on.

Intermediate things

Thinking about this made me realise that many sorts of things both serve verification and transparency purposes. Examples:

Type signatures in a strongly typed language can be seen as a method of ensuring that the compiler proves that certain errors cannot occur, while also giving a human reading the program a better sense of what various functions do.
A mathematics textbook containing a large numbers of theorems, lemmas, and proofs is made by proving a large number of propositions, and allows a reader to gain an understanding of the relevant mathematical objects by perusing the theorems and lemmas, as well as by looking at the structure of the proofs.

Computer ScienceInterpretability (ML & AI)AI

Frontpage

35 Ω 16

Mentioned in

91Research agenda: Formalizing abstractions of computations

14[AN #76]: How dataset size affects robustness, and benchmarking safe exploration by measuring constraint violations

Verification and Transparency

New Comment

6 comments, sorted by

top scoring

Click to highlight new comments since: Today at 6:18 PM

[-]Rohin Shah4yΩ460

Planned newsletter summary:

This post points out that verification and transparency have similar goals. Transparency produces an artefact that allows the user to answer questions about the system under investigation (e.g. "why did the neural net predict that this was a tennis ball?"). Verification on the other hand allows the user to pose a question, and then automatically answers that question (e.g. "is there an adversarial example for this image?").

[-]Ben Pace5yΩ120

The idea I took away here is that verification and transparency are complementary, in that increasing transparency makes verification easier, and that stronger verification power removes the need for more transparency.

[-]John_Maxwell5y20

Interesting insight!

I suppose an intermediate approach would be noticing that a system comes close to satisfying some compact formal property, generating an input which violates this property, and running it by the user.

[-]justinpombrio5yΩ020

I'm confused: you say that transparency and verification are the same thing, but your examples only seem to support that transparency enables verification. Is that closer to what you were trying to say?

Type signatures in a strongly typed language can be seen as a method of ensuring that the compiler proves that certain errors cannot occur, while also giving a human reading the program a better sense of what various functions do.

Yes! A programming language does this by restricting the set of programs that you can write, disallowing both correct and incorrect programs in the process. It has to, because it's infeasible (uncomputable, to be precise) to tell whether a program will actually hit "certain errors". For example, suppose that searchForCounterexampleToRiemannHypothesis() will run forever if the Riemann Hypothesis is true, and return true if it finds a counterexample. (This is a function you could write, I think.) Then if the Riemann Hypothesis is true, this program:

if (searchForCounterexampleToRiemannHypothesis()) {
    "string" / "stringy" // type error
}

is a perfectly fine infinite loop that never attempts to divide "string" by "stringy". Nevertheless, a (static) type system will overzealously disallow this perfectly dandy program because it can't tell any better.

So type systems weaken the language. While that example was concocted, there are more realistic examples where you have to go out of your way to satisfy the conservative type checker. But, to your point (and against my point in the beginning), they weaken the language by requiring type annotations that make the language easier for the type checker to reason about (verification), and also easier for people to understand (transparency). There is a tradeoff between verification&transparency and expressiveness.

[-]DanielFilan5yΩ230

[Y]ou say that transparency and verification are the same thing

It's important to me to note that I only claimed that they are "sort of the same".

[Y]our examples only seem to support that transparency enables verification. Is that closer to what you were trying to say?

No, but you've picked up a weakness in my exposition (or rather something that I just forgot to say). Verification also enables transparency: by verifying a large number of properties of a system, one provides a 'view' for a user to understand the system, just as a transparency method can itself be thought of as verifying some properties of a system: for example, sharing the source code of a binary verifies that that source code compiles into the given binary, and that the binary when executed will use such and such memory (if the source code is written in a language that makes that explicit), etc. As such, one can think of both verification and transparency as providing artefacts that prove certain properties of systems, although they 'prove' these properties in somewhat different ways.

There is a tradeoff between verification&transparency and expressiveness.

Indeed, an important point - although in many cases, this is a plus, since sometimes precisely the thing you want to do is to make it impossible to create certain malformed systems (e.g. type systems that ensure that you never attempt to divide one string by another). As such, these methods work better when one has good reasons to rule out the class of objects that cannot be transparified/verified by them.

[-]danieldewey4yΩ110

This seems like a useful lens -- thanks for taking the time to post it!

Moderation Log