Eric Neyman

I work at the Alignment Research Center (ARC). I write a blog on stuff I'm interested in (such as math, philosophy, puzzles, statistics, and elections): https://ericneyman.wordpress.com/

Sequences

Pseudorandomness Contest

Comments

We are interested in natural distributions over reversible circuits (see e.g. footnote 3), where we believe that circuits that satisfy P are exceptionally rare (probably exponentially rare).

Probably don't update on this too much, but when I hear "Berkeley Genomics Project", it sounds to me like a project that's affiliated with UC Berkeley (which it seems like you guys are not). Might be worth keeping in mind, in that some people might be misled by the name.

Echoing Jacob, yeah, thanks for writing this!

Since there are only exponentially many circuits, having the time-complexity of the verifier grow only linearly with log(1/ε) would mean that you could get a verifier that never makes mistakes. So (if I'm not mistaken) if you're right about that, then the stronger version of reduction-regularity would imply that our conjecture is equivalent to NP = coNP.
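To spell out the counting behind the first sentence (a sketch, writing ε for the fraction of circuits the verifier is allowed to err on):

```latex
\text{circuits with $n$ wires and $s = \mathrm{poly}(n)$ gates:}\quad
  N(n) \;\le\; 2^{O(s \log n)} \quad \text{(exponentially many)} \\
\text{choose}\quad \varepsilon \;=\; \tfrac{1}{2 N(n)}
  \;\Longrightarrow\; \text{``errs on at most an $\varepsilon$ fraction''
  means ``errs on zero circuits''} \\
\text{runtime} \;=\; O(\log(1/\varepsilon)) \;=\; O(s \log n) \;=\; \mathrm{poly}(n).
```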

I haven't thought enough about the reduction-regularity assumption to have a take on its plausibility, but based on my intuition about our no-coincidence principle, I think it's pretty unlikely to be equivalent to NP = coNP in a way that's easy-ish to show.

That's an interesting point! I think it only applies to constructive proofs, though: you could imagine disproving the conjecture by showing that for every V, there is some circuit that satisfies P but that V doesn't flag, without exhibiting a particular such circuit.

Do you have a link/citation for this quote? I couldn't immediately find it.

We've done some experiments with small reversible circuits. Empirically, a small circuit generated in the way you suggest has very obvious structure that makes it satisfy P (i.e. it is immediately evident from looking at the circuit that P holds).
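(For concreteness, here is a minimal sketch of the kind of experiment, using random Toffoli gates and a stand-in property rather than our exact setup:)

```python
import random
from itertools import product

def random_reversible_circuit(n_wires, n_gates, rng=random):
    """Sample a circuit as a list of Toffoli gates (c1, c2, t):
    wire t is flipped iff control wires c1 and c2 are both 1."""
    return [tuple(rng.sample(range(n_wires), 3)) for _ in range(n_gates)]

def run(gates, bits):
    bits = list(bits)
    for c1, c2, t in gates:
        if bits[c1] and bits[c2]:
            bits[t] ^= 1
    return tuple(bits)

# Brute-force check of a P-like property on a tiny circuit:
# "no input ending in k zeros maps to an output ending in k zeros".
n, k = 9, 3
circ = random_reversible_circuit(n, 30)
violations = [x for x in product([0, 1], repeat=n)
              if not any(x[-k:])           # input ends in k zeros
              and not any(run(circ, x)[-k:])]  # output ends in k zeros
print(f"{len(violations)} violating inputs")
```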

This leaves open the question of whether the same is true as the circuits get large. Our reasons for believing that it is are mostly based on the same "no-coincidence" intuition highlighted by Gowers: a naive heuristic estimate suggests that if there is no special structure in the circuit, the probability that it satisfies P is doubly exponentially small. So if C does satisfy P, it's probably because of some special structure.
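(To illustrate the shape of that estimate, with placeholder exponents rather than the actual parameters of P: if satisfying P requires roughly 2^{2n} independent events to all go the right way, each holding with probability 1 − 2^{−n}, then)

```latex
\Pr\big[\text{structureless } C \text{ satisfies } P\big]
  \;\approx\; \left(1 - 2^{-n}\right)^{2^{2n}}
  \;\approx\; e^{-2^{n}},
```

which is doubly exponentially small in n.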

Is this a correct rephrasing of your question?

It seems like a full explanation of a neural network's low loss on the training set needs to rely on lots of pieces of knowledge that it learns from the training set (e.g. "Barack" is usually followed by "Obama"). How do random "empirical regularities" about the training set like this one fit into the explanation of the neural net?

Our current best guess about what an explanation looks like is something like modeling the distribution of neural activations. Such an activation model would end up having baked-in empirical regularities, like the fact that "Barack" is usually followed by "Obama". So in other words, just as the neural net learned this empirical regularity of the training set, our explanation will also learn the empirical regularity, and that will be part of the explanation of the neural net's low loss.
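(As a cartoon of what "baking in" such a regularity could look like -- a toy bigram model, not our actual formalism for activation models:)

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the training set.
corpus = "Barack Obama spoke . Barack Obama smiled . Michelle Obama spoke .".split()

# Fit a bigram model: an "activation model" in miniature, whose
# parameters bake in regularities like "Barack" -> "Obama".
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def p_next(prev, nxt):
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

# The regularity is now an explicit part of the explanation:
print(p_next("Barack", "Obama"))  # 1.0 on this toy corpus
```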

(There's a lot more to be said here, and our picture of this isn't fully fleshed out: there are some follow-up questions you might ask to which I would answer "I don't know". I'm also not sure I understood your question correctly.)

Yeah, I did a CS PhD in Columbia's theory group and have talked about this conjecture with a few TCS professors.

My guess is that P is true for an exponentially small fraction of circuits. You could plausibly prove this with combinatorics (given that e.g. the first layer randomly puts inputs into gates, which means you could try to reason about the class of circuits that are the same except that the inputs are randomly permuted before being run through the circuit). I haven't gone through this math, though.
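(One way to set up that symmetrization, assuming the circuit distribution is invariant under permuting the input wires before the first layer:)

```latex
\Pr_{C}\big[P(C)\big]
  \;=\; \mathbb{E}_{C}\Big[\Pr_{\sigma \sim S_n}\big[P(C \circ \sigma)\big]\Big],
```

so one could try to bound, for a typical fixed C, the fraction of input permutations σ under which the permuted circuit satisfies P.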

Thanks, this is a good question.

My suspicion is that we could replace "99%" with "all but exponentially small probability in n". I also suspect that you could replace it with 1 − ε, with the stipulation that the length of π (or the running time of V) will depend on ε. But I'm not exactly sure how I expect it to depend on ε -- for instance, it might be exponential in 1/ε.

My basic intuition is that the closer you make 99% to 1, the fewer circuits V is allowed to call "non-random-looking" (i.e. to flag on some advice π). And so V is forced to do more thorough checks ("is it actually non-random in the sort of way that could lead to P being true?") before outputting 1.

99% is just a kind-of lazy way to sidestep all of these considerations and state a conjecture that's "spicy" (many theoretical computer scientists think our conjecture is false) without claiming too much or getting bogged down in the details of how the "all but a small fraction of circuits" thing depends on n, the length of π, or the runtime of V.
