Louis Jaburi

https://cogeometry.com/

Comments
Lessons from the Iraq War for AI policy
Louis Jaburi · 3d · 10

And in fact southern Iraq was and is predominantly Shiite (and thus also more susceptible to Iranian influence). They, too, revolted against Saddam after the First Gulf War (https://en.m.wikipedia.org/wiki/1991_Iraqi_uprisings) and were euphoric about his fall.

Alexander Gietelink Oldenziel's Shortform
Louis Jaburi · 4mo · 20

I agree with the previous points, but I would also add the historical events that led to this.
Pre-WWI Germany was much more important and played the role that France plays today (maybe even more central); see the University of Göttingen at the time.

After two world wars, the German mathematics community was in shambles, with many mathematicians fleeing during that period (Grothendieck, Artin, Gödel, ...). The University of Bonn (and the MPI) were the post-war project of Hirzebruch to rebuild the math community in Germany.

I assume France was then able to rise as the hotspot, and I would be curious to imagine what would have happened in an alternative timeline.

Ambiguous out-of-distribution generalization on an algorithmic task
Louis Jaburi · 5mo · 10

In our toy example, I would intuitively associate the LLC with the test loss rather than the train loss. During training of a single model, it has been observed that test loss and LLC are correlated. Plausibly, for this simple model, the (final) LLC, train loss, and test loss are all closely related.

Ambiguous out-of-distribution generalization on an algorithmic task
Louis Jaburi · 5mo · 10

We haven't seen that empirically with the usual regularization methods, so I assume there must be something special going on with the training setup.

I wonder if this phenomenon is partially explained by scaling up the embedding and scaling down the unembedding by a factor (or vice versa). That should leave the LLC constant, since the network function is unchanged, but it will change the L2 norm.
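As a quick sanity check of this rescaling argument, here is a minimal sketch (the two-layer map, shapes, and weights are invented for illustration, not taken from the post): scaling the embedding by c and the unembedding by 1/c leaves the network function, and hence the LLC, unchanged, while the L2 norm of the parameters changes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer piece of a network: x -> W_out @ ReLU(W_in @ x).
# Shapes and weights are made up; only the positive homogeneity of ReLU matters.
W_in = rng.normal(size=(16, 8))    # "embedding"
W_out = rng.normal(size=(4, 16))   # "unembedding"
x = rng.normal(size=8)

relu = lambda z: np.maximum(z, 0)

c = 3.0
W_in_s, W_out_s = c * W_in, W_out / c  # the rescaling in question

# ReLU(c*z) = c*ReLU(z) for c > 0, so the network function is unchanged ...
f = W_out @ relu(W_in @ x)
f_s = W_out_s @ relu(W_in_s @ x)
assert np.allclose(f, f_s)

# ... and since the rescaling is a smooth reparametrization that preserves
# the loss, the LLC at the rescaled parameter is the same. The L2 norm is not:
print(np.sum(W_in**2) + np.sum(W_out**2))      # original norm
print(np.sum(W_in_s**2) + np.sum(W_out_s**2))  # different value
```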

Against blanket arguments against interpretability
Louis Jaburi · 6mo · 10

The relevant question then becomes whether the "SGLD" sampling techniques used in SLT for measuring the free energy (or technically its derivative) actually converge to reasonable values in polynomial time. This is checked pretty extensively in this paper for example.

The linked paper considers only large models that are DLNs (deep linear networks). I don't find this compelling evidence for large models with non-linearities. Other measurements I have seen for bigger/deeper non-linear models seem promising, but I wouldn't call them robust yet (though it is not clear to me whether this is an SGLD implementation/hyperparameter issue or a more fundamental problem).

As long as I don't have a clearer picture of the relationship between free energy and training dynamics under SGD, I agree with the OP that the claim is too strong.
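For concreteness, here is a minimal sketch of the kind of SGLD-based LLC estimation under discussion (the model, data, and hyperparameters are invented; it uses full-batch Langevin steps rather than true minibatch SGLD, and real measurements need careful tuning): sample around a trained optimum w* and estimate λ̂ = nβ(E[L] − L(w*)).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup (not the paper's): least-squares regression in d = 2.
# The model is regular here, so the true learning coefficient is d/2 = 1,
# which gives the printout below something to be sanity-checked against.
n, d = 512, 2
X = rng.normal(size=(n, d))
w_true = np.array([0.5, -0.3])
Y = X @ w_true + 0.1 * rng.normal(size=n)

def loss(w):
    return np.mean((X @ w - Y) ** 2)

def grad(w):
    return 2 * X.T @ (X @ w - Y) / n

w_star = np.linalg.lstsq(X, Y, rcond=None)[0]  # trained optimum

beta = 1.0 / np.log(n)   # inverse temperature used in the usual LLC estimator
gamma = 1.0              # localization strength around w_star
eps, steps, burn = 1e-4, 50_000, 5_000

# Langevin chain targeting exp(-n*beta*L(w) - (gamma/2)*||w - w_star||^2).
w = w_star.copy()
losses = []
for t in range(steps):
    noise = rng.normal(size=d) * np.sqrt(eps)
    w = w - (eps / 2) * (n * beta * grad(w) + gamma * (w - w_star)) + noise
    if t >= burn:
        losses.append(loss(w))

# LLC estimator: lambda_hat = n * beta * (E_posterior[L] - L(w_star)).
lambda_hat = n * beta * (np.mean(losses) - loss(w_star))
print(f"lambda_hat ~ {lambda_hat:.2f}  (theory for this regular model: d/2 = 1)")
```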

Activation space interpretability may be doomed
Louis Jaburi · 6mo · 10

I see, thanks for sharing!

Activation space interpretability may be doomed
Louis Jaburi · 6mo · 10

Did you use something like LSAE as described here? By brittle, do you mean w.r.t. the sparsity penalty (and other hyperparameters)?

Activation space interpretability may be doomed
Louis Jaburi · 6mo · 10

Thanks for the reference. I wanted to illuminate the value of gradients of activations in this toy example, as I have been thinking about similar ideas.

I personally would be pretty excited about attribution dictionary learning, but it seems like nobody has done that on bigger models yet.

Activation space interpretability may be doomed
Louis Jaburi · 6mo · 30

Are you suggesting that there should be a formula, similar to the one in Proposition 5.1 (or 5.2), that links information about the activations, I(z;x) + TC(z), with the LC as a measure of basin flatness?

Activation space interpretability may be doomed
Louis Jaburi · 6mo · 30

I played around with the x^2 example as well and got similar results. I was wondering why there are two more dominant PCs: if you assume there is no bias, then the activations will all look like λ·ReLU(E) or λ·ReLU(−E), and I checked that the two directions found by the PCA approximately span the same space as ⟨ReLU(E), ReLU(−E)⟩. I suspect something similar is happening with bias.

In this specific example there is a way to get the true direction w_out from the activations: doing a PCA on the gradients of the activations. In this case it is easily explained by computing the gradients by hand: each gradient is a multiple of w_out.
Posts

Ambiguous out-of-distribution generalization on an algorithmic task (83 karma · 5mo · 6 comments)