IIUC, the good regulator theorem doesn't say anything about how the model of the system should be represented in the activations of the residual stream. I think the potentially surprising part is that the model is recoverable with a linear probe.

LessOnline Festival Updates Thread

kave9d20

(I would agree-react but I can't actually make it)

Ironing Out the Squiggles

kave11d1515

It seems unlikely that different hastily cobbled-together programs would have the same bug.

Is this true? My sense is that in, for example, Advent of Code problems, different people often write the same bug into their program.

Arjun Panickssery's Shortform

kave11d20

"Crucial to our disagreement" is 8 syllables to "cruxy"'s 2.

"Dispositive" is quite American, but has a more similar meaning to "cruxy" than plain "crucial". "Conclusive" or "decisive" are also in the neighbourhood, though these both feel like they're about something more objective and less about what decides the issue relative to the speaker's map.

An Unintentional Compliment

kave12d52

D&D.Sci forces the reader to think harder than anything else on this website

D&D.Sci smoothly entices me towards thinking hard. There's lots of hard thinking that can be done when reading a good essay, but the default is always to read on (cf. Feynman on reading papers), and often I just do that and skip the hard thinking.

D&D.Sci

kave13d191

Curated! This kicked off a wonderful series of fun data science challenges. I'm impressed that it's still going after over 3 years, and that other people have joined in with running them, especially @aphyer who has an entry running right now (go play it!).

Thank you, @abstractapplic, for making these. I don't think I've ever submitted a solution, but I often like playing around with them a little (nowadays I just make inquiries with ChatGPT). I particularly like that it nuanced my understanding of the supremacy of neural networks, and of when "just throw a neural net at it" might or might not work.

Here's to another 3.4 years!

Examples of Highly Counterfactual Discoveries?

kave14d20

Maybe "counterfactually robust" is an OK phrase?