Calculating Natural Latents via Resampling

David Lorell

I'd be curious if you have any ideas for how it can be applied in more advanced cases, e.g. what if we want to find the natural latents in Llama?

[-]johnswentworth2y40

I expect the typical case will look like:

1. Find some internal signal/latent using whatever random methods someone pulled out their ass
2. Check whether it satisfies the naturality conditions (over some choice of variables)

... which is not what this post is about.

The material in this post is useful mainly in cases where we want to be able to rule out any "better" natural latents, which is a somewhat atypical use case. It would be relevant, for instance, if I want to design a toy environment with known natural latents in which to train some system.

(Aside: this is something I updated about relatively recently; I had previously thought of the sort of thing this post is doing as the central use-case.)

[-]tailcalled2y40

Would the checks of the naturality conditions you have in mind primarily be empirical (e.g. sampling a bunch of data points and running some statistical independence checks), or might they just as often be mechanistic? I'm not sure how a mechanistic check would work for complex models like Llama, but for a Bayes net, for example, you already have a factorization that makes robust independence checks on the model much easier.

Asking because the idea of "in some model" (plus the desire for e.g. adversarial robustness) suggests to me that we'd want to have a more mechanistic idea of whether the naturality conditions hold, but they seem easier to check empirically.

[-]johnswentworth2y40

That's a big open question which we're still figuring out.
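To make the empirical-vs-mechanistic distinction concrete, here is a minimal sketch on a toy model. Everything in it is an assumption for illustration rather than anything from the post: the two-variable setup (a binary latent Λ with two noisy copies X1, X2), the helper names (`cond_mutual_info`, `toy_joint`, `empirical_joint`, `naturality_errors`), and the choice to score mediation as I(X1; X2 | Λ) and redundancy as I(Λ; X2 | X1) and I(Λ; X1 | X2), in bits. The "mechanistic" check reads the errors off the known factorization; the "empirical" check estimates the same quantities from samples with a plug-in estimator.

```python
import numpy as np


def cond_mutual_info(p, a, b, c):
    """I(A; B | C) in bits for a 3-axis joint pmf `p`; a, b, c are axis indices."""
    p = np.transpose(p, (a, b, c))            # reorder axes to (A, B, C)
    p_c = p.sum(axis=(0, 1), keepdims=True)   # P[C]
    p_ac = p.sum(axis=1, keepdims=True)       # P[A, C]
    p_bc = p.sum(axis=0, keepdims=True)       # P[B, C]
    mask = p > 0                              # skip zero-probability cells
    ratio = (p * p_c)[mask] / (p_ac * p_bc)[mask]
    return float(np.sum(p[mask] * np.log2(ratio)))


def toy_joint(flip):
    """Hypothetical toy model: Lambda ~ Bernoulli(0.5); X1 and X2 are
    independent noisy copies of Lambda, each flipped with probability `flip`.
    Returned array is indexed as p[lam, x1, x2]."""
    noise = np.array([[1 - flip, flip], [flip, 1 - flip]])  # noise[lam, x]
    p_lam = np.array([0.5, 0.5])
    return p_lam[:, None, None] * noise[:, :, None] * noise[:, None, :]


def empirical_joint(p, n_samples, rng):
    """Draw samples from `p` and return the empirical (histogram) joint."""
    flat = rng.choice(p.size, size=n_samples, p=p.ravel())
    counts = np.bincount(flat, minlength=p.size).reshape(p.shape)
    return counts / n_samples


def naturality_errors(p):
    """Assumed error measures (not taken from the post), all in bits."""
    LAM, X1, X2 = 0, 1, 2
    return {
        # Mediation: X1 and X2 are independent given Lambda.
        "mediation": cond_mutual_info(p, X1, X2, LAM),
        # Redundancy: Lambda is (approximately) recoverable from X1 alone...
        "redundancy_x1": cond_mutual_info(p, LAM, X2, X1),
        # ...and from X2 alone.
        "redundancy_x2": cond_mutual_info(p, LAM, X1, X2),
    }


if __name__ == "__main__":
    exact = toy_joint(flip=0.1)
    print("mechanistic:", naturality_errors(exact))

    rng = np.random.default_rng(0)
    sampled = empirical_joint(exact, n_samples=200_000, rng=rng)
    print("empirical:  ", naturality_errors(sampled))
```

In this toy setup the mechanistic mediation error is zero up to floating-point rounding, because the factorization makes X1 and X2 independent given Λ by construction, while the sample-based estimate is small but nonzero due to finite-sample noise. The plug-in estimator only works when the joint is small enough to histogram, so it says nothing about how to do either kind of check in something like Llama.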

| Approximation Errors ($ϵ$'s) | Known Latent | $X'$ |
|---|---|---|
| Mediation | 5.125482e-15 | 0.001046 |
| Strong Redundancy | 2.138992 | 2.130707 |
| Weak Redundancy | 0.030377 | 0.030057 |

Some Conceptual Foundations

What Are We Even Computing?

Why Do We Want That, Again?

Approximate “Uniqueness” and Competitive Optimality

Strong Redundancy

The Resampling Construction

Theorem 1: Strong Redundancy => Naturality

Theorem 2: Competitive Optimality

Step 1: $X_{i}$ Mediates Between $X_{j}$ and $X_{i}^{'}$

Step 2: $X^{'}$ has Weak Redundancy over $X$

Step 3: $Λ$ Mediates between $X$ and $X^{'}$

Step 4: Strong Redundancy of $X^{'}$

Can You Do Better?

Empirical Results (Spot Check)
