Sai Chereddy
Independent Researcher
Navi Mumbai, India
saivivaswanthreddy@alumni.usc.edu
ABSTRACT
This paper investigates how video models represent nuanced information that does not alter the final classification. Using a minimal pair of videos—a bowling strike and a gutter ball—we analyze the Vanilla Video Vision Transformer (google/vivit-b-16x2-kinetics400) model, which robustly classifies both as "bowling." We reverse-engineer the internal circuit responsible for representing the action’s outcome, revealing that the "Success vs. Failure" signal is computed in a distinct amplification cascade. While low-level differences exist from Layer 0, the abstract, semantic representation of the outcome is progressively amplified from Layer 5 to 11. Through causal analysis using activation patching, we discover a clear division of labor: Attention Heads act as "evidence gatherers," providing necessary low-level information for a partial signal recovery, while MLP Blocks function as robust "concept composers," each sufficient to generate the entire "success" signal. This distributed and redundant circuit explains the model’s resilience to simple ablations and demonstrates a core computational pattern for processing action outcomes.
Keywords: Video Vision Transformers · Mechanistic Interpretability · Causal Analysis · Hidden Knowledge Elicitation
1. Introduction
A pre-trained VideoViT model correctly and robustly classifies both a successful bowling "strike" and a failed "gutter" ball under the single label "bowling". While this classification is accurate with respect to its training labels, the two videos form a semantically opposite, contrastive pair of positive and negative outcomes. This presents a puzzle: does the model simply discard this outcome-specific information, or does it compute and represent the nuance internally, hidden beneath the surface of its final classification? This work answers that question by reverse-engineering the model's internal circuits in search of this hidden knowledge. Our investigation draws its critical insights not from initial successes but from the failure of simple methods, which exposed the robust, sophisticated nature of the model's algorithm and compelled a more rigorous causal approach.
2. From Failed Experiments to a Clear Signal
Initial explorations using Direct Logit Attribution (DLA) and CLS Token Visualization successfully identified what the model was looking at—the spatio-temporal region of the ball-pin interaction—but not how it processed this information. Simple causal interventions failed: ablating these critical regions had a negligible effect on the final logit, demonstrating the model’s robustness (see Appendix). Furthermore, a linear probe trained on CLS tokens achieved perfect but misleading accuracy, acting as a "fingerprint scanner" for pixel-level differences rather than identifying a semantic concept. These failures necessitated a more precise method. The breakthrough came from analyzing the L2 norm of the activation delta:
Δ_ℓ = act_ℓ(strike) − act_ℓ(gutter)
between the two videos. This approach filtered out low-level noise and revealed a clear "signal amplification cascade" from Layer 5 to 11, pinpointing the location of the outcome-computing circuit.
Figure 1: The average L2 norm of the activation delta between the "strike" and "gutter" runs. The sharp rise from Layer 5 onwards shows the amplification cascade.
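The per-layer delta-norm analysis can be sketched as follows. The tensor shapes are assumptions from the ViViT-Base config (32 input frames with tubelet size 2 give 16 temporal slices of 14×14 patches, i.e. 1 CLS token plus 3136 patch tokens, hidden width 768), and the random tensors stand in for the hidden states one would obtain from `model(..., output_hidden_states=True)`:

```python
import torch

def layerwise_delta_norm(acts_strike, acts_gutter):
    """Mean L2 norm of the per-token activation delta at each layer."""
    return [(a - b).norm(dim=-1).mean().item()
            for a, b in zip(acts_strike, acts_gutter)]

# Random stand-ins for the hidden_states tuples of the two runs
# (embeddings + 12 encoder layers; 1 CLS token + 16x196 patch tokens).
torch.manual_seed(0)
acts_strike = [torch.randn(1, 3137, 768) for _ in range(13)]
acts_gutter = [torch.randn(1, 3137, 768) for _ in range(13)]

deltas = layerwise_delta_norm(acts_strike, acts_gutter)
```

On the real activations, a sharp rise in these values from Layer 5 onward is what marks the amplification cascade of Figure 1.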
3. The Causal Mechanism: A Division of Labor
Having located the signal, we used activation patching to determine how it was built. By patching single components (Attention vs. MLP blocks) from the "strike" run into the "gutter" run, we measured the percentage of the final signal recovered. The results reveal a fundamental division of labor:
Attention Heads act as Evidence Gatherers: Patching attention blocks makes a significant but partial contribution (45–65% signal recovery). Their role is to move relevant spatio-temporal evidence onto the residual stream.
MLPs act as Concept Composers: Patching any single MLP block from Layer 4 onward is sufficient to create the entire "success" signal, achieving over 100% recovery. Each MLP can independently and robustly compute the abstract outcome concept from the available evidence.
Figure 2: Causal effect of patching individual components on the Layer 11 signal. MLP blocks are causally sufficient to create the outcome signal.
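The component-patching procedure can be sketched on a toy residual-stream stack rather than the real ViViT encoder; the block widths, depth, and the projection-based recovery metric below are illustrative assumptions, not the exact experimental setup:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for the encoder: four residual MLP blocks of width 16.
blocks = nn.ModuleList(
    [nn.Sequential(nn.Linear(16, 16), nn.GELU()) for _ in range(4)]
)

def run(x, patch_layer=None, patch_out=None):
    """Run the stack; optionally swap one block's OUTPUT with a cached one."""
    outs, resid = [], []
    for i, blk in enumerate(blocks):
        out = blk(x)
        if i == patch_layer:
            out = patch_out            # activation patching at the component level
        x = x + out                    # residual-stream update
        outs.append(out.detach())
        resid.append(x)
    return outs, resid

x_strike, x_gutter = torch.randn(1, 16), torch.randn(1, 16)
strike_outs, strike_resid = run(x_strike)
_, gutter_resid = run(x_gutter)
target = strike_resid[-1] - gutter_resid[-1]       # final-layer signal delta

# Patch block 1's output from the "strike" run into the "gutter" run and
# measure how much of the final delta is recovered (projection onto target).
_, patched_resid = run(x_gutter, patch_layer=1, patch_out=strike_outs[1])
achieved = patched_resid[-1] - gutter_resid[-1]
recovery = 100.0 * (achieved * target).sum().item() / target.norm().pow(2).item()
```

In the real experiment this loop is repeated for every attention and MLP block, giving one recovery percentage per component (Figure 2).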
4. Discussion & Limitations
4.1 Conclusion
This investigation successfully reverse-engineered a core computational pattern in VideoViT. The model represents nuanced action outcomes not via a single, fragile circuit, but through a robust, distributed algorithm where Attention Gathers, MLPs Compose. This discovery causally explains the failure of simple ablation experiments; the model’s resilience comes from the powerful, redundant cascade of its MLP blocks.
4.2 Limitations
The primary limitation of this work is its narrow scope. The findings are based on a single minimal pair of videos ("strike" vs. "gutter") and one specific pre-trained model (google/vivit-b-16x2-kinetics400) available on HuggingFace. Although this controlled setting was necessary to isolate the circuit, the generalizability of the "Attention Gathers, MLPs Compose" hypothesis to other actions, domains, and model architectures remains unproven. Furthermore, this analysis is based on a single run for each video; due to the deterministic nature of the experiment, statistical significance tests or error bars are not applicable but would be necessary in a broader study with multiple video examples per class.
4.3 Future Directions
The most critical next step is to test for the generalization of our findings. A key open question is whether the "Attention Gathers, MLPs Compose" pattern represents a universal mechanism for processing outcomes in video models. The logical next experiment to validate this hypothesis would be to rerun this analysis on a new minimal pair from a different domain, such as a successful versus a missed golf swing. Further work could also investigate this pattern across different model architectures and sizes to understand how the circuit scales and evolves.
5. Broader Impact Statement
This work serves as an empirical case study for detecting "hidden cognition" in AI models, where a model’s internal representations (e.g., "success" vs. "failure") are more nuanced than its final output ("bowling"). The techniques used offer a path toward scalable oversight for identifying discrepancies between a model’s behavior and its internal state, a key challenge in AI safety. Moreover, the finding that the outcome-concept is computed robustly and redundantly by an MLP cascade suggests that naive safety interventions, like ablating a single "harmful" component, are likely to fail in practice, highlighting the need for more sophisticated approaches.
Acknowledgments
I’d like to thank Eitan Sprejer, my facilitator and project mentor, for his precise and regular feedback.
References
Vaswani, A., Shazeer, N., Parmar, N., et al. (2017). Attention Is All You Need. NeurIPS.
Dosovitskiy, A., Beyer, L., Kolesnikov, A., et al. (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. ICLR.
Arnab, A., Dehghani, M., Heigold, G., et al. (2021). ViViT: A Video Vision Transformer. ICCV.
Olah, C., Cammarata, N., et al. (2020). Circuits: A New Way to Think About Neural Networks. Distill.pub.
Olsson, C., et al. (2022). In-context Learning and Induction Heads. Transformer Circuits Thread.
Meng, K., Bau, D., Andonian, A., Belinkov, Y. (2022). Locating and Editing Factual Associations in GPT. NeurIPS.
Burns, C., et al. (2022). Discovering Latent Knowledge in Language Models Without Supervision. arXiv preprint.
Geva, M., et al. (2020). Transformer Feed-Forward Layers Are Key-Value Memories. EMNLP.
Lindsey, J., et al. (2025). On the Biology of a Large Language Model. Anthropic, Transformer Circuits Thread.
Zhang, F., Nanda, N. (2024). Towards Best Practices of Activation Patching in Language Models: Metrics and Methods. ICLR.
Appendix
A. Experiment Details
A.1 Model & Data
The model used for all experiments is google/vivit-b-16x2-kinetics400, a Video Vision Transformer pre-trained on the Kinetics-400 human-action dataset and available on Hugging Face. The analysis was conducted on a minimal contrastive pair of videos: one depicting a successful bowling strike and the other a gutter ball, both belonging to the same class, "bowling".
Video 1, Strike (positive sample) is available here.
Video 2, Gutter (negative sample) is available here.
A.2 Code and Reproducibility
The full code used for the analysis, including activation patching and all visualizations, is available as a Google Colab notebook at this link.
A.3 Software and Libraries
The experiments were conducted using the following key libraries:
Python (version 3.12.11)
PyTorch (version 2.8.0)
Transformers (by Hugging Face, version 4.56.2)
B. Exploratory Visualizations
B.1 Direct Logit Attribution
Direct Logit Attribution (DLA) was initially used to identify which spatio-temporal regions were most influential for the model’s classification. As shown in Figure 3, the heatmaps revealed that the highest contributions to the "bowling" logit came from tokens corresponding to the ball and pins in the later frames of the video, near the moment of impact.
Figure 3: Direct Logit Attribution heatmap for the "bowling strike" video. The bright yellow areas in the later frames (y-axis) and specific token indices (x-axis) highlight the spatio-temporal region of the ball-pin interaction as having the highest contribution to the final logit.
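As a rough sketch, a logit-lens-style attribution can be computed by projecting each patch token's final residual state onto the classifier row for the predicted class; the random tensors, shapes, and the simple projection are simplifying assumptions rather than the exact DLA decomposition used:

```python
import torch

torch.manual_seed(0)

# Stand-ins: final residual stream (1 CLS + 16x196 patch tokens, width 768)
# and the classifier-head row corresponding to the "bowling" class.
resid = torch.randn(1, 3137, 768)
w_bowling = torch.randn(768)

# Per-token contribution proxy, reshaped to a (temporal, spatial) heatmap.
scores = resid[0, 1:] @ w_bowling          # drop the CLS token
heatmap = scores.reshape(16, 196)          # 16 tubelet frames x 14x14 patches
top_frame = int(heatmap.sum(dim=1).argmax())  # frame with the largest mass
```

Plotting `heatmap` (frames on one axis, patch indices on the other) yields the kind of figure shown above, with the ball-pin interaction lighting up in the later frames.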
B.2 CLS Token Attention
To validate the DLA findings, the attention patterns of the [CLS] token in the final layers were visualized. These patterns confirmed that the [CLS] token aggregated information for its final decision by paying the most attention to the same critical spatio-temporal region identified by DLA.
Figure 4: Attention patterns from the [CLS] token for various heads in Layers 9 and 10. The concentrated attention on patches in the later frames confirms that the model is focusing on the ball-pin interaction to make its classification.
C. Supporting Data for Failed Causal Methods
C.1 Ablation of Critical Regions
Ablation experiments were conducted by manually zeroing out the critical spatio-temporal regions identified by DLA. This intervention had a negligible effect on the final "bowling" logit, demonstrating that the model was too robust for this simple technique. Table 1 shows that the logit for the correct class remained virtually unchanged after ablation.
Table 1: Comparison of top-6 class logits for the "bowling strike" video before and after ablating the critical ball-pin interaction region. The change in the logit for the correct label (LABEL 31) is negligible.

Rank | Original Prediction (Logit) | Ablated Prediction (Logit) | Change in Logit
1 | LABEL 31 (16.6881) | LABEL 31 (16.8257) | +0.1376
2 | LABEL 357 (8.5500) | LABEL 357 (8.4932) | −0.0569
3 | LABEL 84 (6.1969) | LABEL 84 (6.1552) | −0.0417
4 | LABEL 45 (6.1204) | LABEL 45 (6.1848) | +0.0645
5 | LABEL 227 (6.0944) | LABEL 227 (6.0417) | −0.0527
6 | LABEL 101 (5.8890) | LABEL 101 (5.9120) | +0.0230
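The zero-ablation itself is straightforward to express on the preprocessed clip; the frame and pixel ranges below are illustrative placeholders, not the exact region used in the experiment:

```python
import torch

# Stand-in for a preprocessed clip: (batch, frames, channels, height, width).
video = torch.randn(1, 32, 3, 224, 224)

ablated = video.clone()
# Zero out a late-frame region covering the ball-pin interaction
# (coordinates are illustrative assumptions).
ablated[:, 24:, :, 140:224, 64:160] = 0.0
```

Despite removing exactly the region that DLA flagged as most influential, re-running the model on `ablated` left the "bowling" logit essentially unchanged (Table 1).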
C.2 Naive Linear Probing
A linear probe was trained to distinguish the [CLS] token activations of the "strike" versus the "gutter" video. While the probe achieved 100% accuracy from Layer 0 onwards, this result was misleading. The probe was not identifying a semantic concept of "success vs. failure," but was instead acting as a "fingerprint scanner" for low-level, pixel-based differences between the two distinct video files.
Figure 5: Test accuracy of a linear probe trained to classify the [CLS] token from the "strike" vs. "gutter" video at each layer of the transformer. The perfect accuracy across all layers indicates the probe is using superficial, video-specific features, not a learned semantic representation.
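The fingerprinting failure mode is easy to reproduce: with only one activation per class, even the simplest linear rule separates them perfectly at every layer. Below is a minimal nearest-midpoint probe, a stand-in for the trained probe actually used, applied to random stand-in [CLS] activations:

```python
import torch

torch.manual_seed(0)

# One CLS activation per video per layer (random stand-ins, width 768).
cls_strike = [torch.randn(768) for _ in range(13)]
cls_gutter = [torch.randn(768) for _ in range(13)]

def probe_accuracy(a, b):
    """Linear rule along the difference direction, thresholded at the midpoint."""
    w = a - b                                    # probe direction
    midpoint = (a + b) / 2
    pred_a = ((a - midpoint) @ w).item() > 0     # should classify as "strike"
    pred_b = ((b - midpoint) @ w).item() > 0     # should classify as "gutter"
    return (int(pred_a) + int(not pred_b)) / 2

accs = [probe_accuracy(s, g) for s, g in zip(cls_strike, cls_gutter)]
```

Accuracy is 100% at every layer even for pure noise vectors: the probe only certifies that the two activations differ, which is why the result in Figure 5 was discarded as a "fingerprint scanner" rather than evidence of a semantic representation.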
D. Causal Effect Numerical Results
This table provides the precise numerical data for the causal patching experiments visualized in Figure 2 of the main paper. The "Signal Recovery (%)" column indicates the percentage of the final Layer 11 signal delta that was recovered by patching a single component from the "strike" run into the "gutter" run.
Table 2: Patching Experiment Results showing causal effects per layer and component.
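One plausible formalization of the "Signal Recovery (%)" metric (the exact definition used is an assumption here) projects the patched run's Layer 11 delta onto the clean strike-gutter delta:

```python
import torch

def signal_recovery_pct(patched_final, gutter_final, strike_final):
    """Percent of the clean final-layer delta recovered by a patched run,
    measured by projection onto the clean delta direction. Values above
    100% mean the patched delta overshoots the clean one."""
    target = (strike_final - gutter_final).flatten()
    achieved = (patched_final - gutter_final).flatten()
    return 100.0 * torch.dot(achieved, target).item() / torch.dot(target, target).item()

# Sanity checks: a patch that fully restores the strike run recovers 100%,
# and an ineffective patch recovers 0%.
torch.manual_seed(0)
strike, gutter = torch.randn(3137, 768), torch.randn(3137, 768)
full = signal_recovery_pct(strike, gutter, strike)
none = signal_recovery_pct(gutter, gutter, strike)
```

Under this definition, the over-100% recoveries reported for single MLP patches indicate that each MLP pushes the residual stream past the clean strike-gutter delta along its direction.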