Transformers — LessWrong

Transformers

This page is a stub.

Posts tagged Transformers

Striking Implications for Learning Theory, Interpretability — and Safety? (37 karma)

2y

4

3

How LLMs are and are not myopic (138 karma)

3y

16

2

Modern Transformers are AGI, and Human-Level (232 karma)

2y

88

2

LLMs Can't See Pixels or Characters (100 karma)

7mo

44

2

Google's PaLM-E: An Embodied Multimodal Language Model (87 karma)

3y

7

2

Residual stream norms grow exponentially over the forward pass (78 karma)

StefanHex, TurnTrout

3y

24

2

Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind (62 karma)

3y

12

2

Concrete Steps to Get Started in Transformer Mechanistic Interpretability (57 karma)

3y

7

2

How fast can we perform a forward pass? (53 karma)

4y

9

2

AGI will be made of heterogeneous components, Transformer and Selective SSM blocks will be among them (33 karma)

2y

9

2

How Do Induction Heads Actually Work in Transformers With Finite Capacity? (27 karma)

3y

0

2

If I ask an LLM to think step by step, how big are the steps? (7 karma)

1y

1

1

Transformers Represent Belief State Geometry in their Residual Stream (432 karma)

2y

101

1

An Analogy for Understanding Transformers (92 karma)

CallumMcDougall

3y

6

1

Attention SAEs Scale to GPT-2 Small (78 karma)

Connor Kissane, robertzk, Arthur Conmy, Neel Nanda

2y

4
