Transformers — LessWrong
This page is a stub.
Posts tagged Transformers (sorted by relevance; [Ω] marks Alignment Forum crossposts, [Q] marks questions):

- Striking Implications for Learning Theory, Interpretability — and Safety? · RogerDearnaley · 2y · 37 karma · 4 comments
- How LLMs are and are not myopic [Ω] · janus · 3y · 139 karma · 16 comments
- Modern Transformers are AGI, and Human-Level [Ω] · abramdemski · 2y · 232 karma · 88 comments
- [Linkpost] Interpreting Language Model Parameters [Ω] · Lucius Bushnaq, Dan Braun, Oliver Clive-Griffin, Bart Bussmann, Nathan Hu, mivanitskiy, Linda Linsefors, Lee Sharkey · 10d · 159 karma · 2 comments
- LLMs Can't See Pixels or Characters · Brendan Long · 10mo · 100 karma · 44 comments
- Residual stream norms grow exponentially over the forward pass [Ω] · StefanHex, TurnTrout · 3y · 79 karma · 24 comments
- Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind [Ω] · DragonGod · 3y · 62 karma · 12 comments
- Concrete Steps to Get Started in Transformer Mechanistic Interpretability [Ω] · Neel Nanda · 3y · 58 karma · 7 comments
- How fast can we perform a forward pass? · jsteinhardt · 4y · 53 karma · 9 comments
- AGI will be made of heterogeneous components, Transformer and Selective SSM blocks will be among them [Ω] · Roman Leventov · 2y · 33 karma · 9 comments
- How Do Induction Heads Actually Work in Transformers With Finite Capacity? · Fabien Roger · 3y · 28 karma · 0 comments
- Training a Transformer to Compose One Step Per Layer (and Proving It) · Brendan Long · 19d · 17 karma · 0 comments
- How did ‘large’ language models get that way? The role of Transformers and Pretraining in GPT · Oliver Sourbut · 12d · 16 karma · 0 comments
- If I ask an LLM to think step by step, how big are the steps? [Q] · ryan_b · 2y · 7 karma · 1 comment
- Transformers Represent Belief State Geometry in their Residual Stream [Ω] · Adam Shai · 2y · 437 karma · 103 comments