Transformers
Posts tagged Transformers
Most Relevant
5
37
Striking Implications for Learning Theory, Interpretability — and Safety?
RogerDearnaley
1y
4
3
134
How LLMs are and are not myopic
Ω
janus
1y
Ω
16
2
219
Modern Transformers are AGI, and Human-Level
Ω
abramdemski
9mo
Ω
88
2
87
Google's PaLM-E: An Embodied Multimodal Language Model
SandXbox
2y
7
2
76
Residual stream norms grow exponentially over the forward pass
Ω
StefanHex, TurnTrout
2y
Ω
24
2
62
Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind
Ω
DragonGod
2y
Ω
12
2
56
Concrete Steps to Get Started in Transformer Mechanistic Interpretability
Ω
Neel Nanda
2y
Ω
7
2
53
How fast can we perform a forward pass?
jsteinhardt
3y
9
2
33
AGI will be made of heterogeneous components, Transformer and Selective SSM blocks will be among them
Ω
Roman Leventov
1y
Ω
9
2
27
How Do Induction Heads Actually Work in Transformers With Finite Capacity?
Fabien Roger
2y
0
2
7
If I ask an LLM to think step by step, how big are the steps?
Q
ryan_b
3mo
Q
1
1
411
Transformers Represent Belief State Geometry in their Residual Stream
Ω
Adam Shai
7mo
Ω
100
1
89
An Analogy for Understanding Transformers
CallumMcDougall
2y
6
1
77
Attention SAEs Scale to GPT-2 Small
Ω
Connor Kissane, robertzk, Arthur Conmy, Neel Nanda
10mo
Ω
4
1
70
Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream
Diego Caples, rrenaud
3mo
7