LESSWRONG
Transformers
Posts tagged Transformers (most relevant first)
Relevance | Karma | Title | Author(s) | Age | Comments
2 | 86 | Google's PaLM-E: An Embodied Multimodal Language Model | SandXbox | 3mo | 7
2 | 65 | Residual stream norms grow exponentially over the forward pass [Ω] | StefanHex, TurnTrout | 1mo | 17
2 | 61 | Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind [Ω] | DragonGod | 5mo | 12
2 | 53 | How fast can we perform a forward pass? | jsteinhardt | 1y | 9
2 | 51 | Concrete Steps to Get Started in Transformer Mechanistic Interpretability [Ω] | Neel Nanda | 6mo | 7
2 | 23 | How Do Induction Heads Actually Work in Transformers With Finite Capacity? | Fabien Roger | 3mo | 0
1 | 75 | An Analogy for Understanding Transformers | TheMcDouglas | 1mo | 5
1 | 44 | Searching for Modularity in Large Language Models | NickyP, Stephen Fowler | 9mo | 3
1 | 43 | Brief Notes on Transformers [Ω] | Adam Jermyn | 9mo | 2
1 | 42 | Building a transformer from scratch - AI safety up-skilling challenge [Ω] | Marius Hobbhahn | 8mo | 1
1 | 28 | We Need To Know About Continual Learning | michael_mjd | 2mo | 14
1 | 23 | No Really, Attention is ALL You Need - Attention can do feedforward networks | Robert_AIZI | 5mo | 2
1 | 8 | Research agenda - Building a multi-modal chess-language model | p.b. | 1y | 2
1 | 8 | Addendum: More Efficient FFNs via Attention | Robert_AIZI | 4mo | 0
1 | 7 | Are Mixture-of-Experts Transformers More Interpretable Than Dense Transformers? [Q] | simeon_c | 6mo | 4