Transformers

Posts tagged Transformers
2
86 Google's PaLM-E: An Embodied Multimodal Language Model
SandXbox
3mo
7
2
65 Residual stream norms grow exponentially over the forward pass
StefanHex, TurnTrout
1mo
Ω
17
2
61 Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind
DragonGod
5mo
Ω
12
2
53 How fast can we perform a forward pass?
jsteinhardt
1y
9
2
51 Concrete Steps to Get Started in Transformer Mechanistic Interpretability
Neel Nanda
6mo
Ω
7
2
23 How Do Induction Heads Actually Work in Transformers With Finite Capacity?
Fabien Roger
3mo
0
1
75 An Analogy for Understanding Transformers
TheMcDouglas
1mo
5
1
44 Searching for Modularity in Large Language Models
NickyP, Stephen Fowler
9mo
3
1
43 Brief Notes on Transformers
Adam Jermyn
9mo
Ω
2
1
42 Building a transformer from scratch - AI safety up-skilling challenge
Marius Hobbhahn
8mo
Ω
1
1
28 We Need To Know About Continual Learning
michael_mjd
2mo
14
1
23 No Really, Attention is ALL You Need - Attention can do feedforward networks
Robert_AIZI
5mo
2
1
8 Research agenda - Building a multi-modal chess-language model
p.b.
1y
2
1
8 Addendum: More Efficient FFNs via Attention
Robert_AIZI
4mo
0
1
7 Are Mixture-of-Experts Transformers More Interpretable Than Dense Transformers?
simeon_c
6mo
Q
4