Transformers
Posts tagged Transformers
3
110
How LLMs are and are not myopic
Ω
janus
4mo
Ω
10
2
86
Google's PaLM-E: An Embodied Multimodal Language Model
SandXbox
9mo
7
2
71
Residual stream norms grow exponentially over the forward pass
Ω
StefanHex, TurnTrout
7mo
Ω
24
2
62
Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind
Ω
DragonGod
10mo
Ω
12
2
54
Concrete Steps to Get Started in Transformer Mechanistic Interpretability
Ω
Neel Nanda
1y
Ω
7
2
53
How fast can we perform a forward pass?
jsteinhardt
1y
9
2
23
How Do Induction Heads Actually Work in Transformers With Finite Capacity?
Fabien Roger
8mo
0
1
78
An Analogy for Understanding Transformers
CallumMcDougall
6mo
5
1
44
Brief Notes on Transformers
Ω
Adam Jermyn
1y
Ω
3
1
44
Searching for Modularity in Large Language Models
NickyP, Stephen Fowler
1y
3
1
42
Building a transformer from scratch - AI safety up-skilling challenge
Ω
Marius Hobbhahn
1y
Ω
1
1
42
GPT-2's positional embedding matrix is a helix
AdamYedidia
4mo
18
1
32
New Tool: the Residual Stream Viewer
Ω
AdamYedidia
2mo
Ω
7
1
29
We Need To Know About Continual Learning
michael_mjd
7mo
14
1
26
The positional embedding matrix and previous-token heads: how do they actually work?
Ω
AdamYedidia
4mo
Ω
4