LESSWRONG

Transformers

This page is a stub.
Posts tagged Transformers
[37] Striking Implications for Learning Theory, Interpretability — and Safety? (RogerDearnaley, 2y; 4 comments; relevance 3)
[138] Ω How LLMs are and are not myopic (janus, 2y; 16 comments; relevance 2)
[235] Ω Modern Transformers are AGI, and Human-Level (abramdemski, 2y; 87 comments; relevance 2)
[100] LLMs Can't See Pixels or Characters (Brendan Long, 4mo; 44 comments; relevance 2)
[87] Google's PaLM-E: An Embodied Multimodal Language Model (SandXbox, 3y; 7 comments; relevance 2)
[77] Ω Residual stream norms grow exponentially over the forward pass (StefanHex, TurnTrout, 3y; 24 comments; relevance 2)
[62] Ω Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind (DragonGod, 3y; 12 comments; relevance 2)
[57] Ω Concrete Steps to Get Started in Transformer Mechanistic Interpretability (Neel Nanda, 3y; 7 comments; relevance 2)
[53] How fast can we perform a forward pass? (jsteinhardt, 3y; 9 comments; relevance 2)
[33] Ω AGI will be made of heterogeneous components, Transformer and Selective SSM blocks will be among them (Roman Leventov, 2y; 9 comments; relevance 2)
[27] How Do Induction Heads Actually Work in Transformers With Finite Capacity? (Fabien Roger, 3y; 0 comments; relevance 2)
[7] Q If I ask an LLM to think step by step, how big are the steps? (ryan_b, 1y; 1 comment; relevance 1)
[426] Ω Transformers Represent Belief State Geometry in their Residual Stream (Adam Shai, 2y; 100 comments; relevance 1)
[92] An Analogy for Understanding Transformers (CallumMcDougall, 3y; 6 comments; relevance 1)
[78] Ω Attention SAEs Scale to GPT-2 Small (Connor Kissane, robertzk, Arthur Conmy, Neel Nanda, 2y; 4 comments)