Visualizing small Attention-only Transformers — LessWrong