Probability that other architectures will scale as well as Transformers? — LessWrong