Learning Transformer Programs [Linkpost] — LessWrong