Re-reasoning the Transformer and Understanding Why RL Cannot Adapt to Infinite Tasks — LessWrong