Model multitasking: Can a model learn two different tasks simultaneously through Grokking? — LessWrong