Newbie questions about information theory and transformers — LessWrong