Simply reverse engineering gpt2-small (Layer 0, Part 1: Attention)
Motivation
GPT2-small is able to generate, and thus apparently understand, language to a surprisingly good degree. So we’d like to find out how it works. There have been attempts to explain some parts of this network, but there remains confusion about how those parts interact with the rest of the...
For the last 2 DnDsci's, I've only gotten the tag-notification for the evaluation, but not the first post. I'd like to participate again, but I keep missing it. Anyone else having that problem?