These tokens are either very common or appear disproportionately in reasoning tasks, in particular those involving code. This might mean that coding reinforcement learning was the last step in the training process, and that all other tokens were slightly weight-decayed. It could also mean that, in general, gradient descent treats reasoning tokens as so important that their updates are extra large.
The above is quite compelling. I am currently running ablations on reasoning: in particular, I want to prevent the model from using these reasoning words and see how its reasoning degrades. I will definitely cite your work when I publish my results.
Do you have any intuition on what "ocode" means?
Furthermore, it is unclear to me which GPT OSS model you take those English L2-norm embeddings from. And lastly, can you please elaborate on why having the tokenizer means we can use the GPT OSS embeddings to study the token list without having to look at each token's text content?
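For concreteness, here is a minimal sketch of what I understand by per-token L2-norm embeddings. The random matrix is just a stand-in for the real GPT OSS embedding weights, and the shapes are placeholders, not the actual checkpoint's dimensions:

```python
import numpy as np

# Stand-in for a real embedding matrix of shape (vocab_size, hidden_dim).
# With the actual model you would load the input-embedding weights instead.
rng = np.random.default_rng(0)
vocab_size, hidden_dim = 1000, 64
embeddings = rng.normal(size=(vocab_size, hidden_dim))

# L2 norm of each token's embedding row.
norms = np.linalg.norm(embeddings, axis=1)

# Token ids sorted by norm, largest first. With the real tokenizer you
# would decode these ids to see which token strings have outsized norms.
top_ids = np.argsort(norms)[::-1][:10]
print(top_ids, norms[top_ids])
```

This is also why I'd expect the tokenizer to matter: the norms alone only give you token ids, and decoding is what maps them back to strings.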
I'm on day two and I only have 33 bugs, so I'm not sure I can sustain the entire challenge with that many; I might have to do step one again. Still, it felt really nice to go through the Google Sheet and mark the rows I completed today green.
I would like to do a multi-week trial with the microhabits mentioned in this article and then report back here on the effects I perceive.
Highly underrated post!
You scale dimension 447 because you hypothesize that it is correlated with the BOS token, given that it has the largest activation?
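Just to check my understanding of the intervention, here is a toy sketch. The dimension index, scale factor, and shapes are placeholders for illustration, not values from your setup:

```python
import numpy as np

rng = np.random.default_rng(1)
emb = rng.normal(size=(100, 512))

# Artificially give dimension 447 an outsized activation, as hypothesized
# for a BOS-correlated dimension.
emb[:, 447] *= 20.0

# Rescale that one dimension back down before taking norms,
# so it no longer dominates every token's norm.
scaled = emb.copy()
scaled[:, 447] /= 20.0

norms_raw = np.linalg.norm(emb, axis=1)
norms_scaled = np.linalg.norm(scaled, axis=1)
print(norms_raw.mean(), norms_scaled.mean())
```

Is that roughly the operation, i.e. you downscale that single coordinate so the remaining norm differences reflect the other dimensions?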
More or less. I moved my habit tracking to another spreadsheet that is just checkboxes and is faster to fill in, so that I wouldn't have to do the extra reflection (I couldn't guarantee putting in that effort every day). So I have continued, albeit on a different sheet.