Idea-Gated Transformers: Transformers that use both System 1 and System 2 thinking.
It feels unnatural for LLMs/transformers to be intelligent when they can only generate one token at a time. The Idea-Gated Transformer is about letting the transformer think in terms of ideas rather than words. While it still generates one token at a time, a separate auxiliary head called the thinking...
Dec 4, 2025

