O(1) reasoning in latent space: 1ms inference, 77% accuracy, no attention or tokens
I want to share what I've been working on for the last few weeks: O(1) inference across whole tasks through direct vector transformation. A few facts upfront to give you an idea of what this involves:

1. Implemented as part of a PoC of what I call the Promptable General Classifier...
In light of the commentary on why people downvoted this, I think it's worth adding some high-visibility clarification. I expected and was prepared for skepticism (naturally), but the lack of technical or theoretical critique surprised me. Maybe I should have stuck with my original title: "How Much Attention Do You Need, Really? Early Experiment in O(1) Reasoning In Latent Space"
I definitely should have surfaced more detail upfront so people had some idea of what they were looking at. I'll take a stab at that now:
1. Architecture: a novel architecture I call a "DSRU" (Direct Semantic Reasoning Unit), which falls under the "vec2vec" category.
That means no...
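To make the vec2vec idea concrete, here is a minimal, simplified sketch of the general shape, not my actual implementation: a frozen sentence embedder feeding a small feed-forward network that maps an input vector directly to an answer vector in a single pass, with the prediction read out by nearest label embedding. The `DSRUSketch` name, the MiniLM embedder, and all layer sizes are placeholders for illustration only.

```python
# Minimal sketch of a DSRU-style vec2vec model: one feed-forward pass from
# frozen embeddings to a predicted answer embedding. No attention, no
# token-by-token decoding. All names/sizes are illustrative placeholders.
import torch
import torch.nn as nn
from sentence_transformers import SentenceTransformer

EMB_DIM = 384  # dimension of the frozen sentence embeddings (placeholder)

class DSRUSketch(nn.Module):
    """Direct vector-to-vector transformation: fixed cost per task."""
    def __init__(self, dim: int = EMB_DIM, hidden: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim * 2, hidden),  # concat(task vector, input vector)
            nn.GELU(),
            nn.Linear(hidden, dim),      # predicted answer embedding
        )

    def forward(self, task_vec: torch.Tensor, input_vec: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([task_vec, input_vec], dim=-1))

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # frozen 384-dim embedder
model = DSRUSketch()  # would be trained to land on the label embedding space

# "Promptable" classification: embed the task prompt, the input, and the
# candidate labels, then pick the label nearest the predicted vector.
task = embedder.encode(["Classify the sentiment of the text."], convert_to_tensor=True)
text = embedder.encode(["The movie was a delight."], convert_to_tensor=True)
labels = embedder.encode(["positive", "negative"], convert_to_tensor=True)

pred = model(task, text)                        # single forward pass
scores = torch.cosine_similarity(pred, labels)  # compare to label embeddings
print(["positive", "negative"][scores.argmax().item()])
```

The point is the cost profile: inference is one fixed-size forward pass regardless of how much reasoning the task implies, hence O(1) across the task, with no attention and no token generation.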