I wonder if this is an artifact from the training data.
There are probably more edge-case bugs in published code (or even intermediate commits) than there are obvious bugs.
A recent paper probed LLMs and located both value features (representing the expected reward) and "dopamine" features (representing the reward prediction error). These features are embedded in sparse sets of neurons, and were found to be critical for reasoning performance.
Could these findings have any implications for model welfare?
If a model had mechanisms for "feeling good and bad", I imagine they would look similar to this.
The paper in question: https://arxiv.org/abs/2602.00986
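For concreteness, a minimal sketch of what this kind of probing can look like, with stand-in data and a plain ridge probe (my illustration, not the paper's actual method; the shapes and targets here are made up):

```python
# Hypothetical illustration of probing hidden activations for a "value" signal.
# Not the paper's method: the data, targets, and sparsity check are stand-ins.
import numpy as np

rng = np.random.default_rng(0)

n_tokens, d_model = 5000, 512
H = rng.normal(size=(n_tokens, d_model))    # hidden activations (stand-in data)
value_target = rng.normal(size=n_tokens)    # per-token expected-reward labels (stand-in)

# Ridge-regression probe: w maps activations to the scalar target.
lam = 1e-2
w = np.linalg.solve(H.T @ H + lam * np.eye(d_model), H.T @ value_target)

# "Sparse set of neurons" check: how much of the probe's weight mass
# sits in its top-k coordinates?
k = 16
sorted_abs = np.sort(np.abs(w))[::-1]
print(f"top-{k} weight-mass fraction: {sorted_abs[:k].sum() / sorted_abs.sum():.3f}")
```

A "dopamine" probe would look the same with a different target (something like a reward-prediction-error signal instead of the value estimate).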
Yes, I am referring to the lack of learning-to-learn data during initial training.
Your point that humans have built-in mechanisms for continual learning is similar to what I'm saying about inductive biases: if we don't have the data to train continual learning into models, we need to build it into the architecture.
However, I think the 'data' from which humans learn during development (on-policy interaction with the environment, with constant feedback and something like rewards) is much better aligned with continual learning than books and pdfs are.
I believe that the biggest bottleneck for continual learning is data.
First, I am defining continual learning (CL) as extreme long-context modelling with particularly good in-context supervised and reinforcement learning. You seem to have a similar implied definition, given that Titans and Hope are technically sequence modelling architectures more than classical continual learning architectures.
Titans might already be capable of crudely performing CL as I defined it, but we wouldn't know. The reason is that we haven't trained it on data that looks like CL. The long-context data that we currently use looks like pdfs, books, and synthetically concatenated snippets. None of that data, if you saw a model producing it, would you consider...
Anecdotally, as someone who works on non-AGI-targeting AI research, I find pop-sci coverage of the field to be horribly misrepresentative.
A paper that introduces a new algorithm that guides drones around a simulator by creating sub-tasks might be presented as "AI researchers create a new kind of digital brain - and it has its own goals". That's obviously a click-bait headline, but the article itself usually does little to clean things up.
However, I would imagine that AI is currently among the worst fields for this kind of thing due to manufactured hype, culture wars, and the age-old anthropomorphization of AI algorithms.
This was the classical intuition, but turned out to be untrue in the regime of large NNs.
The modern view is double descent (https://en.wikipedia.org/wiki/Double_descent): test error first improves with model size, then worsens as the parameter count approaches the number of training examples (the interpolation threshold), and then improves again as models grow past that threshold, all with the same amount of data.
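For a toy illustration (my own sketch, not from the linked article): random-feature regression with a minimum-norm least-squares fit typically shows test error worsening near the interpolation threshold (number of features ≈ number of training examples) and recovering as the feature count grows past it.

```python
# Toy double-descent curve with ReLU random-feature least squares (illustrative sketch).
import numpy as np

rng = np.random.default_rng(0)
d_in = 20
w_true = rng.normal(size=d_in)

def make_data(n, noise=0.5):
    X = rng.normal(size=(n, d_in))
    y = X @ w_true + noise * rng.normal(size=n)
    return X, y

n_train = 100
X_tr, y_tr = make_data(n_train)
X_te, y_te = make_data(2000)

# A shared random projection; using its first p columns gives a nested family
# of random-feature models of increasing size.
W = rng.normal(size=(d_in, 400))

for p in [10, 50, 90, 100, 110, 150, 400]:   # p = number of features ("parameters")
    phi_tr = np.maximum(X_tr @ W[:, :p], 0.0)
    phi_te = np.maximum(X_te @ W[:, :p], 0.0)
    # Minimum-norm least-squares fit; it interpolates the training set once p >= n_train.
    beta = np.linalg.pinv(phi_tr) @ y_tr
    test_mse = np.mean((phi_te @ beta - y_te) ** 2)
    print(f"features={p:4d}  test MSE={test_mse:8.3f}")
```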
But why would this error accumulation be a problem in recurrent forward passes and not one long forward pass?
I think the question is whether applying quantization to hidden states in the middle of the forward pass during both training and inference would improve performance, which your argument would seem to imply.
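To make that concrete, here is roughly what I picture by "quantizing hidden states in the middle of the forward pass" (my assumption about the setup, not anyone's actual implementation): per-tensor fake quantization with a straight-through estimator, so it can be applied identically during training and inference.

```python
# Sketch: fake-quantize intermediate hidden states during the forward pass,
# with a straight-through estimator so gradients still flow during training.
# The module, bit width, and placement are all my assumptions.
import torch
import torch.nn as nn

def fake_quantize(x: torch.Tensor, bits: int = 8) -> torch.Tensor:
    # Symmetric per-tensor quantization to 2**bits levels.
    scale = x.abs().max().clamp(min=1e-8) / (2 ** (bits - 1) - 1)
    x_q = torch.clamp(torch.round(x / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * scale
    # Straight-through estimator: quantized values forward, identity gradient backward.
    return x + (x_q - x).detach()

class QuantizedMLPBlock(nn.Module):
    def __init__(self, d_model: int = 512, bits: int = 8):
        super().__init__()
        self.fc1 = nn.Linear(d_model, 4 * d_model)
        self.fc2 = nn.Linear(4 * d_model, d_model)
        self.bits = bits

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.fc1(x))
        h = fake_quantize(h, self.bits)          # quantize the hidden state mid-forward-pass
        return fake_quantize(self.fc2(h), self.bits)

out = QuantizedMLPBlock()(torch.randn(4, 512))
print(out.shape)  # torch.Size([4, 512])
```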
In this post you seem to imply that the slow training is due to a lack of parallelization, but don't MP-SAEs also require more total flops?
At each iteration you need to recompute the encoder dot products using a matmul with the encoder matrix (a look at your code confirms this), so I would think that the total flops would scale almost linearly as you increase the number of iterations.
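For reference, the loop I have in mind looks roughly like this (a paraphrase of a matching-pursuit-style encoder, not your actual implementation; names and shapes are mine):

```python
# Sketch of a matching-pursuit-style SAE encoder loop, to illustrate why total
# flops grow roughly linearly with the number of iterations. Paraphrased, not
# copied from the repo.
import torch

def mp_encode(x, W_enc, W_dec, n_iters=8):
    """x: (batch, d_model); W_enc, W_dec: (n_latents, d_model)."""
    batch = x.shape[0]
    codes = torch.zeros(batch, W_enc.shape[0], device=x.device)
    residual = x.clone()
    for _ in range(n_iters):
        # Full residual-encoder matmul every iteration: O(batch * n_latents * d_model).
        scores = residual @ W_enc.T
        idx = scores.argmax(dim=-1)                    # greedily pick one latent per sample
        coeff = scores[torch.arange(batch), idx]
        codes[torch.arange(batch), idx] += coeff
        # Remove the selected atom's contribution from the residual.
        residual = residual - coeff.unsqueeze(-1) * W_dec[idx]
    return codes, residual

codes, residual = mp_encode(
    torch.randn(32, 512),
    torch.randn(4096, 512),
    torch.nn.functional.normalize(torch.randn(4096, 512), dim=-1),
)
print(codes.shape, residual.norm().item())
```

The `scores = residual @ W_enc.T` line is the cost I'm asking about: it runs once per iteration regardless of how sparse the final code ends up being.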
The Searchlight Institute recently released a survey of Americans' views on AI and their usage of it:
There is a lot of information, but the clearest takeaway is that the majority of those surveyed support AI regulation.
Another result that surprises (and concerns) me is this side note:
A question that was interesting, but didn’t lead to a larger conclusion, was asking what actually happens when you ask a tool like ChatGPT a question. 45% think it looks up an exact answer in a database, and 21% think it follows a script of prewritten responses.
...