You say “reversible computation can implement PSPACE algorithms while conventional computers can only implement algorithms in the complexity class P.” This is not true in any interesting sense. What complexity class a problem is in is a statement about how fast its time and space requirements grow as the size of the problem increases. Whether a problem is feasible is a statement about how much time and space we’re willing to spend solving it. A sufficiently large instance of a problem in P is infeasible, while a sufficiently small instance of a problem in PSPACE is feasible. For example, playing chess on an n by n board (with a suitable move-limit rule) is in PSPACE. But n = 8 is small enough that computer chess is perfectly feasible.
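To make that concrete, here is a back-of-envelope sketch; the machine speed, exponent, and instance sizes are numbers I picked purely for illustration, not anything from the quoted post:

```python
# Rough feasibility arithmetic: class membership says nothing about whether a
# particular instance fits in the time we are willing to spend.
ops_per_second = 1e9                  # assume a machine doing ~1e9 operations/sec

# A polynomial-time (P) algorithm on a huge instance:
n_large = 10**7
cubic_ops = n_large ** 3              # e.g. a straightforward O(n^3) algorithm
print(f"O(n^3) at n = 1e7: about {cubic_ops / ops_per_second / 3.15e7:,.0f} years")

# An exponential-time search (typical of PSPACE-hard games) on a tiny instance:
n_small = 40
search_ops = 2 ** n_small
print(f"2^n at n = 40: about {search_ops / ops_per_second / 60:.0f} minutes")
```

The polynomial case comes out to tens of thousands of years, the exponential case to under twenty minutes; asymptotics and feasibility can point in opposite directions.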
There’s no point in my remaining secretive about my guess at the obstacle between us and superhuman AI. What I was referring to is what Jeffrey Ladish called the “Agency Overhang” in his post of the same name. Now that there’s a long and well-written post on the topic, staying coy would accomplish nothing ☹️.
I did a little research and this seems to be true, at least if we restrict it to "If we had invented chlorofluorocarbons in 1800, and used them as vigorously as we did in real life, we would have severely depleted animal and plant life outside the tropical zone." Our hypothetical air-conditioned Victorians would observe a steady increase in harmful ultraviolet light, spreading from the poles. But the cause would remain a mystery. In our timeline, the Antarctic ozone depletion was first identified from ground-based measurements and then confirmed by satellite, so its discovery would not have had to wait for spaceflight. But the mechanism by which chlorofluorocarbons deplete ozone is quite beyond nineteenth-century chemistry, and the ozone layer itself would not even be discovered until 1913. So they would have kept on using them for many decades.

Our fifty years of use depleted the ozone layer over the poles (where it is thickest) by about twofold, but over the equator by only ten percent. In the other timeline, it seems reasonable to extend this to 150 years of use, and a tenfold increase in CFC concentration. This would lead to roughly a ninety percent depletion over the poles and in the temperate zones, and a factor of two over the tropics. This would have catastrophic effects on all life on land or in the shallow ocean. Eventually the economy would collapse and CFC production would decrease, but since CFCs persist in the atmosphere for many decades, things would not get better for a long time.
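For what it's worth, these numbers hang together under a simple saturating toy model. The functional form below is my own illustrative assumption, chosen only because it reproduces the figures above; nothing in the comment commits to it:

```latex
% Toy model (my assumption): surviving ozone fraction f(C) = 1/(1 + kC),
% where C is the CFC burden relative to our timeline's fifty years of use
% and k is a latitude-dependent sensitivity.

% Poles: fit k = 1 to the stated twofold depletion at C = 1.
f_{\text{poles}}(1)  = \frac{1}{1+1}  = 0.5, \qquad
f_{\text{poles}}(10) = \frac{1}{1+10} \approx 0.09 \quad \text{(about ninety percent gone)}

% Tropics: fit k = 0.11 to the stated ten percent depletion at C = 1.
f_{\text{tropics}}(1)  = \frac{1}{1+0.11} \approx 0.90, \qquad
f_{\text{tropics}}(10) = \frac{1}{1+1.1}  \approx 0.48 \quad \text{(roughly a factor of two)}
```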
Transformers take O(n^2) computation for a context window of size n, because every token in the window attends to every other token in every layer. This gives the benefits of a small memory, but it doesn’t scale. A transformer has no way of remembering things from before the context window, so it’s like a human with a busted hippocampus (as in Korsakoff’s syndrome) who can’t make new memories.
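A minimal single-head attention sketch in NumPy shows where the quadratic cost comes from. This is a generic illustration rather than any particular model's code, and the sizes are arbitrary:

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of n token vectors."""
    Q, K, V = x @ Wq, x @ Wk, x @ Wv              # each (n, d)
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # (n, n): every token vs. every token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                            # (n, d)

n, d = 1024, 64                                   # context length and head dimension
rng = np.random.default_rng(0)
x = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)
# The (n, n) score matrix is what makes time and memory grow as O(n^2),
# and it is rebuilt in every attention layer.
```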
Models with long-term memory are very hard to train. Instead of being able to compute a weight update after seeing a single input, you have to run a long loop of “put thing in memory, take thing out, compute with it, etc.” before you can compute a weight update. It’s not a priori impossible, but nobody’s managed to get it to work. Evolution has figured out how to do it because it’s willing to waste an entire lifetime to get a single noisy update.
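Here is a toy sketch of the training-structure problem. Everything in it, the TinyMemoryModel class, the linear read, and the running-average write, is invented for illustration; real proposals are far more elaborate, and the actual weight update is only indicated in comments:

```python
import numpy as np

rng = np.random.default_rng(0)

class TinyMemoryModel:
    """A toy model with a memory slot, just to show the shape of the training loop."""
    def __init__(self, dim):
        self.w_read = rng.normal(size=(dim, dim)) * 0.1   # parameters
        self.w_out = rng.normal(size=(2 * dim,)) * 0.1

    def read(self, memory, x):
        return memory @ self.w_read            # differentiable read (here: linear)

    def write(self, memory, x):
        return 0.9 * memory + 0.1 * x          # differentiable write (running average)

    def predict(self, x, recalled):
        return np.concatenate([x, recalled]) @ self.w_out

def episode_loss(model, episode):
    """The loss is only defined after the whole loop of reads and writes."""
    memory = np.zeros_like(episode[0][0])
    total = 0.0
    for x, y in episode:
        recalled = model.read(memory, x)
        total += (model.predict(x, recalled) - y) ** 2
        memory = model.write(memory, x)
    return total / len(episode)

# One weight update requires running the entire episode first; a real gradient
# would then have to be backpropagated through every read and write along the way.
dim = 8
model = TinyMemoryModel(dim)
episode = [(rng.normal(size=dim), rng.normal()) for _ in range(1000)]
print("loss after a 1000-step episode:", episode_loss(model, episode))
```

Contrast this with an ordinary supervised step, where one forward pass on one input already yields a loss and a gradient.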
People have been working on this for years. It’s remarkable (in retrospect, to me) that we’ve gotten as far as we have without long-term memory.
When I had a stroke, and was confronted with wildly out-of-distribution visual inputs, one of the first things the hospital staff did was to put me in a dark, predictable room. It was a huge relief, and apparently it’s standard in these kinds of cases.
I’m better now.
Googling for "scurvy low mood", I find plenty of sources indicating that scurvy is accompanied by "mood swings — often irritability and depression". IIRC, this has been remarked upon for at least two hundred years.
You write "This residual stream fraction data seems like evidence of something. We just don't know how to put together the clues yet." I am happy to say that there is an explanation that is simple, at least to those of us experienced in high-dimensional geometry. Weirdly, in spaces of high dimension, almost all vectors are almost at right angles to one another. Your activation space has 1600 dimensions. Two randomly selected vectors in this space have an angle of between 82 and 98 degrees at least 99% of the time. It's perfectly feasible for this space to represent zillions of concepts almost at right angles to each other. This permits mixtures of those concepts to be represented as linear combinations of the vectors, without the base concepts becoming too confused.
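This is easy to check numerically. Here is a quick NumPy sketch of my own; the 1600 is the activation-space dimension mentioned above:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, trials = 1600, 20_000

# Compare many random directions in a 1600-dimensional space against one fixed
# random direction; the angle distribution is the same as for independent pairs.
a = rng.normal(size=(trials, dim))
v = rng.normal(size=dim)
cosines = a @ v / (np.linalg.norm(a, axis=1) * np.linalg.norm(v))
angles = np.degrees(np.arccos(cosines))

print("middle 99% of angles:", np.percentile(angles, [0.5, 99.5]))
# Prints roughly [86, 94], comfortably inside the 82 to 98 degree range claimed
# above: nearly everything is nearly orthogonal, so a 1600-dimensional space can
# hold a huge number of concepts as almost-perpendicular directions.
```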
Now, consider a random vector, w (for 'wedding'). Set 800 of the 1600 coordinates of w to 0, producing w'. Since w' keeps about half of w's squared length, the cosine of the angle between them is |w'|/|w| ≈ 1/√2, so the angle is about 45 degrees. That is much closer than any randomly chosen non-wedding concept, which will sit near 90 degrees. This is why a substantial truncation of the wedding vector is still closer to wedding than it is to anything else.
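And a corresponding numerical check of the truncation claim (again a sketch of my own; the random Gaussian vector is just a stand-in for the 'wedding' direction):

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 1600

def angle_deg(u, v):
    """Angle between two vectors, in degrees."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(cos))

w = rng.normal(size=dim)                  # stand-in for the 'wedding' direction
w_trunc = w.copy()
w_trunc[rng.choice(dim, size=800, replace=False)] = 0.0   # zero out half the coordinates

other = rng.normal(size=dim)              # a random unrelated concept

print("angle(w, truncated w): ", angle_deg(w, w_trunc))   # ~45 degrees
print("angle(w, random other):", angle_deg(w, other))     # ~90 degrees
```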
Epistemic status: Medium strong. High-dimensional geometry is one of the things I do for my career. But I did all the calculations in my head, so there's a 20% chance of my being quantitatively wrong. You can check my claims with a little algebra.