This seems to me the opposite of a low bandwidth recursion. Having access the the entire context window of the previous iteration minus the first token, it should be pretty obvious that most of the relevant information encoded by the values of the nodes in that iteration could in principal be reconstructed, excepting the unlikely event that first token turns out to be extremely important. And it would be pretty weird if much if that information wasn't actually reconstructed in some sense in the current iteration. An inefficient way to get information from one iteration to the next, if that is your only goal, but plausibly very high bandwidth.
I'm surprised you were only able to predict whether you'd taken caffeine 80% of the time. 200 mg is not a heroic dose, but on little to no tolerance it should be quite noticeable.
And of course if we believe efficiency is the way to go for the next few years, that should scare the shit out of us, it means that even putting all gpu manufacturers out of commission might not be enough to save us should it become obvious that a slowdown is needed
Maybe temporarily efficiency improvements will rule, but surely once the low and medium hanging fruit is exhausted parameter count will once again be ramped up, would bet just about anything on that
However good an idea it is, it's not as good an idea as Aaronson just taking a year off and doing it on his own time, collaborating and sharing whatever he deems appropriate with the greater community. Might be financially inconvenient but is definitely something he could swing.
How do we deal with institutions that don't want to be governed, say idk the Chevron corporation, North Korea, or the US military?
Well I don't think it should be possible to convince a reasonable person at this point in time. But maybe some evidence that we might not be doomed. Yudkowsky and other's ideas rest on some fairly plausible but complex assumptions. You'll notice in the recent debate threads where Eliezer is arguing for inevitability of AI destroying us he will often resort to something like, "well that just doesn't fit with what I know about intelligences". At a certain point in these types of discussions you have to do some hand waving. Even if it's really good hand waving, if there's enough if it there's a chance at least one piece is wrong enough to corrupt your conclusions. On the other hand, as he points out, we're not even really trying, and it's hard to see us doing so in time. So the hope that's left is mostly that the problem just won't be an issue or won't be that hard for some unknown reason. I actually think this is sort of likely, given how difficult it is to analyze, it's hard to have full trust in any conclusion.
While we're sitting around waiting for revolutionary imaging technology or whatever, why not try and make progress on the question of how much and what type of information can we obscure about a neural network and still approximately infer meaningful details of that network from behavior. For practice, start with ANNs and keep it simple. Take a smallish network which does something useful, record the outputs as it's doing its thing, then add just enough random noise to the parameters that output deviates noticeably from the original. Now train the perturbed version to match recorded data. What do we get here, did we recover the weights and biases almost exactly? Assuming yes, how far can this go before we might as well have trained the thing from scratch? Assuming success, does it work equally on different types and sizes of networks, if not what kind of scaling laws does this process obey? Assuming some level of success, move on to a harder problem, a sparse network, this time we throw away everything but connectivity information and try to repeat the above. How about something biologically realistic but we try to simulate the spiking neurons with groups of standard artificial ones.. you get the drift.
This is outright saying ETH is likely to outperform BTC, so this is Scott’s biggest f*** you to the efficient market hypothesis yet. I’m going to say he’s wrong and sell to 55%, since it’s currently 0.046, and if it was real I’d consider hedging with ETH.
I'm curious what's behind this, is Zvi some sort of bitcoin maximalist? I tend to think that bitcoin having a high value is hard to explain, it made sense when it was the only secure cryptocurrency out there but now it's to a large degree a consequence of social forces rather than economic ones. Ether I can see value in, since it does a bunch of things and there's at least an argument that it's best in class for all those.
We still have a hard problem since misuse of AI, for example using it to secure permanent control over the world, would be extremely tempting. Under this assumption outcomes where everyone doesn't die but which are as bad or worse are much more likely than they would be under its negation. I think the answer to avoiding non awful futures looks similar, we agree globally to slow down before the tech could plausibly pose a big risk, probably that means right around yesterday. Except instead of just using the extra time to do scientific research we also make the appropriate changes to our societies/governments.