Thanks for engaging with my post. From my perspective, you simply seem very optimistic about what kind of data can be extracted from unspecific measurements. Here is another good example of how Eliezer makes some pretty out-there claims about what might be possible to infer from very little data: https://www.lesswrong.com/posts/ALsuxpdqeTXwgEJeZ/could-a-superintelligence-deduce-general-relativity-from-a -- I wonder what your intuition says about this?
But maybe your intuitions are wrong (or maybe both). I think a desirable property of plans/strategies for alignment would be robustness to either of us being wrong about this 🙂
Generally it is a good idea for plans to be robust. However, in this specific instance, the way Eliezer phrases it, any iterative plan for alignment would be excluded. Since I also believe that an iterative approach is the only realistic plan (there will simply never be a design that has the properties that Eliezer thinks guarantee alignment), the only realistic remaining path would be a permanent freeze (which I actually believe comes with large risks as well: unenforceability and thus worse actors making ASI first, biotech in the wrong hands becoming a larger threat to humanity, etc.).
What I would agree with is that it is good to plan for the eventuality that an ASI could need a lot less data to do something like "create nanobots". For example, we could conclude that for now it is simply a bad idea to use AI in biotech labs, because these are the places where it could easily gather a lot of data and maybe even influence experiments so that they let it learn the things it needs to create nanobots. Similarly, we could try to create a worldwide warning system around technologies that seem likely to be necessary for an AI takeover, and watch these closely, so that we would notice any specific experiments. However, there is no way to scale this to a one-shot scenario.
Eliezer's scenario does assume the involvement of human labs (he describes a scenario where DNA is ordered online).
His claim is that an ASI will order some DNA and get some scientists in a lab to mix it together with some substances and create nanobots. That is what I describe as a one-shot scenario. Even if it were 10,000 shots in parallel I simply don't think it is possible, because I don't think the data itself is out there. Similarly to how you need accelerators to work out how physical laws work in high energy regimes (and random noise from other measurements just tells you nothing about it), if you are planning to design a completely new type of molecular machinery then you will need to do measurements on those specific molecules. So there will need to be a feedback loop, where the AI can learn detailed outcomes from experiments to gain more data.
I agree with you here (although I would hope that much of this iteration can be done in quick succession, and hopefully in a low-risk way) 🙂
It's great that we agree on this :) And I do agree on finding ways to make this lower risk, and I think taking into account few-shot learning scenarios on biotech would be a good idea. And don't get me wrong -- there may be biotech scenarios available today that kill a lot of humans with very few shots, probably even without any ASI (humans can do it). I just think if an AI executed one today it would have no way of surviving and expanding.
Engineers, however, can constrain and master this sort of unpredictability. A pipe carrying turbulent water is unpredictable inside (despite being like a shielded box), yet can deliver water reliably through a faucet downstream. The details of this turbulent flow are beyond prediction, yet everything about the flow is bounded in magnitude, and in a robust engineering design the unpredictable details won’t matter.
This is absolutely what engineers do. But finding the right design patterns that do this involves a lot of experimentation (not for a pipe, but for constructing e.g. a reliable transistor). If someone eventually constructs non-biological self-replicating nanobots, it will probably involve high-reliability design patterns around certain molecular machinery. However, finding the right molecules that reliably do what you want, as well as how to put them together, etc., is a lot of research that I am pretty certain will involve actually producing those molecules and doing experiments with them.
That protein folding is "solved" does not disprove this IMO. Biological molecules are, after all, made from simple building blocks (amino acids) with some very predictable properties (how they stick together), so the problem is already vastly simplified. And solving protein folding (as far as I know) does not solve the question of understanding what molecules actually do -- I believe understanding protein function is still vastly less developed (correct me if I'm wrong here, I haven't followed it in detail).
I think 1-4 are good summaries of the arguments I'm making about nanobots. I would add another point: the reason it is hard to make nanobots is not a lack of computational ability (although that could also be a bottleneck) but simply a lack of knowledge about the physical world, which can only be resolved by learning more about the physical world in a way that is relevant to making nanobots.
On point 5, from my current perspective, I think the idea of pivotal acts is totalitarian, not a good idea and most likely to screw things up if ever attempted. So I wasn't mainly trying to make a statement about them here (that would be another post). I was making a side argument about them that is roughly summarized in 5 -- giving an AI full physical capabilities seems like a very dangerous step and if it is part of your plan for a pivotal act you should be especially worried that you are making things worse.