In Can HCH epistemically dominate Ramanujan? Alex Zhu wrote:
If HCH is ascription universal, then it should be able to epistemically dominate an AI theorem-prover that reasons similarly to how Ramanujan reasoned. But I don’t currently have any intuitions as to why explicit verbal breakdowns of reasoning should be able to replicate the intuitions that generated Ramanujan’s results (or any style of reasoning employed by any mathematician since Ramanujan, for that matter).
And I answered:
My guess is that HCH has to reverse engineer the theorem prover, figure out how/why it works, and then reproduce the same kind of reasoning.
And then I followed up my own comment with:
It occurs to me that if the overseer understands everything that the ML model (that it’s training) is doing, and the training is via some kind of local optimization algorithm like gradient descent, the overseer is essentially manually programming the ML model by gradually nudging it from some initial (e.g., random) point in configuration space.
No one answered my comments with either a confirmation or denial, as to whether these guesses of how to understand Universality / Informed Oversight and IDA are correct. I'm surfacing this question as a top-level post because if "Informed Oversight = reverse engineering" and "IDA = programming by nudging" are good analogies for understanding Informed Oversight and IDA, it seems to have pretty significant implications.
In particular it seems to imply that there's not much hope for IDA to be competitive with ML-in-general, because if IDA is analogous to a highly constrained method of "manual" programming, that seems unlikely to be competitive with less constrained methods of "manual" programming (i.e., AIs designing and programming more advanced AIs in more general ways, similar to how humans do most programming today), which itself is presumably not competitive with general (unconstrained-by-safety) ML (otherwise ML would not be the competitive benchmark).
If these are not good ways to understand IO and IDA, can someone please point out why?