Relating HCH and Logical Induction

by abramdemski, 16th Jun 2020

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

I'd like to communicate a simple model of the relationship between logical induction and HCH which I've known about for some time. This is more or less a combination of ideas from Sam, Tsvi, and Scott, but I don't know that any of them endorse the full analogy as I'll state it.

HCH as Bounded Reflective Oracles

An HCH is a human consulting the computational power of an HCH. It's very natural to model this within a reflective oracle setting, so that we can talk about computations which have oracle access to the output of any other computation. However, reflective oracles (ROs) are too powerful to capture a realistic HCH: access to an RO is tantamount to logical omniscience, since you can ask the RO the output of any computation, no matter how large.

Instead, we can think of HCH as a human with access to a bounded reflective oracle. A bounded reflective oracle (BRO) imposes two limits. First, it limits the time spent by the Turing machine. This is similar to how HCH requires the human to return an answer within a set time period, such as an hour, exporting any additional computation required to answer the question to the HCH calls the human makes in that time. Second, it limits the size of the call trees which can be created as a consequence of the recursive calls an oracle machine makes. This is similar to how some versions of HCH give the human a limited number of recursive calls, which the human must then allocate among the HCH calls they make, bounding the total size of the HCH call tree.

So, we can think of HCH and BRO as interchangeable: an HCH is just a BRO computation which starts with a simulation of a human with access to a BRO, and with the property that the human only ever makes calls to simulations of themselves-with-access-to-BRO, recursively. Similarly, a BRO-machine can be computed within an HCH if we have the human carry out the steps of the BRO-machine. Whenever the BRO machine makes a BRO call, the human in the HCH makes a corresponding HCH call, asking another human to simulate another BRO machine and report back with the results.
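As a toy illustration of the budgeted version of HCH described above, here is a minimal sketch. Everything in it is an assumption made for illustration: the `human` policy, the list-summing task, and the rule of splitting the remaining call budget evenly are hypothetical, and real BROs are defined over oracle machines rather than Python functions. The point is only to show bounded per-step work combined with a bounded call tree.

```python
from typing import Callable, List, Optional

HCH = Callable[[List[int], int], Optional[int]]

def human(question: List[int], budget: int, consult: HCH) -> Optional[int]:
    """Hypothetical human policy: sum a list of integers.

    The human only does a bounded amount of work directly (lists of
    length <= 2) and exports everything else to recursive calls.
    `budget` is the number of recursive calls still available in this
    subtree; the human must allocate it among the calls they make.
    """
    if len(question) <= 2:
        return sum(question)
    if budget < 2:
        return None  # cannot afford the two sub-calls needed to delegate
    mid = len(question) // 2
    remaining = budget - 2            # two calls are spent at this node
    left_budget = remaining // 2
    right_budget = remaining - left_budget
    left = consult(question[:mid], left_budget)
    right = consult(question[mid:], right_budget)
    if left is None or right is None:
        return None
    return left + right

def hch(question: List[int], budget: int) -> Optional[int]:
    """The human consulting copies of themselves, recursively."""
    return human(question, budget, hch)

print(hch(list(range(8)), 6))  # 28: six calls are enough for eight items
print(hch(list(range(8)), 5))  # None: the call tree does not fit the budget
```

With a budget of 6 the call tree for eight items fits exactly (two calls at the root, two in each half). With a budget of 5 one subtree cannot afford to delegate, so the whole question goes unanswered.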

BROs Learning to Think

Bounded Oracle Induction (BOI) is a variant of logical induction based on BROs. Standard logical induction constructs market prices each day by finding the equilibrium prices, where all bets are balanced by opposite bets (or every "buy" has a corresponding "sell"). BOI uses BROs to find this equilibrium, so we can think of traders as probability distributions which can be computed via oracle access to the market (just as the market is something we can compute given oracle access to the traders).

Think of it this way. We want to use a BRO to answer questions. We know it's very powerful, but at first, we don't have a clue as to how to answer questions with it. So we implement a Bayesian mixture-of-experts, which we call the "market". Each "trader" is a question-answering strategy: a way to use the BRO to answer questions. We give each possible strategy for using the BRO some weight. However, our "market" is itself a BRO computation. So, each trader has access to the market itself (in addition to many other computations which the BRO can access for them). Some traders may mostly trust the market, providing only small adjustments to the market answer. Other traders may attempt to provide their own answers entirely from scratch, without asking the market.
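The self-reference in this picture (traders consult the market, and the market is a mixture of the traders) can be sketched with a one-question toy model. This is a simplified illustration and not the BOI construction itself: here each trader is assumed to be a continuous deterministic function from the market's probability to the trader's own probability, and the names `market_price`, `from_scratch`, `mostly_trusts`, and `contrarian` are hypothetical. Under those assumptions a consistent market price is a fixed point, which bisection can find.

```python
from typing import Callable, List

# A trader maps the market's probability for one question to its own probability.
Trader = Callable[[float], float]

def market_price(traders: List[Trader], weights: List[float], tol: float = 1e-9) -> float:
    """Find p in [0, 1] with p == sum_i weights[i] * traders[i](p).

    Assumes the weights sum to 1 and every trader is continuous with
    outputs in [0, 1]. Then mixture(0) - 0 >= 0 and mixture(1) - 1 <= 0,
    so bisection on the sign of the gap converges to a fixed point.
    """
    def gap(p: float) -> float:
        return sum(w * t(p) for w, t in zip(weights, traders)) - p

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Three hypothetical traders.
from_scratch: Trader = lambda p: 0.8                   # ignores the market entirely
mostly_trusts: Trader = lambda p: min(1.0, p + 0.05)   # small adjustment to the market
contrarian: Trader = lambda p: 1.0 - p                 # bets against the market

traders = [from_scratch, mostly_trusts, contrarian]
weights = [0.5, 0.3, 0.2]
price = market_price(traders, weights)
print(price)
```

For these particular traders the fixed point can be checked by hand: p = 0.5(0.8) + 0.3(p + 0.05) + 0.2(1 - p) gives 0.9p = 0.615, so p is about 0.683.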

Obviously, our initial market won't be very informative; it's just an arbitrary collection of traders. But we can get feedback, telling us how well we did on some of the questions we tried to answer. We use the logical induction algorithm (the LIA) to update the weights of each trader. This procedure has the advantage of satisfying the logical induction criterion (LIC): the market beliefs over time will not be too exploitable.
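To make the idea of reweighting traders from feedback concrete, here is a minimal sketch of a plain Bayesian mixture-of-experts update. This is a stand-in chosen for simplicity and is not the LIA: the actual logical induction algorithm works with trading strategies and prices, and the exploitability guarantee of the LIC does not follow from this update. The function name `update_weights` is hypothetical.

```python
from typing import List

def update_weights(weights: List[float], predictions: List[float], outcome: bool) -> List[float]:
    """Bayesian update of mixture weights after one question is resolved.

    predictions[i] is the probability trader i assigned to the question
    being true. Each weight is multiplied by the likelihood that trader
    assigned to what actually happened, then the weights are renormalized.
    """
    likelihoods = [p if outcome else 1.0 - p for p in predictions]
    unnormalized = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        return list(weights)  # every trader ruled the outcome out; leave weights unchanged
    return [u / total for u in unnormalized]

# Two traders with equal weight; the first was more confident in the true answer.
new_weights = update_weights([0.5, 0.5], [0.9, 0.6], outcome=True)
print(new_weights)  # approximately [0.6, 0.4]
```

Traders that assigned higher probability to the observed answer gain weight, so over repeated feedback the market leans on the strategies that have been working.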

So, a BOI is someone using this strategy to learn to use a BRO. Like HCH, a BOI on any one market day gives its opinion with-access-to-its-own-opinion: to answer any given question, the BOI can ask itself a series of other questions.

Unlike HCH, a BOI has a concept of learning to give better answers over time. An HCH is an amplified version of a fixed person at a fixed moment in time. It does not allow that person to learn better question-answering strategies.

In this view, an HCH is a freeze-frame of the logical-induction deliberation process. All the recursive calls of an HCH, building exponential-sized trees of cognitive labor, are considered "one cognitive moment" in logical induction terms.

Notions of Amplification

HCH gives us a notion of amplification assuming black-box access to an agent we want to amplify. Assuming we can steal the human question-answering strategy, HCH gives us a notion of much-better-thought-out answers to our questions. HCH does not rely on any formal notion of rationality, but assumes that the human question-answering strategy is competent in some sense, so that the amplified human which HCH gives us is highly capable.

Logical induction gives us a very different notion of amplification. The LIC is a rationality notion, telling us something about what it is to form better beliefs as we think longer. Assuming some things about the structure of an agent's beliefs, we can run the agent's thinking process forward to get an amplified version of it: what it would think if it considered every question more thoroughly. Unlike HCH, however, this would require much more than black-box access to the question-answering strategy. I don't (currently) know how to take a human and define their amplification within the BOI framework (or LIC/LIA).

It is a theorem that a logical inductor trusts its future opinions more than its present opinions. This gives a formal notion of trust in amplification. We don't get similar reassurances within an HCH framework.
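As a rough guide to what the self-trust property says, here is an informal paraphrase in symbols. It is a heuristic rendering and not the formal theorem: the precise statement in the logical induction literature is asymptotic and is phrased with expectations, continuous indicator functions, and a deferral function, all of which are suppressed here. The symbols $\mathbb{P}_n$ (the market's beliefs on day $n$), $\varphi$ (a sentence), and $p$ (a probability threshold) are notation assumed for this sketch.

```latex
% Informal paraphrase only: for a later day m > n, conditioning today's
% beliefs on "my future self will assign phi probability above p" yields
% a probability of roughly at least p.
\[
  \mathbb{P}_n\bigl(\varphi \,\bigm|\, \mathbb{P}_m(\varphi) > p\bigr) \;\gtrsim\; p
  \qquad \text{for } m > n .
\]
```

Read this way, the present market defers to what it expects its later, better-informed self to conclude.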

Furthermore, LIC gives us a notion of trust in a process. A logical inductor trusts each successive market state more. The process of logical induction doesn't have any nice termination point, but it does the best it can with its bounded resources at each step (in a certain sense provided by LIC).

HCH gives us an idealized fixed-point, rather than a notion of successive improvement. "HCH consulting HCH" is just HCH. However, we can try to climb up to that fixed-point by iterating: human-consulting-human, human-consulting-human-consulting-human, ... Certainly the hope is to get some kind of trust-in-the-process argument which works under plausible assumptions.
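The iteration toward the fixed point can be sketched as repeated application of a single "amplify" step. This is a toy under stated assumptions: the task (computing a factorial), the `base_agent`, and the rule that the human does one multiplication and delegates the rest are all hypothetical, chosen only so that the answer to a fixed question visibly stabilizes once the consultation depth is large enough.

```python
from typing import Callable, List, Optional

Agent = Callable[[int], Optional[int]]

def base_agent(n: int) -> Optional[int]:
    """A human with no one to consult: only knows the trivial cases."""
    return 1 if n <= 1 else None

def amplify(agent: Agent) -> Agent:
    """One step up: a human who may consult `agent` on a smaller question."""
    def human_consulting(n: int) -> Optional[int]:
        if n <= 1:
            return 1
        sub = agent(n - 1)  # delegate the rest of the work
        return None if sub is None else n * sub
    return human_consulting

def iterate(depth: int) -> Agent:
    """human-consulting-human-... nested `depth` times."""
    agent = base_agent
    for _ in range(depth):
        agent = amplify(agent)
    return agent

# Answers to the fixed question "5!" at consultation depths 0..6.
answers: List[Optional[int]] = [iterate(k)(5) for k in range(7)]
print(answers)  # [None, None, None, None, 120, 120, 120]
```

Below depth 4 the chain of consultations bottoms out before reaching a case anyone can answer. From depth 4 onward the answer no longer changes, which is the sense in which applying one more round of consultation to the fixed point returns the fixed point.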

Each and every market day in a BOI is already at the HCH fixed-point of rationality under self-consultation. This makes direct comparisons between the two notions of amplification trickier. One interpretation is that the LIC notion is a notion of amplification for HCH fixed points: once you've bootstrapped up to HCH, how do you go further? Logical induction gives a theory of what it means to get better at that point, and if we obey its rationality notion, we get some nice guarantees about self-trust and endorsing the process of improvement.

Why do we need a notion of further-amplifying-beyond-HCH? Because being the human in HCH is hard: you don't automatically know what strategy to use to answer every question, and there's a significant sense in which you can learn to be better.

Applications?

I don't know if this analogy between HCH and logical induction is useful for anything. It would be interesting to see a variant of IDA which didn't just approximate an HCH fixed-point, but instead somehow approximated the way a BOI learns to use BROs more effectively over time. It would be very interesting if some assumptions about the human (e.g., the assumption that human deliberation eventually notices and rectifies any efficiently computable Dutch-book of the HCH) could guarantee trust properties for the combined notion of amplification, along the lines of the self-trust properties of logical induction.

More broadly, it would be really nice if "the logical induction notion of amplification" I've outlined here could be turned into a real notion of amplification in the sense of HCH -- taking information about a human and using it to define an amplified human. (And not just "simulate the human thinking longer".)
