Can We Do Without Bridge Hypotheses?

Rob Bensinger

Followup to: Building Phenomenological Bridges, Reductionism

Bridge hypotheses are extremely awkward. It's risky to draw permanent artificial lines between categories of hypothesis ('physical' vs. 'bridge'). We might not give the right complexity penalties to one kind of hypothesis relative to the other. Or we might implement a sensible framework for bridge hypotheses in one kind of brain that fails to predict the radically new phenomenology that results from expanding one's visual cortex onto new hardware.

We'd have to hope that it makes sense to talk about 'correct' bridging rules (correctly relating a hypothesis about external stimuli or about transistors composing yourself, to which settings are in fact the ones you call 'green'), even though they're quite different from ordinary physical descriptions of the world. And, since fully general and error-free knowledge of the phenomenologies of possible agents will probably not be available to a seed AGI or to its programmers, we'd have to hope that it's possible to build a self-modifying inductor robust enough that mistaken bridge predictions would just result in a quick Bayesian update towards better ideas. It's definitely a dangling thread.
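To make the hoped-for "quick Bayesian update" concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not a proposal for how a seed AGI would do this: the physical states, the three candidate bridges, the uniform prior, and the `NOISE` parameter are all invented for illustration.

```python
from fractions import Fraction

# Hypothetical toy data: physical states the agent believes it was in,
# paired with the colors it actually experienced at those times.
physical_states = ["high", "low", "high", "high", "low"]
observed_colors = ["green", "red", "green", "green", "red"]

# Candidate bridge hypotheses: each maps a physical state to a predicted color.
bridges = {
    "high->green": {"high": "green", "low": "red"},
    "high->red": {"high": "red", "low": "green"},
    "always-green": {"high": "green", "low": "green"},
}

NOISE = Fraction(1, 10)  # assumed chance that a bridge's prediction is wrong


def likelihood(bridge, state, color):
    """P(color | state, bridge), with a small allowance for error."""
    return 1 - NOISE if bridge[state] == color else NOISE


def update(prior, state, color):
    """One step of Bayes' rule over the candidate bridge hypotheses."""
    unnormalized = {
        name: p * likelihood(bridges[name], state, color)
        for name, p in prior.items()
    }
    total = sum(unnormalized.values())
    return {name: p / total for name, p in unnormalized.items()}


# Start from a uniform prior (an assumption) and update on each observation.
posterior = {name: Fraction(1, len(bridges)) for name in bridges}
for state, color in zip(physical_states, observed_colors):
    posterior = update(posterior, state, color)

best = max(posterior, key=posterior.get)
print(best, float(posterior[best]))
```

In this toy case the mistaken bridges lose almost all of their probability after five observations. Whether anything like this robustness carries over to a self-modifying inductor is exactly the open question.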

Why, then, can't we do without them? Maybe they're a handy heuristic for agents with incomplete knowledge — but can they truly never be eliminated?

The notion of an irreducible divide between an AI's subjective sensations and its models of the objective world may sound suspiciously dualistic. If we live in a purely physical world, then why shouldn't a purely physical agent, once it’s come to a complete understanding of itself and the world, be able to dispense with explicit bridges? These are, after all, the agent's beliefs that we're talking about. In the limit, intuitively, accurate beliefs should just look like the world. So shouldn't the agent's phenomenological self-models eventually end up collapsing into its physical world-models — dispensing with a metaphysically basic self/world distinction?¹

Yes and no. When humans first began hypothesizing about the relationship between mind and matter, the former domain did not appear to be reducible to the latter. A number of philosophers concluded from this that there was a deep metaphysical divide between the two. But as the sciences of mind began to erode that belief in mind-matter dualism, they didn't eliminate the conceptual, linguistic, or intuitive distinctness of our mental and physical models. It may well be that we'll never abandon an intentional stance toward many phenomena, even once we've fully reduced them to their physical, biological, or computational underpinnings. Models of different levels can remain useful even once we've recognized that they co-refer.

In the case of an artificial scientist, beliefs in a fundamental sensation-v.-world dichotomy may dissolve even if the agent retains a useful conceptual distinction between its perceptual stream and the rest of the world. A lawful, unified physics need not be best modeled by agents with only a single world-modeling subprocess. 'There is one universe' doesn't imply 'one eye is optimal for viewing the universe'; 'there is one Earth' doesn't imply 'one leg is optimal for walking it'. The cases seem different chiefly because the leg/ground distinction is easier for humans to keep straight than the map/territory distinction.

Empirical reasoning requires a representational process that produces updates, and another representational process that gets updated. Eliminate the latter, and gone is the AI’s memory and expectation. (Imagine Cai experiencing its sequence of colors forever without considering any states of affairs they predict.) Eliminate the former, and the AGI has nothing but its frozen memories. (Imagine Cai without any sensory input, just a floating array of static world-models.) Keep both and eliminate bridging, and Cai painstakingly collects its visual data only to throw it all away; it has beliefs, but it never updates them.
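The three ablations above can be sketched in a few lines of Python. This is only a toy illustration, not a description of how Cai is actually specified: the function `run`, its flags, the two-state world ('bright' vs. 'dark'), and the 0.8/0.2 likelihoods are all assumptions made up for the example.

```python
def run(percepts, use_percepts=True, use_beliefs=True, use_bridge=True):
    """Return the agent's final P(world is 'bright'), or None if it has no beliefs."""
    # The process that gets updated: a single credence, starting at 0.5.
    belief = 0.5 if use_beliefs else None
    for color in percepts:
        if not use_percepts:
            break      # no update-producing process: only frozen memories remain
        if belief is None:
            continue   # colors are experienced, but there is nothing to update
        if not use_bridge:
            continue   # data is collected and then thrown away
        # The bridge: assumed likelihoods linking the world-state to the color seen.
        p_if_bright = 0.8 if color == "white" else 0.2
        p_if_dark = 0.2 if color == "white" else 0.8
        belief = belief * p_if_bright / (
            belief * p_if_bright + (1 - belief) * p_if_dark
        )
    return belief


percepts = ["white", "white", "white"]
print(run(percepts))                      # full agent: belief moves toward 'bright'
print(run(percepts, use_beliefs=False))   # no memory or expectation
print(run(percepts, use_percepts=False))  # frozen world-model
print(run(percepts, use_bridge=False))    # beliefs that never update
```

Only the first call changes the agent's credence. The other three either have no credence to change or leave it at its starting value, which is the sense in which all three components seem to be doing work.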

Can we replace perceptions and expectations with a single kind-of-perceptiony kind-of-expectationish epistemic process, in a way that obviates any need for bridge hypotheses?

Maybe, but I don't know what that would look like. An agent's perceptions and its hypotheses are of different types, just by virtue of having distinct functions; and its meta-representations must portray them as such, lest its metacognitive reasoning fall into systemic error. Striving mightily to conflate the two may not make any more sense than striving to get an agent to smell colors or taste sounds.²

The only candidate I know of for a framework that may sidestep this distinction without thereby catching fire is Updateless Decision Theory, which was brought up by Jim Babcock, Vladimir Slepnev, and Wei Dai. UDT eliminates the need for bridge hypotheses in a particularly bold way, by doing away with updatable hypotheses altogether.

I don't understand UDT well enough to say how it bears on the problem of naturalizing induction, but I may return to this point when I have a better grasp on it. If UDT turns out to solve or dissolve the problem, it will be especially useful to have on hand a particular reductionism-related problem that afflicts other kinds of agents and is solved by UDT. This will be valuable even if UDT has other features that are undesirable enough to force us to come up with alternative solutions to naturalized induction.

For now, I'll just make a general point: It's usually good policy for an AGI to think like reality; but if an introspectible distinction between updatable information and update-causing information is useful for real-world inductors, then we shouldn't strip all traces of it from artificial reasoners, for much the same reason we shouldn't reduce our sensory apparatuses to a single modality in an attempt to ape the unity of our world's dynamics. Reductionism restricts what we can rationally believe about the territory, but it doesn't restrict the idiom of our maps.

¹ This is close to the worry Alex Flint raised, though our main concern is with the agent's ability to reduce its own mental types, since this is a less avoidable problem than a third party trying to do the same.

² The analogy to sensory modality is especially apt given that phenomenological bridge hypotheses can link sensory channels instead of linking a sensory channel to a hypothesized physical state. For instance, 'I see yellow whenever I taste isoamyl acetate' can function as a bridge between sensations an agent types as 'vision' and sensations an agent types as 'taste'.

Could the “correct” bridge hypothesis change if part of the agent is destroyed, or, if not, would it require a more complex bridge hypothesis (that is never verified in practice)?

For an agent that can die or become fully unconscious, a complete and accurate bridge hypothesis should include conditions under which a physical state of the world corresponds to the absence of any introspection or data. I'll talk about a problem along these lines for AIXI in my next post.

In this respect a bridge hypothesis is similar to a physical hypothesis. You might update it when you learn something new about death, but you of course can't update after dying, so any correct physical, mental, or bridging belief about death will have to be prospective.

Is it supposed to be possible to define a single “correct” bridge mapping for some other agent than self?

I'm not sure about the 'single correct' part, but yes, you can have hypotheses about the link between an experience in another agent and the physical world. In some cases it may be hard to decide whether you're hypothesizing about a different agent's phenomenology, or about the phenomenology of a future self.

You can also hypothesize about the link between unconscious computational states and physical states, in yourself or others. For instance, in humans we seem to be able to have beliefs even when we aren't experiencing having them. So a fully general hypothesis linking human belief to physics wouldn't be a 'phenomenological bridge hypothesis'. But it might still be a 'computational bridge hypothesis' or a 'functional bridge hypothesis'.

Is the location of the agent in a world a part of the bridge hypothesis or a given?

I'll talk about this a few posts down the line. Indexical knowledge (including anthropics) doesn't seem to be a solved problem yet.

The formalism I introduced in http://lesswrong.com/lw/h4x/intelligence_metrics_and_decision_theories/ studies intelligence without an explicit division of epistemic hypotheses into "universe" and "bridge". This is achieved by starting with a certain "innate model" of reality which already contains bridging rules. All epistemic hypotheses are "mapped" onto the innate model in some sense.

Bridge hypotheses don't seem to me to be the kind of thing one ought to even try to get rid of. The idea of having a physical layer is a powerful one, and you'll need to bridge from it to your perceptions.

AIXI ignores models of the world and only knows about perceptions. There is an analogous agent who only "knows" about the world. For this agent, every perception is represented as a change in the model of the world.
