The concept is an abstraction.

*Yes, it is. The fact that it is an abstraction is precisely why it breaks down under certain circumstances.

An I/O channel doesn't imply modern computer technology. It just means information is collected from or imprinted upon the environment. It could be ant pheromones or it could be smoke signals; the physical implementation is secondary to the abstract concept of sending and receiving information of some kind. You're not seeing the forest for the trees. Information most certainly does exist.

The claim is not that "information" does not exist. The claim is that input/output channels are in fact an abstraction over more fundamental physical configurations. Nothing you wrote contradicts this, so the fact that you seem to think what I wrote was somehow incorrect is puzzling.

I've explained in previous posts that AIXI is a special case of AIXI-tl. AIXI-tl can be conceived of in an embedded context,

Yes.

in which case its model of the world would include a model of itself which is subject to any sort of environmental disturbance

*No. AIXI-tl explicitly does not model itself or seek to identify itself with any part of the Turing machines in its hypothesis space. The very concept of self-modeling is entirely absent from AIXI's definition, and AIXI-tl, being a variant of AIXI, does not include said concept either.
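For reference, here is a sketch of AIXI's action-selection rule as it is usually written in Hutter's formulation. None of this notation appears earlier in the thread, so treat the symbols as assumptions of the sketch: $U$ is a universal Turing machine, $q$ ranges over candidate environment programs, $\ell(q)$ is the length of $q$, $a_i$, $o_i$, $r_i$ are actions, observations, and rewards, and $m$ is the horizon.

```latex
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
       \bigl[ r_k + \cdots + r_m \bigr]
       \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

Note that each $q$ is only asked to map an action sequence to a percept sequence. The agent shows up solely as the actions fed into $q$ from outside; nothing in the expression requires any $q$ to contain, or be identified with, the machinery computing the $\arg\max$.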

To some extent, an agent must trust its own operation to be correct, because you quickly run into infinite regression if the agent is modeling all the possible ways that it could be malfunctioning. What if the malfunction affects the way it models the possible ways it could malfunction? It should model all the ways a malfunction could disrupt how it models all the ways it could malfunction, right? It's like saying "well the agent could malfunction, so it should be aware that it can malfunction so that it never malfunctions". If the thing malfunctions, it malfunctions; it's as simple as that.

*This is correct, so far as it goes, but what you neglect to mention is that AIXI makes no attempt to preserve its own hardware. It's not just a matter of "malfunctioning"; humans can "malfunction" as well. However, the difference between humans and AIXI is that we understand what it means to die, and go out of our way to make sure our bodies are not put in undue danger. Meanwhile, AIXI will happily allow its hardware to be destroyed in exchange for the tiniest increase in reward. I don't think I'm being unfair when I suggest that this behavior is extremely unnatural, and is not the kind of thing most people intuitively have in mind when they talk about "intelligence".

Aside from that, AIXI is meant to be a purely mathematical formalization, not a physical implementation. It's an abstraction by design. It's meant to be used as a mathematical tool for understanding intelligence.

*Abstractions are useful for their intended purpose, nothing more. AIXI was formulated as an attempt to describe an extremely powerful agent, perhaps the most powerful agent possible, and it serves that purpose admirably so long as we restrict analysis to problems in which the agent and the environment can be cleanly separated. As soon as that restriction is removed, however, it's obvious that the AIXI formalism fails to capture various intuitively desirable behaviors (e.g. self-preservation, as discussed above). As a tool for reasoning about agents in the real world, therefore, AIXI is of limited usefulness. I'm not sure why you find this idea objectionable; surely you understand that all abstractions have their limits?

Do you consider how the 30 watts leaking out of your head might affect your plans every day? I mean, it might cause a typhoon in Timbuktu! If you don't consider how the waste heat produced by your mental processes affects your environment while making long or short-term plans, you must not be a real intelligent agent...

Indeed, you are correct that waste heat is not much of a factor when it comes to humans. However, that does not mean that the same holds true for advanced agents running on powerful hardware, especially if such agents are interacting with each other; who knows what can be deduced from various side outputs, if a superintelligence is doing the deducing? Regardless of the answer, however, one thing is clear: AIXI does not care.

This seems to address the majority of your points, and the last few paragraphs of your comment seem mainly to be reiterating/elaborating on those points. As such, I'll refrain from replying in detail to everything else, in order not to make this comment longer than it already is. If you respond to me, you needn't feel obligated to reply to every individual point I made, either. I marked what I view as the most important points of disagreement with an asterisk*, so if you're short on time, feel free to respond only to those.

Decision Theory

by abramdemski, Scott Garrabrant · 1 min read · 31st Oct 2018 · 37 comments


Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

(A longer text-based version of this post is also available on MIRI's blog here, and the bibliography for the whole sequence can be found here.)

The next post in this sequence, 'Embedded Agency', will come out on Friday, November 2nd.

Tomorrow’s AI Alignment Forum sequences post will be 'What is Ambitious Value Learning?' in the sequence 'Value Learning'.