Akira Pyinya
Comments

ACI#8: Value as a Function of Possible Worlds
Akira Pyinya · 1y · 10

Thank you for introducing Richard Jeffrey's theory! I just read some articles about his system and I think it's great. His utility theory built upon propositions is just what I want to describe. However, his theory still starts from given preferences without showing how we can get those preferences (although they should satisfy certain conditions), and my article argues that such preferences cannot be estimated using the Monte Carlo method.

Actually, ACI is an approach that can assign utility (preferences) to every proposition by estimating its probability of "being the same as the examples of right things". In other words, as long as we have examples of doing the right things, we can estimate the utility of any proposition using algorithmic information theory. And that is actually how organisms learn from evolutionary history.
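
As a rough, hedged illustration of what "estimating similarity to the examples" could look like in practice (not ACI's actual implementation): algorithmic information shared between strings is incomputable, but a compression-based proxy such as normalized compression distance can stand in for it. The function names and scoring rule below are my own assumptions.

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: a computable proxy for how much
    algorithmic information two strings share (0 = very similar)."""
    cx, cy = len(zlib.compress(x)), len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

def utility_estimate(proposition: str, examples: list[str]) -> float:
    """Score a proposition by its closeness to known examples of
    'doing the right thing' (higher = more similar to some example)."""
    distances = [ncd(proposition.encode(), e.encode()) for e in examples]
    return 1.0 - min(distances)

# Hypothetical usage: examples of 'right' behaviour, then a candidate proposition.
examples = ["move toward the nutrient gradient", "move away from toxins"]
print(utility_estimate("move toward the food source", examples))
```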

I temporarily call this approach Algorithmic Common Intelligence (ACI) because its mechanism is similar to the common law system. I am still refining it by reading other theories and writing programs based on it, which is why my older articles about ACI may contain many errors.

Again, thank you for your comment! I hope you can give me more advice.

ACI#6: A Non-Dualistic ACI Model
Akira Pyinya · 1y · 10

Thank you for your reply!

"The self in 10 minutes" is a good example of revealing the difference between ACI and the traditional rational intelligence model. In the rational model, the input information is send to atom-like agent, where decisions are made based on the input.

But ACI holds that this is not how real-world agents work. An agent is a complex system made up of many different parts and levels: the heart receives mechanical, chemical, and electrical information from its past self and continues beating, though with different heart rates due to outside influences; a cell keeps running its metabolic and functional processes, which are determined by its past situation and affected by its neighbors and by chemicals in the blood; finally, the brain outputs neural signals based on its past state and new sensory information. In other words, the brain has mutual information with its past self, the body, and the outer world, but that is only a small part of the mutual information between my present self and me in 10 minutes.

In other words, the brain uses only a tiny part of the information an agent uses. Furthermore, when we talk about awareness, I am aware of only a tiny part of the information processing in my brain.

An agent is not like an atom, but like an onion with many layers. Decisions are made in parallel across these layers, and we are aware of only a small fraction of them. It is not even possible to draw a solid boundary between awareness and non-awareness.

On the second question: a stable object may have high mutual information with itself at different times, but it may also have high mutual information with other agents. For example, a rock may be stable in size and shape, yet its position and movement may depend heavily on outside natural forces and human behavior. However, the definition of agency is more complex than this; I will try to discuss it in future posts.

(A Failed Approach) From Precedent to Utility Function
Akira Pyinya · 2y · 10

Yes, of course some species went extinct, but that is exactly why organisms today do not carry their genes (which contain information about how to live). Conversely, every ancestor of a living organism survived to reproduce.

Beyond Rewards and Values: A Non-dualistic Approach to Universal Intelligence
Akira Pyinya · 2y* · 10

I think I have already responded to that part. Who is the "caretaker that will decide what, when and how to teach the ACI"? The answer is natural selection or artificial selection, which work like filters. AIXI's "constructive, normative aspect of intelligence is 'assumed away' to the external entity that assigns rewards to different outcomes", while ACI's constructive, normative aspect of intelligence is likewise assumed away to the environment, which has determined which behaviors were acceptable and which would take a potential ancestor out of the gene pool. Since the reward circuit of natural intelligence is itself shaped by natural selection, ACI is also eligible to be a universal model of intelligence.

 

Thank you for your correction about the Active Inference reading; I will read more and then respond to that.

Beyond Rewards and Values: A Non-dualistic Approach to Universal Intelligence
Akira Pyinya · 2y · 10

Thank you for your comment. I have spent some time reading the book Active Inference. I think active inference is a great theory, but it focuses on aspects of intelligence different from those ACI addresses.

ACI learns to behave the same way as its examples, so it can also learn ethics from examples. For instance, if behaviors like "getting into a very cold environment" are excluded from all the examples, whether by natural selection or by artificial selection, an ACI agent can learn an ethic like "always get away from cold" and apply it in the future. If you want to arrive at new ethics, you have to either induce them from the old ones or learn them from selection in something like "supervised stages".

Unlike Active Inference or the "AIXI+ValueLearning" combination, ACI does not divide the learning process into "information gain" and "pragmatic value learning", but learns them as a whole. For example, bacteria can learn policies such as following nutrient gradients from successful behaviors validated by natural selection.

The problem with dividing the learning process is that, without value learning, we do not know which information is important and needs to be learned, yet without enough information it is difficult to understand values. That is why active inference says that "pragmatic and epistemic values need to be pursued in tandem". The ACI model, however, works in a slightly different way: it induces and updates policies directly from examples, and performs epistemic learning only when the policy calls for it, such as when the policy involves pursuing some goal state.
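
As a hedged toy sketch of that loop (the data format, names, and the "reach a goal state" convention are my own assumptions, not ACI's actual code): policies are induced directly from example (situation, action) pairs, and information is gathered only when the chosen policy entry requires pursuing a goal state.

```python
def induce_policy(examples):
    """Map each situation seen in the examples to the action taken there."""
    return {situation: action for situation, action in examples}

def step(policy, situation, sense):
    """Act from the induced policy; call sense() (an epistemic step)
    only when the policy entry asks to pursue a goal state."""
    action = policy.get(situation, "do_nothing")
    if isinstance(action, tuple) and action[0] == "reach":
        goal = action[1]
        current = sense()  # gather information only because the policy asks to
        action = "stay" if current == goal else "move_toward_" + goal
    return action

# Hypothetical usage: a bacterium-like agent following a nutrient gradient.
examples = [("low_nutrient", ("reach", "high_nutrient")), ("high_nutrient", "stay")]
policy = induce_policy(examples)
print(step(policy, "low_nutrient", sense=lambda: "low_nutrient"))  # -> "move_toward_high_nutrient"
```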

In the active inference model, both information gain and action are seen as "minimizing the discrepancy between our model and our world through perception and action". For example, when a person senses that his body temperature is much higher than expected, he could either change his model of his body temperature or take action to lower his body temperature. He always chooses the latter, because "we are considerably more confident about our core temperature because it underwrites our existence". In the words of ACI, that is: "from experience (including genetic information), we know that as long as we (and our homeotherm ancestors) are alive, we always act to keep our core temperature the same".

To describe this body temperature example better, I suggest a small improvement to the active inference model: "minimizing the discrepancy between our model and our world" should include minimizing the discrepancy between the model of our action and our actual action. In this example, the person can either act to lower his temperature or not act; the former minimizes the discrepancy between the model of our action (always keep our core temperature the same) and our actual action.
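
A toy numeric sketch of that suggestion (the numbers and the discrepancy measure are my own illustrative assumptions, not a real active-inference computation): acting to cool down wins because it reduces both the expected temperature error and the mismatch with the self-model "I always act to keep my core temperature the same".

```python
EXPECTED_TEMP = 37.0   # the model: core temperature stays at 37 C
SENSED_TEMP = 39.0     # current (elevated) sensation

def discrepancy(predicted_temp_after, acted_to_cool):
    """Total discrepancy = temperature error + mismatch with the action model."""
    temp_error = abs(predicted_temp_after - EXPECTED_TEMP)
    # The self-model says "I always act to keep my temperature the same",
    # so not acting is itself a discrepancy with the model of our action.
    action_error = 0.0 if acted_to_cool else 1.0
    return temp_error + action_error

options = {
    "act to cool down": discrepancy(predicted_temp_after=37.5, acted_to_cool=True),
    "do not act":       discrepancy(predicted_temp_after=SENSED_TEMP, acted_to_cool=False),
}
print(min(options, key=options.get))   # -> "act to cool down"
```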

Why Free Will is NOT an illusion
Akira Pyinya · 3y · 10

In the post I was just trying to describe internal unpredictability in a deterministic universe, so I think I have already made a distinction between predictability and determinism. The main disagreement between us is which one is more closely related to free will. Thank you for pointing this out; I will focus on this topic in the next post.

Why Free Will is NOT an illusion
Akira Pyinya · 3y · 10

I want to define the "degree of free will" as follows: for a given observer B, it is the lower limit of event A's unpredictability. This observer does not have to be human; it can be an intelligent agent with infinite computing power that simply does not have enough information to make the prediction. The degree of free will could be very low for an event very close to the observer, but impossible to ignore when, say, controlling a Mars rover. I don't know whether anybody has ever described this quantity (something like the opposite of a Markov blanket?); if you know, please tell me.
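
One hedged way to write this down (my own notation, not an established definition):

$$\mathrm{FW}(A \mid B) \;=\; \inf_{\hat{A}} \Pr\!\big[\hat{A}(I_B) \neq A\big],$$

where $I_B$ is all the information available to observer B and the infimum ranges over every predictor $\hat{A}$, no matter how much computing power it uses; the quantity is nonzero exactly when $I_B$ is insufficient to pin A down.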

I like your distinction between "experiential free will" and "deterministic free will". As for "experiential free will", maybe we can focus more on the definitions of "reality" and "illusion". (I am still working on that.)

Why Free Will is NOT an illusion
Akira Pyinya · 3y · 10

You are right; this picture only works in an infinite universe.

Posts

ACI#9: What is Intelligence (3 points · 7mo · 0 comments)
ACI#8: Value as a Function of Possible Worlds (6 points · 1y · 2 comments)
ACI#6: A Non-Dualistic ACI Model (10 points · 2y · 2 comments)
ACI#5: From Human-AI Co-evolution to the Evolution of Value Systems (0 points · 2y · 0 comments)
ACI#4: Seed AI is the new Perpetual Motion Machine (-1 points · 2y · 0 comments)
ACI #3: The Origin of Goals and Utility (1 point · 2y · 0 comments)
(A Failed Approach) From Precedent to Utility Function (0 points · 2y · 2 comments)
A Brief Introduction to ACI, 2: An Event-Centric View (3 points · 2y · 0 comments)
A Brief Introduction to Algorithmic Common Intelligence, ACI . 1 (-2 points · 2y · 1 comment)
The Ethics of ACI (-8 points · 2y · 0 comments)