What are the causal effects of an agent's presence in a reinforcement learning environment?

Jonas Kgomo

[ Question ]


1 min read, 1st Mar 2022, 1 answer, 1 comment


Suppose we have an agent A trying to optimise a reward R in an environment S.
How can we tell that the presence of the agent does not affect the environment, and that a measurement (observation) depends not only on the agent but also on the environment?

This is related to the measurement problem in quantum mechanics. Suppose we have an agent (a particle) in a quantum superposition: consider an electron with two possible configurations, up and down. When we measure the state, the wavefunction collapses to a particular classical state, up or down.

Moreover, the observer effect notes that measurements of certain systems cannot be made without affecting the system, while the uncertainty principle says that certain pairs of quantities cannot both be predicted with arbitrary certainty.

Another way to state the problem: does measuring the state of the action affect the state of the environment?
How are the observations in physics different from the observations we make in RL?
Is the environment state $S_{1}$ a cause of the state $S_{2}$?
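One way to make the last question concrete is to treat the agent's action as an intervention and compare the distribution of $S_{2}$ under different actions from the same $S_{1}$. The sketch below is only an illustration under stated assumptions: the two-state environment, the `step` function, and the flip probabilities 0.2 and 0.9 are all hypothetical and not taken from any particular RL library.

```python
import random
from collections import Counter


def step(state, action, rng, agent_matters=True):
    """Hypothetical two-state environment with states 0 and 1.

    Without agent influence the state flips with probability 0.2.
    If the agent matters and picks action 1, it flips with probability 0.9.
    """
    flip_prob = 0.2
    if agent_matters and action == 1:
        flip_prob = 0.9
    return 1 - state if rng.random() < flip_prob else state


def next_state_distribution(state, action, agent_matters, n=20000, seed=0):
    """Monte Carlo estimate of P(S2 | S1 = state, do(action))."""
    # The same seed is reused for every call, so any difference between
    # two estimates comes from the arguments and not from sampling noise.
    rng = random.Random(seed)
    counts = Counter(step(state, action, rng, agent_matters) for _ in range(n))
    return {s: counts[s] / n for s in (0, 1)}


def total_variation(p, q):
    """Total variation distance between two distributions over {0, 1}."""
    return 0.5 * sum(abs(p[s] - q[s]) for s in (0, 1))


def action_effect(state, agent_matters):
    """How much the choice of action changes the next-state distribution."""
    p0 = next_state_distribution(state, 0, agent_matters)
    p1 = next_state_distribution(state, 1, agent_matters)
    return total_variation(p0, p1)


def state_effect(action, agent_matters):
    """How much S1 changes the distribution of S2 for a fixed action."""
    from_0 = next_state_distribution(0, action, agent_matters)
    from_1 = next_state_distribution(1, action, agent_matters)
    return total_variation(from_0, from_1)
```

In this toy setup, `action_effect` is zero when `agent_matters=False` and clearly positive otherwise, while `state_effect` is positive either way, which is one operational reading of "$S_{1}$ is a cause of $S_{2}$".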


1 answer, sorted by top scoring

TLW

Mar 02, 2022

Simply asking "does it affect the environment" is not enough here.

Leaking, say, 1 pJ of thermal energy to the surroundings in a slightly different manner in the two cases technically affects the surroundings quite significantly (thermal motion in a gas is chaotic, after all), but in practice we would tend to call this a minimal^[1] effect on the environment.

^[1] Assuming a human-livable environment at least. Obviously this might be different if e.g. your environment is all at 2 picokelvin or somesuch.
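The gap between "technically affects" and "minimal in practice" can be illustrated with a toy chaotic system. This is a rough sketch, not a model of a gas: it assumes the logistic map with parameter 3.9 as a stand-in for chaotic dynamics, and the function names and the perturbation size of 1e-12 are hypothetical choices made for illustration.

```python
def logistic_trajectory(x0, steps=5000, r=3.9):
    """Iterate the logistic map x -> r * x * (1 - x), a standard chaotic toy system."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def compare_perturbation(x0=0.2, eps=1e-12, steps=5000):
    """Return (micro_gap, macro_gap) for a baseline run and a slightly nudged run.

    micro_gap: largest pointwise difference between the two trajectories.
    macro_gap: difference between their long-run averages.
    """
    base = logistic_trajectory(x0, steps)
    nudged = logistic_trajectory(x0 + eps, steps)
    micro_gap = max(abs(a - b) for a, b in zip(base, nudged))
    macro_gap = abs(sum(base) / len(base) - sum(nudged) / len(nudged))
    return micro_gap, macro_gap


if __name__ == "__main__":
    micro_gap, macro_gap = compare_perturbation()
    print("largest pointwise difference:", micro_gap)
    print("difference in long-run mean:", macro_gap)
```

The exact trajectories diverge completely, yet a coarse summary such as the long-run mean barely moves. Which summaries count as "the environment" for the purposes of the question is a choice the questioner has to make.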

1 comment, sorted by top scoring


Charlie Steiner (2y)

You might also be interested in the question of whether the costs of the agent's own thinking can be included in an RL environment. Suppose I have a finite electricity budget and thinking harder will use electricity. It seems like I as a human am flexible enough to adjust my thinking style to some degree in response to constraints like this, but what happens to typical RL agents if they're given negative reward for running out of electricity?
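A minimal sketch of such a setup, assuming a purely hypothetical environment in which the agent chooses how many units of "thinking" to spend per step: the class name `BudgetedEnv`, the reward shape, and all constants are illustrative assumptions, not an existing API.

```python
class BudgetedEnv:
    """Hypothetical environment where thinking drains a finite energy budget."""

    def __init__(self, budget=10.0, think_cost=1.0, penalty=-5.0, max_steps=8):
        self.energy = budget
        self.think_cost = think_cost
        self.penalty = penalty
        self.max_steps = max_steps
        self.t = 0

    def step(self, think_units):
        """Spend think_units of effort; return (reward, done)."""
        self.t += 1
        self.energy -= self.think_cost * think_units
        if self.energy <= 0:
            # Running out of electricity ends the episode with a penalty.
            return self.penalty, True
        # Assumed reward shape: more thinking helps, with diminishing returns.
        reward = 1.0 - 0.5 ** think_units
        return reward, self.t >= self.max_steps


def episode_return(think_units):
    """Total reward for a fixed policy that always spends think_units per step."""
    env = BudgetedEnv()
    total, done = 0.0, False
    while not done:
        reward, done = env.step(think_units)
        total += reward
    return total


if __name__ == "__main__":
    for units in (0, 1, 2, 5):
        print(units, episode_return(units))
```

With these assumed numbers, a moderate amount of thinking per step gives the best return, while thinking harder exhausts the budget early and triggers the penalty. Whether a learned policy would discover that trade-off is the open question raised in the comment.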
