Humans can be assigned any values whatsoever...
Khoth · 8y · 20

I think there are two ways that a reward function can be applicable:

1) For making moral judgements about how you should treat your agent. Probably irrelevant for your button presser unless you're a panpsychist.

2) If the way your agent works is by predicting the consequences of its actions and attempting to pick an action that maximises some reward (e.g. a chess computer trying to maximise its board valuation function). Your agent H as described doesn't work this way, although as you note there are agents which do act this way and produce the same behaviour as your H (see the first sketch below).

There's also a third, kind-of option:

3) Anything can be modelled as if it had a utility function, in the same way that any solar system can be modelled as a geocentric one with enough epicycles. In this case there's no "true" reward function, just "the reward function that makes the maths I want to do as easy as possible". Which one that is depends on what you're trying to do, and maybe pretending there's a reward function isn't actually better than using H's true non-reward-based algorithm (the second sketch below builds such a vacuous reward function explicitly).
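
To make option 2 concrete, here's a minimal Python sketch (all names are toy ones I've made up, not anything from your post): one agent explicitly predicts each action's reward and picks the argmax, the other is a plain state-to-action table with no reward anywhere in its code, and the two are behaviourally identical.

```python
# Contrast a reward-maximising agent (option 2) with an H-style agent:
# same behaviour, but only one of them consults a reward function.

ACTIONS = ["press", "wait"]

def reward(state: int, action: str) -> int:
    """Toy reward: pressing pays off exactly in even-numbered states."""
    return 1 if (action == "press") == (state % 2 == 0) else 0

def maximiser_agent(state: int) -> str:
    """Option 2: evaluate each action's predicted reward, pick the best."""
    return max(ACTIONS, key=lambda a: reward(state, a))

# An H-style agent: a hard-coded policy table, consulting no reward at all.
H_POLICY = {s: ("press" if s % 2 == 0 else "wait") for s in range(10)}

def agent_h(state: int) -> str:
    return H_POLICY[state]

# Behaviourally indistinguishable on these states, so behaviour alone
# can't settle whether a reward function is "really" being maximised.
assert all(maximiser_agent(s) == agent_h(s) for s in range(10))
```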
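And for option 3, a second self-contained sketch (again with made-up toy names): for any deterministic policy whatsoever, you can construct a reward function it maximises by paying 1 for whatever action the policy takes and 0 otherwise. The fit is perfect but vacuous, which is the epicycles point.

```python
# Option 3: any policy trivially "maximises" the reward function that
# rewards doing whatever it was going to do anyway.

ACTIONS = ["press", "wait"]

def vacuous_reward_for(policy):
    """Build a reward function that the given policy maximises by construction."""
    def reward(state, action):
        return 1 if action == policy(state) else 0
    return reward

# Any policy at all, e.g. one that presses the button in even states:
policy_h = lambda s: "press" if s % 2 == 0 else "wait"
r = vacuous_reward_for(policy_h)

# policy_h attains the maximum of r in every state, so it counts as a
# "reward maximiser" -- for a reward chosen purely to make that true.
assert all(r(s, policy_h(s)) == max(r(s, a) for a in ACTIONS)
           for s in range(10))
```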

Open thread, September 25 - October 1, 2017
Khoth · 8y · 0

Fire Emblem: Heroes
