LESSWRONG
LW

1638
KanHar
6010
Message
Dialogue
Subscribe

Posts

Sorted by New

Wikitag Contributions

Comments

Sorted by
Newest
No wikitag contributions to display.
Models Don't "Get Reward"
KanHar3y70

Considering the fact that weight sharing between actor and critic networks is a common practice, and given the fact that the critic passes gradients (learned from the reward) to the actor, for most practical purposes the actor gets all of the information it needs about the reward.

This is the case for many common architectures.

Reply
No posts to display.