Edward Guo

Comments

EfficientZero: How It Works
Edward Guo, 4y

I would be careful about using reinforcement learning to test the theoretical limit on how much can be extracted from training data, given that agents generally do not start out with 0 bits of information about the environment. The shape of the input data and of the action space is itself useful information.

Even in designing the agent itself, it seems to me that general knowledge of human-related systems could be introduced into the architecture.

Selecting the architecture that gives us the highest upper bound on information utilization in a system is also, in some sense, inserting extra data.
