Steven Weiss

Comments

Humans are very reliable agents
Steven Weiss · 3y · 10

Solid post, but I think mentioning GPT in particular distracts from the main point. GPT is a generative model with no reward function, meaning it has no goal it's optimizing for. It's not engineered for reliability (or any other objective), so it's not meaningful to compare its performance against humans on goal-oriented tasks.

Humans are very reliable agents
Steven Weiss · 3y* · 00

If you assume a 1:20,000 random suicide rate and that 40% of people could kill themselves within a minute (roughly the US gun ownership rate), then the odds of doing it in any given waking minute are ~1 in 20,000 * 60 * 16 * 365 * 0.4 ≈ 1 in 3,000,000,000, i.e. a per-decision reliability of ~99.99999997%.
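
A quick sanity check of that arithmetic (a sketch in Python; the 1:20,000 annual rate, 16 waking hours a day, and 40% quick-access figures are the comment's assumptions, not measured data):

```python
# Per-decision odds of suicide, under the comment's assumptions.
annual_rate = 1 / 20_000        # assumed "random" suicide rate per person-year
waking_minutes = 60 * 16 * 365  # minutes per year at 16 waking hours/day
quick_access = 0.4              # assumed fraction who could act within a minute

per_decision = annual_rate / (waking_minutes * quick_access)
print(f"odds of doing it: 1 in {1 / per_decision:,.0f}")  # 1 in 2,803,200,000
print(f"reliability: {1 - per_decision:.8%}")             # ~99.99999996%
```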

IIUC, people aren't deciding whether to kill themselves once a minute, every minute. The thought only comes up when things are really rough, and thinking about it can take hours or days. That's probably a nitpick.

More importantly, an agent optimizing for not intentionally shooting itself in the face would probably be much more reliable at it than a human. It just has to sit still.

If you look at RL agents in simulated environments where death is possible (e.g. Atari games), the top agents outperform most human players at not dying in most games. E.g. MuZero's average score in Space Invaders is several times the average human baseline, which would require it to die less often on average.

So when an agent is trained not to die, it can be very effective at it.
