Btw was your goal to show me the link or to learn whether I have seen it before? If the former, then I don't need to respond. If the latter, then you want my response I guess.

[-]Robert Miles8y40

The "Whisky and Gold" environment is particularly relevant

[-]gwern8y40

It was partially to point out that you can get self-modification hazards with a substantially less complex setup than your proposal with a little hand-engineering of the agents; since none of the AI safety gridworld problems could be said to be rigorously solved, there's no need for more realistic self-modification environments.

[-]Caspar Oesterheld8y10

I list some relevant discussions of the "anvil problem" etc. here. In particular, Soares and Fallenstein (2014) seem to have implemented an environment in which such problems can be modeled.

Moderation Log

LESSWRONG
is fundraising!
LW

LESSWRONG
is fundraising!
LW

4

Idea: OpenAI Gym environments where the AI is a part of the environment

4

4