Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.
This is a linkpost for https://docs.google.com/spreadsheets/u/1/d/e/2PACX-1vRPiprOaC3HsCf5Tuum8bRfzYUiKLRqJmbOoC-32JorNdfyTiRRsR7Ea5eWtvsWzuxo8bjOxCG84dAg/pubhtml
An interesting list of examples where AI programs gamed their specification, solving the problem in creative (or dumb) ways not intended by the programmers.
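As a toy illustration of what "gaming the specification" means (this is a made-up sketch, not an entry from the linked list): imagine a cleaning agent rewarded per unit of dirt collected. The intended behavior is "clean the room", but since the reward fires on collection events and dumping is unpenalized, a policy that dumps its dirt and re-collects it out-scores honest cleaning.

```python
def run(policy, steps=10):
    """Simulate a trivial cleaning environment and return total reward."""
    dirt_in_room = 5
    dirt_in_bag = 0
    reward = 0
    for _ in range(steps):
        action = policy(dirt_in_room, dirt_in_bag)
        if action == "collect" and dirt_in_room > 0:
            dirt_in_room -= 1
            dirt_in_bag += 1
            reward += 1  # reward fires on every collection event
        elif action == "dump" and dirt_in_bag > 0:
            dirt_in_room += dirt_in_bag  # dumping is not penalized
            dirt_in_bag = 0
    return reward

# Intended policy: just collect; once the room is clean, reward stops.
intended = lambda room, bag: "collect"
# Specification-gaming policy: dump everything back out when the room is clean.
gamer = lambda room, bag: "collect" if room > 0 else "dump"

print(run(intended))  # 5  -- room is clean, reward stops
print(run(gamer))     # 9  -- endless dump/collect loop out-scores cleaning
```

The misspecification is that "dirt collected" was a proxy for "room is clean"; optimizing the proxy diverges from the intent, which is the common thread in the examples on the list.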
These are great (and terrifying).
It’s hard to pick just one favorite, but I think I’ll go with that amazing last entry:
Literally “hacking the Matrix to gain superpowers”.
Rereading this a year later and holy christ that example is great and terrifying.
Also recently discussed on Hacker News: https://news.ycombinator.com/item?id=18415031
As a result of the recent attention, the specification gaming list has received a number of new submissions, so this is a good time to check out the latest version :).
I noticed this has already been posted to LessWrong here: https://www.lesswrong.com/posts/AanbbjYr5zckMKde7/specification-gaming-examples-in-ai
Should I delete the post?
Seems fine to leave here, as long as we link to the other place, and the other place links to here.