I built a public arena where people attack a “pro-human” steering direction
Epistemic status: This is a small empirical writeup from an independent project. The project had basically $0 budget, and all the code, seed pairs, directions, and results are public. I started with two hypotheses that I thought were pretty likely, but both of them failed after testing. The human-rated behavior...
Jun 141