Although I’ve read many posts, this is my first contribution to LessWrong, so here goes:
I’m working on a personal project to build a system for evaluating different AI policy proposals, but I realized that I don’t know how to choose which criteria to use, so I’m looking for a systematic way to go about this. In other words, I’m looking for “meta-criteria” that I could use to pick good criteria. (I realize that this could lead to an infinite regress, but I think it could be useful to go at least a few layers deeper, just to be a bit more confident in our choices.)
Are there any existing resources or papers that might be helpful?
I haven’t found exactly what I’m looking for so far, but I’m sure there’s something out there, since this question is relevant to many fields. I know that there’s been a lot of research done on picking optimization criteria (for AI safety, etc.), and I know that other scientific fields have developed methods to find good ways to measure things (like validity and reliability tests). Is there any literature that tries to attack these problems in a more abstract way?
If there isn’t established research on this, does anyone have ideas about how to approach selecting good criteria?
Any insights would help!