Value learning in the absence of ground truth
Epistemic Status: My best guess.

Summary

The problem of aligning AI is often phrased as the problem of aligning AI with human values, where "values" refers roughly to the normative considerations one leans upon in making decisions. Understood in this way, solving the problem would amount to first articulating the...
Could you clarify who you mean by "strong" agents? You refer to humans as weak agents at the start, yet later claim that AIs should have human inductive biases from the beginning, which makes me think humans are the strong ones.