It appears that in the last few years the AI Alignment community has devoted considerable attention to the Value Learning Problem [1]. In particular, the work of Stuart Armstrong stands out to me.

Concurrently, over the last decade, researchers such as Eyke Hüllermeier and Johannes Fürnkranz have produced a significant body of work on preference learning [2] and preference-based reinforcement learning [3].

While I am not deeply familiar with the Value Learning literature, the two fields strike me as closely related, if not overlapping; yet I rarely see Value Learning work reference the Preference Learning literature, or vice versa.
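To make concrete what I have in mind by "overlapping": both fields, as I understand them, involve inferring a hidden utility (or reward, or value) function from observed human choices. Below is a minimal sketch of a preference-learning setup in that spirit, fitting a latent linear utility from pairwise comparisons with a Bradley-Terry-style model; all names and data are illustrative, not taken from either literature.

```python
# Minimal, illustrative sketch: infer a hidden linear utility from
# pairwise preferences (Bradley-Terry-style), via gradient ascent on
# the log-likelihood. Not code from either literature.
import numpy as np

rng = np.random.default_rng(0)

# Items described by feature vectors; the "true" utility is hidden.
n_items, n_features = 20, 5
X = rng.normal(size=(n_items, n_features))
true_w = rng.normal(size=n_features)
true_utility = X @ true_w

# Simulated observations: pairs (i, j) meaning "i was preferred to j".
pairs = []
for _ in range(500):
    i, j = rng.choice(n_items, size=2, replace=False)
    p_i_wins = 1.0 / (1.0 + np.exp(-(true_utility[i] - true_utility[j])))
    if rng.random() < p_i_wins:
        pairs.append((i, j))
    else:
        pairs.append((j, i))

# Fit a linear utility w by maximizing the Bradley-Terry log-likelihood.
w = np.zeros(n_features)
lr = 0.5
for _ in range(200):
    grad = np.zeros_like(w)
    for i, j in pairs:
        diff = X[i] - X[j]
        p = 1.0 / (1.0 + np.exp(-diff @ w))  # P(i preferred to j | w)
        grad += (1.0 - p) * diff             # gradient of log-likelihood term
    w += lr * grad / len(pairs)

# The learned utilities should roughly track the hidden ones.
corr = np.corrcoef(X @ w, true_utility)[0, 1]
print(f"correlation between learned and hidden utilities: {corr:.2f}")
```

Replace "items" with trajectories or policies and this is, roughly, the setup of preference-based reinforcement learning; replace the hidden utility with human values and it looks, to me, like a value learning problem.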

Is this because the two fields are less related than I think? More specifically, how do the two fields relate to each other?


References

[1] - Soares, Nate. "The value learning problem." Machine Intelligence Research Institute, Berkeley (2015).

[2] - Fürnkranz, Johannes, and Eyke Hüllermeier. Preference learning. Springer US, 2010.

[3] - Fürnkranz, Johannes, et al. "Preference-based reinforcement learning: a formal framework and a policy iteration algorithm." Machine learning 89.1-2 (2012): 123-156.


Answers

Gordon Seidoh Worley

Jan 14, 2020


The short answer is that yes, they are related and basically about the same thing. However, the approaches of different researchers vary a lot.

Relevant considerations that come to mind:

  • The extent to which values/preferences are legible
  • The extent to which they are discoverable
  • The extent to which they are hidden variables
  • The extent to which they are normative
  • How important immediate implementability is
  • How important extreme optimization is
  • How important safety concerns are

The result, I think, is that there is something of a divide between safety-focused and capabilities-focused researchers in this area, due to different assumptions, which makes each cluster's work not very interesting or relevant to the other.

Interesting points. The distinctions you mention could equally apply in distinguishing narrow from ambitious value learning. In fact, I think preference learning is pretty much the same as narrow value learning. Could it be, then, that ambitious value learning researchers are not very interested in preference learning for much the same reasons they are not very interested in narrow value learning?

"How important safety concerns" is certainly right, but the story of science teaches us that taking something from a domain with different concerns to another domain has often proven extremely useful.