New Comment
3 comments, sorted by Click to highlight new comments since: Today at 5:26 PM

I recently interviewed with Epoch, and as part of a paid work trial they wanted me to write up a blog post about something interesting related to machine learning trends. This is what I came up with:

So, I had a thought.  The glory system idea that I posted about earlier, if it leads to a successful, vibrant democratic community forum, could actually serve as a kind of dataset for value learning.  If each post has a number attached to it that indicates the aggregated approval of human beings, this can serve as a rough proxy for a kind of utility or Coherent Aggregated Volition.

Given that individual examples will probably be quite noisy, but averaged across a large amount of posts, it could function as a real world dataset, with the post content being the input, and the post's vote tally being the output label.  You could then train a supervised learning classifier or regressor that could then be used to guide a Friendly AI model, like a trained conscience.

This admittedly would not be provably Friendly, but as a vector of attack for the value learning problem, it is relatively straightforward to implement and probably more feasible in the short-run than anything else I've encountered.

Another thought is that maybe Less Wrong itself, if it were to expand in size and become large enough to roughly represent humanity, could be used as such a dataset.