So, human values are fragile, vague and possibly not even a well defined concept, yet figuring it out seems essential for an aligned AI. It seems reasonable that, faced with a hard problem, one would start instead with a simpler one that has some connection to the original problem. For someone not working in the area of ML or AI alignment, it seems obvious that researching simpler-than-human values might be a way to make progress. But maybe this is one of those false obvious ideas that non-experts tend to push after a cursory learning about a complex research topic.
That said, assuming that the value complexity scales with intelligence, studying less intelligent agents and their version of values maybe something to pursue. Dolphin values. Monkey values. Dog values. Cat values. Fish values. Amoeba values. Sure, we lose the inside view in this case, but the trade-off seems at least being worthy of exploring. Is there any research going in that area?