“The we-intention Sellars regards as intrinsically valid ("It shall_we be the case that each of us rational beings so acts as to promote our welfare") embodies a particular conception of what is good: namely, the welfare of rational beings. How, though, to establish the superiority of this account of the good over the rational egoist's account? Sellars lays out a strategy for doing so but despairs of carrying this strategy through:
To have this intention is to think of oneself as a member of a community consisting of all rational beings. ...
144. If ... the f...
Have you looked at a Cohen's kappa-based system? Treat R1 and R2 as raters scoring rollouts. Weighted kappa lets you assign higher cost to disagreements on rollouts that matter more (high-stakes, high-update, etc.), which addresses your point about some rollouts being more critical for learning. And Fleiss's kappa generalizes to N reward functions if you want to measure agreement across an ensemble rather than just a pair.
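To make the suggestion concrete, here is a minimal sketch of Cohen's kappa with optional per-rollout weights. It assumes each reward function's output has already been bucketed into discrete labels (the "good"/"bad" labels, the `cohen_kappa` helper, and the example weights are all hypothetical, not from any particular library). One caveat: classic weighted kappa weights pairs of categories, so "rollouts that matter more" is modeled here as a per-item sample weight instead.

```python
from collections import defaultdict


def cohen_kappa(labels_a, labels_b, sample_weight=None):
    """Cohen's kappa for two raters, with optional per-item weights.

    labels_a, labels_b: discrete labels from each rater (here: two reward
    functions after bucketing, which is an assumption of this sketch).
    sample_weight: non-negative importance of each rollout; defaults to 1.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both raters must score the same rollouts")
    if sample_weight is None:
        sample_weight = [1.0] * len(labels_a)
    total = float(sum(sample_weight))

    agree = 0.0                   # weighted mass where the raters agree
    marg_a = defaultdict(float)   # weighted label counts for rater A
    marg_b = defaultdict(float)   # weighted label counts for rater B
    for a, b, w in zip(labels_a, labels_b, sample_weight):
        if a == b:
            agree += w
        marg_a[a] += w
        marg_b[b] += w

    p_observed = agree / total
    # Chance agreement: probability both raters pick the same label
    # if their labels were drawn independently from their marginals.
    p_expected = sum(marg_a[c] * marg_b[c] for c in list(marg_a)) / total**2
    if p_expected == 1.0:
        return float("nan")  # both raters always give one identical label
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical bucketed scores from R1 and R2 on six rollouts.
r1 = ["good", "good", "bad", "bad", "good", "bad"]
r2 = ["good", "bad", "bad", "bad", "good", "good"]

kappa_plain = cohen_kappa(r1, r2)
# Treat rollout 1 (a disagreement) as five times as important as the rest.
kappa_weighted = cohen_kappa(r1, r2, sample_weight=[1, 5, 1, 1, 1, 1])
```

In this toy case, upweighting a rollout on which the two reward functions disagree pulls kappa down relative to the unweighted value, which is the behavior the weighting is meant to capture. How to bucket continuous rewards into labels is left open, and the result will depend on that choice.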
The open question is whether κ(R_true, R_proxy) bounds policy regret in the way that EPIC/STARC metrics do for continuous rewards. My in...