Epistemic status: Working notes from three different people on the same question; likely useless or incomprehensible to anyone else.
How to find the right abstraction level of human values
We can learn human values by observing human actions and distilling them into a preference relation. This learned preference relation can overfit human values (e.g., humans want to raise their left arm by 2 cm on 2022-05-07 if they’re in some specific place) or it can underfit them (e.g., humans care only about maximizing money). If our preference relation overfits, we expect it to miss some known biases, e.g. the Allais Paradox, presumably because every observed choice gets explained as its own specific preference and nothing is left over to register as an inconsistency. There are also inconsistencies that are “too abstract” and ones that are “too concrete”:
Independent of effects on GDP, the internet (Nasdaq-100) has still strongly outperformed the overall US stock market (S&P 500).
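To make the Allais Paradox mentioned above concrete, here is a minimal sketch of why it counts as a bias relative to an expected-utility preference relation. The payoffs and probabilities are the standard textbook version of the paradox, which these notes do not spell out, so treat them as an assumption. All names (`GAMBLE_1A`, `expected_utility`, `search_grid`, and so on) are hypothetical and chosen for illustration. The sketch checks that no assignment of utilities to the three outcomes rationalizes the commonly observed pair of choices (1A over 1B, and 2B over 2A).

```python
from fractions import Fraction as F
from itertools import product

# Assumption: the textbook Allais gambles, as {payoff in $ millions: probability}.
GAMBLE_1A = {1: F(1)}
GAMBLE_1B = {0: F(1, 100), 1: F(89, 100), 5: F(10, 100)}
GAMBLE_2A = {0: F(89, 100), 1: F(11, 100)}
GAMBLE_2B = {0: F(90, 100), 5: F(10, 100)}


def expected_utility(lottery, utility):
    """Expected utility of a lottery under a payoff -> utility mapping."""
    return sum(p * utility[x] for x, p in lottery.items())


def preference_gaps(utility):
    """Return (EU(1A) - EU(1B), EU(2A) - EU(2B)) for the given utilities."""
    gap_1 = expected_utility(GAMBLE_1A, utility) - expected_utility(GAMBLE_1B, utility)
    gap_2 = expected_utility(GAMBLE_2A, utility) - expected_utility(GAMBLE_2B, utility)
    return gap_1, gap_2


def rationalizes_allais(utility):
    """True if `utility` strictly prefers 1A to 1B and also 2B to 2A."""
    gap_1, gap_2 = preference_gaps(utility)
    return gap_1 > 0 and gap_2 < 0


def search_grid(levels=range(0, 21)):
    """Brute-force every utility assignment on a small integer grid."""
    return [
        (u0, u1, u5)
        for u0, u1, u5 in product(levels, repeat=3)
        if rationalizes_allais({0: F(u0), 1: F(u1), 5: F(u5)})
    ]
```

The grid search comes back empty because the two gaps are algebraically identical: both equal 0.11·u(1) − 0.01·u(0) − 0.10·u(5). A preference relation constrained to expected-utility form therefore has to flag the common choice pattern as an inconsistency. An overfit relation that stores each choice as its own context-specific preference would never notice the conflict.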