From "Coulda" and "Woulda" to "Shoulda": Predicting Decisions to Minimize Regret for Partially Rational Agents

(idle bemusement)

Does an optimal superintelligence regret? It knows it couldn't have made a better choice given its past information about the environment. How is regret useful in that case?

[-][anonymous]12y00

An optimal superintelligence has a regret probability of $\epsilon$, and regret is not actually useful to it. This regret construction is meant to yield loss functions for strictly non-optimal agents.

[-]torekp12y00

I think I like your idea better than you do. I'm not convinced of the irrationality of not having a standard utility function such as dictated by the VNM axioms. It's not just a bad description of humans, I think; it's a non-binding norm. I'm not sure exactly which axiom(s) are the troublemakers, but Seidenfeld has an interesting discussion.

So I welcome an approach with weaker standards.

Edit: Ha, I picked a fine time to say this.

[-]Punoxysm12y00

I think this is concretely useful (and explored) as a psychological device to counter hyperbolic discounting. Think of what your future self will think about your action. Sometimes this is very useful, sometimes not.

There are also very formal, statistical methods for minimizing a mathematical quantity called regret (usually in the context of choosing between the advice of different oracles), that measures the difference in utility between taking the advice of the best overall oracle and what you actually did. The results on this aren't strong enough to, say, help you beat the stock market, but could probably be applied to certain types of self-tracking.
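
For anyone who wants to see that notion of regret concretely, here is a minimal sketch of one standard method of this kind, the Hedge (exponential weights) forecaster. Nothing in it comes from the original post: the function name `hedge`, the three oracles, and their error rates are made up for illustration, and losses are assumed to lie in [0, 1]. Regret here is the learner's expected cumulative loss minus the cumulative loss of the best single oracle in hindsight.

```python
import math
import random


def hedge(losses, eta):
    """Run the Hedge (exponential weights) forecaster.

    losses[t][i] is the loss in [0, 1] of oracle i at round t.
    Returns (learner's expected cumulative loss,
             cumulative loss of the best single oracle in hindsight).
    """
    n = len(losses[0])
    weights = [1.0] * n          # start by trusting every oracle equally
    cumulative = [0.0] * n       # running loss of each oracle
    learner_loss = 0.0
    for round_losses in losses:
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected loss when following oracle i with probability probs[i].
        learner_loss += sum(p * l for p, l in zip(probs, round_losses))
        for i, l in enumerate(round_losses):
            cumulative[i] += l
            weights[i] *= math.exp(-eta * l)   # down-weight oracles that erred
    return learner_loss, min(cumulative)


if __name__ == "__main__":
    random.seed(0)
    T, n = 1000, 3
    error_rates = [0.5, 0.4, 0.6]   # hypothetical oracles, illustration only
    losses = [[1.0 if random.random() < r else 0.0 for r in error_rates]
              for _ in range(T)]
    eta = math.sqrt(8 * math.log(n) / T)       # standard tuning for known T
    learner, best = hedge(losses, eta)
    bound = math.sqrt(T * math.log(n) / 2)     # worst-case regret guarantee
    print(f"regret = {learner - best:.2f}, bound = {bound:.2f}")
```

With this choice of `eta`, the regret is guaranteed to stay below sqrt(T ln(n) / 2) for any loss sequence, so the average per-round regret shrinks toward zero. That guarantee is relative to the best oracle you happened to have, which is why it says nothing about beating a market where every oracle is bad.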

In cases with defined and controlled risk, e.g. investing, you should be prepared to not regret a loss if you believe your model was correct.

But in more general life decisions, where bias is much more prevalent, regret can be useful.

Holistically, the main message is "reflect on your actual experience and what you actually know about your preference, and not what you hope the future will be like or what you wish your own preferences were like".

[-][anonymous]12y10

There are also very formal, statistical methods for minimizing a mathematical quantity called regret (usually in the context of choosing between the advice of different oracles), that measures the difference in utility between taking the advice of the best overall oracle and what you actually did.

Yes, I know. The actual point of the post was to provide a gedanken-experiment for oracle construction that doesn't rely on a theory of value.

CEV and Railton's moral realism both basically say, "I should do what my ideal adviser would say to do in my place". This makes perfect sense, except that actually trying to construct an ideal adviser involves giving him perfect instrumental rationality, which - in the common definition of rationality as regret minimization or utility maximization - involves a theory of value, which is exactly what you were constructing in the first place.

So what I was doing here was trying to reduce those two theories one step closer to reality by saying, well, we still need to do some godawful thought-experiment, but we can do it in a way that doesn't involve circular logic.

Any usefulness as an actual heuristic for motivating oneself to real action is completely coincidental, but pretty definitely existent.
