Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

The problem with that is that it's Evidential Decision Theory, so it will do poorly in situations like the smoking lesion. Also, it's troubling that, as you observed, it doesn't seem to correspond to maximizing a natural score. See also my old essay that defines a CDT-ish and an EDT-ish "physicalist" score, at the cost of introducing dependence on a certain parameter. More elegant solutions might be possible using logical uncertainty: see this for a (somewhat outdated) attempt and this for one way some of the relevant building blocks can be formalized (I guess that using Garrabrant's logical inductors is a natural alternative approach).
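To make the smoking-lesion objection concrete, here is a toy numerical sketch (all probabilities and utilities are illustrative assumptions, not anything from the post): a hidden lesion causes both a disposition to smoke and cancer, while smoking itself has no causal effect on cancer. EDT conditions on the action as evidence about the lesion and so refuses the free utility of smoking; a CDT-style intervention ignores the spurious correlation and smokes.

```python
# Toy smoking-lesion model (all numbers are illustrative assumptions).
# A hidden lesion causes both a taste for smoking and cancer;
# smoking itself has no causal effect on cancer.

P_LESION = 0.5
P_SMOKE_GIVEN_LESION = 0.9      # lesion makes smoking much more likely
P_SMOKE_GIVEN_NO_LESION = 0.1
U_SMOKE = 10                    # enjoyment of smoking
U_CANCER = -100                 # cancer occurs iff lesion, regardless of smoking

def edt_value(smoke: bool) -> float:
    """EDT: condition on the action as evidence about the lesion."""
    p_smoke = (P_LESION * P_SMOKE_GIVEN_LESION
               + (1 - P_LESION) * P_SMOKE_GIVEN_NO_LESION)
    p_act = p_smoke if smoke else 1 - p_smoke
    p_act_given_lesion = P_SMOKE_GIVEN_LESION if smoke else 1 - P_SMOKE_GIVEN_LESION
    p_lesion_given_act = P_LESION * p_act_given_lesion / p_act  # Bayes
    return (U_SMOKE if smoke else 0) + U_CANCER * p_lesion_given_act

def cdt_value(smoke: bool) -> float:
    """CDT-style intervention: acting doesn't change the lesion's prior."""
    return (U_SMOKE if smoke else 0) + U_CANCER * P_LESION

# EDT treats smoking as bad news about the lesion and abstains;
# CDT smokes, since smoking doesn't cause cancer in this model.
print(edt_value(True), edt_value(False))   # -80.0 -10.0
print(cdt_value(True), cdt_value(False))   # -40.0 -50.0
```

With these numbers EDT prefers not smoking (-10 vs -80) even though the intervention calculation shows smoking is strictly better (-40 vs -50), which is the standard sense in which EDT "does poorly" here.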