Jessica Taylor. CS undergrad and Master's at Stanford; former research fellow at MIRI.
I work on decision theory, social epistemology, strategy, naturalized agency, mathematical foundations, decentralized networking systems and applications, theory of mind, and functional programming languages.
Blog: unstableontology.com
Twitter: https://twitter.com/jessi_cata
I analyzed a general class of these problems here. Upshot: every optimal UDT solution is also an optimal CDT+SIA solution, but not vice versa.
If some function g is computable in O(f(n)) time for primitive recursive f, then g is primitive recursive, by simulating a Turing machine. I am pretty sure a logical inductor would satisfy this: while it runs in super-exponential time, its runtime is not so fast-growing as to fail to be primitive recursive (unlike, say, the Ackermann function).
Oh, to be clear, I do think that AI safety automation is a well-targeted x-risk effort conditional on the AI timelines you are presenting. (Related to Paul Christiano's alignment ideas, which are important conditional on prosaic AI.)
On EV grounds, "2/3 chance it's irrelevant because of AGI in the next 20 years" is not a huge contributor to the EV of this. Because, ok, maybe it reduces the EV by 3x compared to what it would otherwise have been. But there are factors much bigger than 3x that are relevant, such as probability of success, magnitude of success, and cost-effectiveness.
Then you can take the overall cost-effectiveness estimate (by combining various factors, including the probability it's irrelevant due to AGI being too soon) and compare it to other interventions. Here, you're not offering a specific alternative that is expected to pay off in worlds with AGI in the next 20 years. So it's unclear how "it might be irrelevant if AGI is in the next 20 years" is all that relevant as a consideration.
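To illustrate the multiplicative-factors point, here is a minimal sketch; every number in it is a hypothetical placeholder, not an estimate I'm making:

```python
# Toy EV calculation: the "irrelevant due to AGI soon" discount is one factor
# of roughly 3x, while the other factors plausibly vary by orders of magnitude.
# All numbers below are hypothetical placeholders.
p_still_relevant = 1 / 3     # survives the "AGI within 20 years" scenario
p_success = 0.05             # chance the intervention works at all
value_if_success = 1e9       # payoff in arbitrary units
cost = 1e6                   # cost in the same units

ev_per_unit_cost = p_still_relevant * p_success * value_if_success / cost
print(ev_per_unit_cost)      # ~16.7 here; swings in p_success or value dominate the 3x
```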
Wasn't familiar. Seems similar in that facts/values are entangled. I was more familiar with Cuneo for that.
Dunno; a gym membership also feels like a form of blackmail (although preferable to the alternative forms of blackmail), while a home gym reduces the inconvenience of exercising.
I'm not sure what differentiates these in your mind. They both reduce the inconvenience of exercising, presumably? Also, in my post I'm pretty clear that it's not meant as a punishment type incentive:
And it’s prudent to take into account the chance of not exercising in the future, making the investment useless: my advised decision process counts this as a negative, not a useful self-motivating punishment.
...
Generally, it seems like the problem is signaling. You buy the gym membership to signal your strong commitment to yourself. Then you feel good about sending a strong signal. And then the next day you feel just as lazy as previously, and the fact that you already paid for the membership probably feels bad.
That's part of why I'm thinking an important step is checking whether one expects the action to happen if the initial steps are taken. If not, then it's less likely to be a good idea.
The signaling/hyperstition does serve some positive function, but it can lead people to be unnecessarily miscalibrated.
Okay, I don't think I was disagreeing except in cases of very light satisficer-type self-commitments. Maybe you didn't intend to express disagreement with the post, idk.
So far I don't see evidence that any LessWrong commentator has read the post or understood the main point.
Set of states: {X, Y, XE, YE, YA}
Set of actions: {Advance, Exit}
Set of observations: {""}
Initial state: X
Transition function for non-terminal states:
t(X, Advance) = Y
t(X, Exit) = XE
t(Y, Advance) = YA
t(Y, Exit) = YE
Terminal states: {XE, YE, YA}
Utility function:
u(XE) = 0
u(YE) = 4
u(YA) = 1
The policy can be parameterized by p, the probability of exiting.
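For concreteness, here is the same MDP and policy written out as a short Python sketch (my own encoding of the spec above; the names `transition`, `utility`, and `value` are not from the post):

```python
# The MDP above: non-terminal states X, Y; terminal states XE, YE, YA.
transition = {
    ("X", "Advance"): "Y",
    ("X", "Exit"): "XE",
    ("Y", "Advance"): "YA",
    ("Y", "Exit"): "YE",
}
utility = {"XE": 0.0, "YE": 4.0, "YA": 1.0}  # terminal utilities
terminal = set(utility)

def value(state, p):
    """Expected utility from `state` under the policy that Exits with probability p."""
    if state in terminal:
        return utility[state]
    return (p * value(transition[(state, "Exit")], p)
            + (1 - p) * value(transition[(state, "Advance")], p))

print(value("X", 1/3))  # about 1.333; compare with the closed forms derived below
```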
Frequencies for non-terminal states (un-normalized):
$F_p(X) = 1$
$F_p(Y) = 1 - p$
SIA un-normalized probabilities:
$\mathrm{SIA}_p(X \mid \text{""}) = 1$
$\mathrm{SIA}_p(Y \mid \text{""}) = 1 - p$
Note we have only one possible observation, so SIA un-normalized probabilities match frequencies.
State values for non-terminals:
$V_p(Y) = 4p + 1 - p = 3p + 1$
$V_p(X) = (1 - p) V_p(Y) = (1 - p)(3p + 1) = -3p^2 + 2p + 1$
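A quick symbolic check of that algebra (sympy here is just my choice of verification tool, not something from the post):

```python
import sympy as sp

p = sp.symbols("p")
V_Y = 4 * p + (1 - p) * 1    # exit from Y w.p. p for utility 4, else advance for 1
V_X = p * 0 + (1 - p) * V_Y  # exit from X w.p. p for utility 0, else move on to Y
print(sp.simplify(V_Y))      # 3*p + 1
print(sp.expand(V_X))        # -3*p**2 + 2*p + 1
```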
Q values for non-terminals:
$Q_p(Y, \text{Exit}) = 4$
$Q_p(Y, \text{Advance}) = 1$
$Q_p(X, \text{Exit}) = 0$
$Q_p(X, \text{Advance}) = V_p(Y) = 3p + 1$
For local optimality we compute partial derivatives.
$d_p(\text{""}, \text{Exit}, \text{Advance}, Y) = Q_p(Y, \text{Exit}) - Q_p(Y, \text{Advance}) = 3$
$d_p(\text{""}, \text{Exit}, \text{Advance}, X) = (1 - p) \cdot d_p(\text{""}, \text{Exit}, \text{Advance}, Y) + Q_p(X, \text{Exit}) - Q_p(X, \text{Advance}) = 3(1 - p) - (3p + 1) = 2 - 6p$
By the post, this can be equivalently written:
$d_p(\text{""}, \text{Exit}, \text{Advance}, X) = F_p(X)\big(Q_p(X, \text{Exit}) - Q_p(X, \text{Advance})\big) + F_p(Y)\big(Q_p(Y, \text{Exit}) - Q_p(Y, \text{Advance})\big) = -(3p + 1) + (1 - p)(4 - 1) = -3p - 1 + 3 - 3p = 2 - 6p$
To optimize we set $2 - 6p = 0$, i.e. $p = 1/3$. Remember $p$ is the probability of exit (sorry, it's reversed from your usage!). This matches what you computed as globally optimal.
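A brute-force sanity check on the maximizer (grid search over exit probabilities, using the closed form for $V_p(X)$ derived above):

```python
# Maximize V_p(X) = -3p^2 + 2p + 1 over a grid of exit probabilities in [0, 1].
candidates = [i / 1000 for i in range(1001)]
best_p = max(candidates, key=lambda q: -3 * q**2 + 2 * q + 1)
print(best_p)                            # 0.333
print(-3 * best_p**2 + 2 * best_p + 1)   # ~1.333, matching V_{1/3}(X) = 4/3
```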
I'm not going to do this for all of the examples... Is there a specific example where you think the theorem fails?