I operate by Crocker's rules. All LLM output is explicitly designated as such. I have made no self-hiding agreements.
My understanding is that UDASSA doesn't give you unbounded utility, by virtue of directly assigning measure $2^{-K(x)}$ to each observer-moment $x$, so that the sum of utilities is proportional to $\sum_x 2^{-K(x)}\,U(x)$. The whole dance I did was in order to be able to have unbounded utilities. (Maybe you don't care about unbounded utilities, in which case UDASSA seems like a fine choice.)
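(Spelling out my reading, which, per the second edit below, the canonical posts don't actually pin down: if each observer-moment $x$ gets measure roughly $2^{-K(x)}$ and the per-moment utility $U(x)$ is bounded by some $U_{\max}$, then
$$\sum_x 2^{-K(x)}\,U(x) \;\le\; U_{\max}\sum_x 2^{-K(x)} \;\le\; U_{\max},$$
since $\sum_x 2^{-K(x)} \le 1$. So total utility is bounded, and unboundedness would have to come from letting the per-moment utility itself be unbounded.)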
(I think that the other horn of de Blanc's proof is satisfied by UDASSA, unless the proportion of non-halting programs bucketed by simplicity declines faster than any computable function. Do we know this? "Claude!…")
Edit: Claude made up plausible nonsense, but GPT-5 upon request was correct: the proportion of halting programs declines more slowly than some computable functions.
Edit 2: Upon some further searching (and soul-searching) I think UDASSA is currently underspecified wrt whether its utility is bounded or unbounded. For example, the canonical explanation doesn't mention utility at all, and none of the other posts about it mention how exactly utility is defined.
Makes sense, but in that case, why penalize by time? Why not just directly penalize by utility? Like the leverage prior.
Huh. I find the post confusingly presented, but if I understand correctly, 15 logical inductor points to Yudkowsky₂₀₁₃—I think I invented the same concept from second principles.
Let me summarize, to check my understanding: my speed prior on both the hypotheses and the utility functions is trying to emulate just discounting utility directly (because, in the case of binary tapes and integers, penalizing both exponentially in runtime gets you exactly an upper bound on the utility), and a cleaner way is to set the prior to the complexity prior divided by the utility itself. That avoids the "how do we encode numbers" question that naturally raises itself.
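Spelling that out (my notation and my guard against small utilities, so read it as a sketch): write $\ell(h)$ for the length of hypothesis $h$ and $U(h)$ for the utility of its output, and take the unnormalized prior $P(h) = \frac{2^{-\ell(h)}}{\max(1,\,U(h))}$. Then
$$\sum_h P(h)\,U(h) \;\le\; \sum_h 2^{-\ell(h)} \;\le\; 1$$
by Kraft's inequality for a prefix-free encoding of programs, so expected utility is bounded on the prior without ever asking how the output tape encodes a number.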
Does that sound right?
(The fact that I reinvented this looks like a good thing, since that indicates it's a natural way out of the dilemma.)
I think the upper bound here is set by a program "walking" along the tape as far as possible while setting the tape to $1$ and then setting a last bit before halting (thus creating the binary number $\underbrace{11\dots1}_{t} = 2^{t}-1$, where $t$ is the number of steps taken[1]). If we interpret that number as a utility, the utility is exponential in the number of steps taken, which is why we need to penalize by $2^{-t}$ instead of just $\frac{1}{t}$[2] (I spell out the resulting bound below the footnotes). If you want to write $m$ on the tape you have to make at least $\log_2(m)$ steps on a binary tape (and $\log_n(m)$ on an n-ary tape).
Technically the upper bound is $\Sigma$, the busy beaver score function. ↩︎
Thanks to GPT-5 for this point. ↩︎
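To spell out the bound (my notation: $\ell(h)$ for program length, $t(h)$ for runtime, $U(h)$ for the output read as a binary number): a program that halts after $t(h)$ steps has written at most $t(h)$ bits, so $U(h) < 2^{t(h)}$, and with the unnormalized weights $2^{-(\ell(h)+t(h))}$,
$$\sum_h 2^{-(\ell(h)+t(h))}\,U(h) \;\le\; \sum_h 2^{-\ell(h)} \;\le\; 1$$
by Kraft's inequality, whereas a merely logarithmic runtime penalty leaves terms as large as $2^{-\ell(h)}\cdot\frac{2^{t(h)}-1}{t(h)}$, which blow up with $t(h)$.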
epistemic status: Going out on a limb and claiming to have solved an open problem in decision theory[1] by making some strange moves. Trying to leverage Cunningham's law. Hastily written.
p(the following is a solution to Pascal's mugging in the relevant sense)≈25%[2].
Okay, the setting (also here in more detail): You have a Solomonoff inductor with some universal semimeasure as a prior. The issue is that the utility of programs can grow faster than your universal semimeasure can penalize them, e.g. a complexity prior has busy-beaver-like programs that produce $\operatorname{BB}(n)$ amounts of utility with a program of length $n$, while only being penalized by $2^{-n}$. The more general results are de Blanc 2007, de Blanc 2009 (LW discussion on the papers from 2007). We get this kind of divergence of expected utility on the prior if the utility function is unbounded and the prior is bounded from below by a computable function.
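To make the busy-beaver example concrete (my convention: $\operatorname{BB}(n)$ is the largest number output by any halting program of length at most $n$): by definition some halting program of length at most $n$ outputs $\operatorname{BB}(n)$, so under a complexity prior the expected utility contains terms of size at least
$$2^{-n}\,\operatorname{BB}(n) \;\ge\; 2^{n} \quad\text{for all sufficiently large } n,$$
since $\operatorname{BB}$ eventually dominates every computable function, in particular $n \mapsto 4^{n}$. So the sum defining the expected utility diverges.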
The next line of attack is to use the speed prior as the prior. That prior is not bounded from below by a computable function (because a program of length $n$ can run longer than any computable function of $n$ before halting, so its weight can drop below $\frac{2^{-n}}{f(n)}$ for any computable $f$), so we escape through one of de Blanc's horns. (I don't think having a computable lower bound is that important because K-complexity was never computable in the first place.)
But there's an issue: what if our hypotheses output strings that are short, but are evaluated by our utility function as being high-value anyway? That is, the utility function takes in some short string of length $n$ and outputs $\operatorname{BB}(n)$ as its utility. This can happen if the utility function itself is a program with some computational power (in the most extreme case the utility function is Turing-complete), and our hypotheses "parasitize" on this computational power of our utility function to pull off a Pascal's mugging. So what we have to do is to also count the computation of our utility function as part of what's penalized by the prior. That is,
$$P(h) \;\propto\; 2^{-\left(\ell(h) \,+\, t(h) \,+\, t_U(h)\right)},$$
for $\ell(h)$ the length of the hypothesis $h$, $t(h)$ its runtime, and $t_U(h)$ the time it takes to run the utility function on the output of $h$. I'll call this the "reflective speed prior". Note that if you don't have an insane utility function which is Turing-complete, the speed penalty for evaluating the output of $h$ should be fairly low most of the time.
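Sketch of why this closes off the parasitism route (assuming, as above, that the utility function writes its value in binary, so a $t_U(h)$-step evaluation can output at most $t_U(h)$ bits): then $U(h) < 2^{t_U(h)}$ no matter how short the hypothesis's output string is, and
$$\sum_h 2^{-(\ell(h)+t(h)+t_U(h))}\,U(h) \;\le\; \sum_h 2^{-(\ell(h)+t(h))} \;\le\; \sum_h 2^{-\ell(h)} \;\le\; 1,$$
again by Kraft's inequality. A Turing-complete utility function can still compute astronomical utilities, but only by paying for them in $t_U(h)$.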
Pascal's mugging can be thought of in two parts:
1. The expected utility already diverging (or being dominated by tiny-probability, astronomically-high-utility hypotheses) on the prior.
2. Someone who controls the claims or evidence you see exploiting that structure to actually extract resources from you.
I claim that the reflective speed prior solves 1., but not 2. Furthermore, and this is the important thing, if you use the reflective speed prior, the expected utility is bounded on priors, but you can have arbitrarily high maximal expected utilities after performing Bayesian updating. So you get all the good aspects of having unbounded utility without having to worry about actually getting mugged (well, unless you have something controlling the evidence you observe, which is its own issue).
(Next steps: Reading the two de Blanc papers carefully, trying to suss out a proof, writing up the argument in more detail. Thinking/arguing about what it means to update your prior in this strange way, and specifically about penalizing hypotheses by how long it takes your utility function to evaluate them. Figuring out which of these principles are violated. (On an initial read: Definitely Anti-Timidity.) Changing one's prior in a "superupdate" has been discussed here and here.)
Edit: Changed from penalizing the logarithm of the runtime and the utility-runtime to penalizing them linearly, after feedback from GPT-5.
Best I can tell, the risk of psychosis is much higher with Goenka-style retreats, although I don't have hard numbers, only anecdotal evidence and theory suggesting it should be more common.
My experience has been that nothing short of long, intensive retreats moves my mind out of its default attractor state, that I probably waited too long to do long retreats, and that all the advice points in the opposite direction.
I mention this because all the talk of the downsides of meditation has me thinking of a tweet that goes roughly like "why do both republicans and democrats pretend HRT does anything". Goenka retreats have medium-strength effects on me, and an intensive one-month retreat at home had a decent effect. I may be doing something wrong.
Yup, that's correct if I remember the sources correctly. I guess the tone surrounding it doesn't match that particular bit of content. I should also turn the pledged/received numbers into a table for easier reading.
Yup, it's a regionalism that I mis-/over-generalized. I'll avoid it from now on.
It is, and it's the thing I'd most like Smil to read if I could recommend something to him.
I'm revisiting this post after listening to this section of this recent podcast with Holden Karnofsky.
Seems like this post was overly optimistic about what RSPs would be able to enforce, and not quite clear on the different scenarios for what "RSP" could refer to. Specifically, the post equivocated between "RSP as a regulation that gets put into place" and "RSP as a voluntary commitment": we got the latter, but not really the former (except maybe in the form of the EU Codes of Practice).
Even at Anthropic, the way the RSP is put into practice now basically excludes a scaling pause from the picture:
Interview:
and
Furthermore, what apparently happens now is that really difficult commitments either don't get made or get walked back on:
Interview:
and
I guess the unwillingness of the government to turn RSPs into regulation is what ultimately blocked this. (Though maybe today even a US-centric RSP-like regulation would be considered "not that useful" because of geopolitical competition.) We got RSP-like voluntary commitments from a surprising number of AI companies (so good job on predicting the future on this one), but those didn't get turned into regulation.