
niplav

I operate by Crocker's rules. All LLM output is explicitly designated as such. I have made no self-hiding agreements.

Website.

Comments (sorted by newest)
Wei Dai's Shortform
niplav · 2d

"Realizing that there are probably universes with vastly greater computational resources than ours, implying there are more simulations containing me than I had thought."

What made you believe that?

I find it hard to even conceptualize how to think through something like that, including the anthropics, which computationally powerful universes to admit, &c.

My intuition is that allowing universes with hypercomputation puts us in a dovetailer being run almost surely somewhere in the most computationally powerful universes, but that this all introduces a ton of difficulties into reasoning about the multiverse and our position inside of it.
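For readers who haven't seen the term: a dovetailer is just a scheduler that interleaves the execution of every program, so each one eventually gets arbitrarily many steps. A minimal sketch, with toy counter "programs" standing in for real ones (a real dovetailer would step a universal Turing machine on every possible input program, which this does not attempt):

```python
from itertools import count

def make_program(k):
    """Toy stand-in for "program number k": an infinite stream of its outputs.

    A real dovetailer would instead step a universal machine on the k-th
    input program; this only illustrates the scheduling.
    """
    return (k * i for i in count())

def dovetail(rounds):
    """In round n, admit program n and give each admitted program one more step.

    Over unboundedly many rounds, every program receives unboundedly many
    steps, even though there are infinitely many programs in total.
    """
    programs, trace = [], []
    for n in range(1, rounds + 1):
        programs.append(make_program(n))
        for k, prog in enumerate(programs, start=1):
            trace.append((n, k, next(prog)))  # (round, program index, its next output)
    return trace

if __name__ == "__main__":
    for step in dovetail(4):
        print(step)
```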

Alex_Altair's Shortform
niplav · 2d

You may be interested in this article and its successors, which look at a specific type of commutative hyperoperator.
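For a rough idea of what's meant (this is one standard construction of commutative hyperoperations, not necessarily the exact variant the linked article analyzes): start from addition and conjugate each level by exp/log, which keeps every level commutative and associative.

```python
import math

def commutative_hyperop(n, a, b):
    """Commutative hyperoperations built by exp/log conjugation:
    level 0 is addition, and each higher level applies the previous
    level to the logarithms and exponentiates the result."""
    if n == 0:
        return a + b
    return math.exp(commutative_hyperop(n - 1, math.log(a), math.log(b)))

print(commutative_hyperop(1, 3.0, 4.0))  # level 1 recovers multiplication: 12.0
print(commutative_hyperop(2, 3.0, 4.0))  # level 2 is exp(ln(a) * ln(b)) ≈ 4.59
```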

shortplav
niplav · 4d

Prediction: Superpersuasion in the relevant sense¹ is possible, and people who claim it isn't will look silly in retrospect if we are around to test it.

Human brains have insane data throughput, a large number of degrees of freedom, and no history of optimization against superintelligent adversaries. If we can't solve adversarial examples for neural nets, then evolution definitely didn't solve them for humans.

Reasons this might not work are heterogeneous brain structure between humans and too much noise. But humans do have some shared structure (how to quantify this is tricky; according to Claude, even a bunch of cortical fine structure is shared), and there are still enough degrees of freedom here for vast optimization. Noise is harder to defeat, but if an AI gets a decent feedback-loop speed then it may not matter as much. Still, noise is plausibly the hardest bottleneck to overcome.

I don't think the conscious-processing bottleneck is a problem, because most action-relevant stuff doesn't actually go through the central part of the global workspace.

Also worth distinguishing data throughput (video > audio ≫ text) from speed of feedback (same modality ranking again).

Relevant thought experiment: could we create an infohazardous image or audio clip by doing gradient descent on a latent space with direct biofeedback? MKUltra is relevant, but it ran from the 50s to the 70s, so its science & tech were also bad.
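As a sketch of what that loop could look like: both `decode_latent` and `read_biofeedback` below are hypothetical stand-ins (toy versions so the snippet runs), and since biofeedback only yields a scalar score rather than gradients, this uses an evolution-strategies update rather than literal gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64
_target = rng.normal(size=DIM)  # toy stand-in for "whatever pattern the response tracks"

def decode_latent(z):
    """Hypothetical: render a stimulus (image/audio) from a latent vector.
    Toy version: the identity map."""
    return z

def read_biofeedback(stimulus):
    """Hypothetical: present the stimulus and read back a scalar physiological
    response. Toy version: negative distance to a fixed hidden target."""
    return -np.sum((stimulus - _target) ** 2)

def optimize_stimulus(steps=300, pop=32, sigma=0.1, lr=0.05):
    """Evolution-strategies loop: perturb the latent, score each perturbation
    via biofeedback, and move the latent toward higher-scoring perturbations."""
    z = np.zeros(DIM)
    for _ in range(steps):
        noise = rng.normal(size=(pop, DIM))
        scores = np.array([read_biofeedback(decode_latent(z + sigma * n)) for n in noise])
        scores = (scores - scores.mean()) / (scores.std() + 1e-8)  # normalize fitness
        z += lr / (pop * sigma) * noise.T @ scores                 # ES gradient estimate
    return z

z = optimize_stimulus()
print("final score:", read_biofeedback(decode_latent(z)))  # improves toward 0 as the loop runs
```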

Think of it from a control theory perspective: you can't predict a game of pinball.

¹: Namely, an advanced AI can convince the leadership of the company that develops it to perform ~any action, irrespective of their beliefs. Predicated on some sort of multimodal I/O, or very long continuous interactions between the AI and the lab personnel (with some memory bank).

Alexander Gietelink Oldenziel's Shortform
niplav · 10d

Possible synthesis (not including the newest models):

People Seem Funny In The Head About Subtle Signals
niplav · 12d

Hm, I am unsure how much to believe this, even though my intuitions go the same way as yours. As a correlational datapoint, I tracked my success from cold approach and the time I've spent meditating (including a 2-month period of usually ~2 hours of meditation/day), and I don't see any measurable improvement in my success rate from cold approach.

(Note that the linked analysis also includes a linear regression with slope -6.35e-08, but with p=0.936, so it could be random.)
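For anyone wanting to run the same kind of check: a minimal version of that regression (the data here is synthetic and the column names are made up for illustration; the real log would be read from a file instead).

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for the kind of log described above: one row per cold
# approach, with cumulative meditation minutes before that approach and a
# 0/1 success flag.
rng = np.random.default_rng(0)
n = 500
log = pd.DataFrame({
    "cum_meditation_minutes": np.sort(rng.uniform(0, 20_000, size=n)),
    "success": rng.binomial(1, 0.05, size=n),  # success rate independent of meditation
})

res = stats.linregress(log["cum_meditation_minutes"], log["success"].astype(float))
print(f"slope={res.slope:.3g}, p={res.pvalue:.3f}")
# With no real relationship, the slope comes out near zero with a large p-value,
# which is the shape of the result reported above.
```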

In cases where meditation does stuff to your vibe-reading of other people, I would guess that I'd approach women who are more open to being approached. I haven't dug deeper into my fairly rich data on this, and the data doesn't include many post-retreat approaches, but I still find the data I currently have instructive.

I wish more people tracked and analyzed this kind of data, but I seem to be alone in this so far. I do feel some annoyance at everyone (the, ah, "cool people"?) in this area making big claims (and sometimes money off of those claims) without even trying to track any data and analyze it, leaving it basically to me to scramble together some DataFrames and effect sizes next to my dayjob.[1]

So, as an experiment, start meditating for an hour a day for 3 months using The Mind Illuminated (picking up some of the cool skills mentioned in Kaj Sotala's sequence?), and see what happens?

Do you have any concrete measurable predictions for what would happen in that case?


  1. I often wonder if empiricism is just incredibly unintuitive for humans in general, and experimentation and measurement even more so. Outside the laboratory very few people do it; see e.g. Aristotle's claims about the number of women's teeth, or his theory of ballistics, which went un(con)tested for almost 2000 years. What is going on here? Is empiricism really that hard? Is it about what people bother to look at? Is making shit up just so much easier that everyone stays in that mode, which is a stable equilibrium? ↩︎

Heroic Responsibility
niplav · 13d

Reminds me of one of my favourite essays, Software engineers solve problems (Drew DeVault, 2020).

RSPs are pauses done right
niplav · 18d

I'm revisiting this post after listening to this section of this recent podcast with Holden Karnofsky.

Seems like this post was overly optimistic about what RSPs would be able to enforce, and not quite clear on the different scenarios for what "RSP" could refer to. Specifically, the post equivocated between "RSP as a regulation that gets put into place" and "RSP as a voluntary commitment": we got the latter, but not really the former (except maybe in the form of the EU Codes of Practice).

Even at Anthropic, the way the RSP is put into practice now basically excludes a scaling pause from the picture entirely:

"RSPs are pauses done right: if you are advocating for a pause, then presumably you have some resumption condition in mind that determines when the pause would end. In that case, just advocate for that condition being baked into RSPs!"

Interview:

"That was never the intent. That was never what RSPs were supposed to be; it was never the theory of change and it was never what they were supposed to be... So the idea of RSPs all along was less about saying, 'We promise to do this, to pause our AI development no matter what everyone else is doing'"

and

"But we do need to get rid of some of this unilateral pause stuff."

Furthermore, what apparently happens now is that really difficult commitments either don't get made or get walked back:

"Since the strictest conditions of the RSPs only come into effect for future, more powerful models, it's easier to get people to commit to them now. Labs and governments are generally much more willing to sacrifice potential future value than realized present value."

Interview:

"So I think we are somewhat in a situation where we have commitments that don't quite make sense... And in many cases it's just actually, I would think it would be the wrong call. In a situation where others were going ahead, I think it'd be the wrong call for Anthropic to sacrifice its status as a frontier company"

and

"Another lesson learned for me here is I think people didn't necessarily think all this through. So in some ways you have companies that made commitments that maybe they thought at the time they would adhere to, but they wouldn't actually adhere to. And that's not a particularly productive thing to have done."

I guess the unwillingness of the government to turn RSPs into regulation is what ultimately blocked this. (Though maybe today even a US-centric RSP-like regulation would be considered "not that useful" because of geopolitical competition.) We got RSP-like voluntary commitments from a surprising number of AI companies (so good job on predicting the future on this one), but that didn't get turned into regulation.

shortplav
niplav · 19d

It's a bit of a travesty there's no canonical formal write-up of UDASSA, given all the talk about it. Ugh, TODO for working on this I guess.

shortplav
niplav · 19d

My understanding is that UDASSA doesn't give you unbounded utility, by virtue of directly assigning $U(\mathrm{eval}(p)) \propto 2^{-|p|}$, and the sum of utilities is proportional to $\sum_{i=0}^{\infty} 2^{-i} = 2$. The whole dance I did was in order to be able to have unbounded utilities. (Maybe you don't care about unbounded utilities, in which case UDASSA seems like a fine choice.)
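Spelling that bound out (a sketch; assuming a prefix-free encoding of programs, so that Kraft's inequality applies):

$$\sum_p U(\mathrm{eval}(p)) \;\propto\; \sum_p 2^{-|p|} \;\le\; 1$$

However the grouping is done, the sum is finite, which is the boundedness claimed above.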

(I think that the other horn of de Blanc's proof is satisfied by UDASSA, unless the proportion of non-halting programs bucketed by simplicity declines faster than any computable function. Do we know this? "Claude!…")

Edit: Claude made up plausible nonsense, but GPT-5, upon request, was correct: the proportion of halting programs declines more slowly than some computable functions.

Edit 2: Upon some further searching (and soul-searching) I think UDASSA is currently underspecified wrt whether its utility is bounded or unbounded. For example, the canonical explanation doesn't mention utility at all, and none of the other posts about it mention how exactly utility is defined.

shortplav
niplav · 19d

Makes sense, but in that case, why penalize by time? Why not just directly penalize by utility? Like the leverage prior.

Huh. I find the post confusingly presented, but if I understand correctly, 15 logical inductor points to Yudkowsky₂₀₁₃—I think I invented the same concept from second principles.

Let me summarize to check my understanding: my speed prior on both the hypotheses and the utility functions is trying to emulate just discounting utility directly (because in the case of binary tapes and integers, penalizing both for the exponential of speed gets you exactly an upper bound for the utility), and a cleaner way is to set the prior to $2^{-|p|} \cdot \frac{1}{U(\mathrm{eval}(p))}$. That avoids the "how do we encode numbers" question that naturally raises itself.

Does that sound right?
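If that reading is right, the bound falls out immediately (again a sketch, assuming a prefix-free encoding so that $\sum_p 2^{-|p|} \le 1$): the utility-weighted sum under this prior is

$$\sum_p 2^{-|p|}\,\frac{1}{U(\mathrm{eval}(p))}\cdot U(\mathrm{eval}(p)) \;=\; \sum_p 2^{-|p|} \;\le\; 1,$$

so each hypothesis contributes exactly its description-length weight to expected utility, no matter how large $U(\mathrm{eval}(p))$ gets.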

(The fact that I reinvented this looks like a good thing, since that indicates it's a natural way out of the dilemma.)

Posts

Humanity Learned Almost Nothing From COVID-19 · 163 karma · 1mo · 38 comments
Ontological Cluelessness · 14 karma · 2mo · 12 comments
Anti-Superpersuasion Interventions · 21 karma · 4mo · 1 comment
Meditation and Reduced Sleep Need · 36 karma · 8mo · 8 comments
Logical Correlation · 24 karma · 9mo · 7 comments
Resolving von Neumann-Morgenstern Inconsistent Preferences · 39 karma · 1y · 5 comments
0.836 Bits of Evidence In Favor of Futarchy · 38 karma · 1y · 0 comments
Pomodoro Method Randomized Self Experiment · 14 karma · 1y · 2 comments
How Often Does Taking Away Options Help? · 21 karma · 1y · 7 comments
Michael Dickens' Caffeine Tolerance Research · 47 karma · 1y · 5 comments
Wikitag Contributions

Comp-In-Sup · a month ago · (+13/-13)
AI-Assisted Alignment · 6 months ago · (+54)
AI-Assisted Alignment · 6 months ago · (+127/-8)
Recursive Self-Improvement · 6 months ago · (+68)
Alief · 7 months ago · (+11/-11)
Old Less Wrong About Page · 8 months ago
Successor alignment · 10 months ago · (+26/-3)
Cooking · a year ago · (+26/-163)
Future of Humanity Institute (FHI) · 2 years ago · (+11)
Future of Humanity Institute (FHI) · 2 years ago · (+121/-49)