sjadler

Former safety researcher & TPM at OpenAI, 2020-24

https://www.linkedin.com/in/sjgadler


Comments

sjadler63

Unfortunately, it seems that OpenAI has walked back the Preparedness Framework's previous commitment to testing fine-tuned versions of its models, and also did not highlight this among the changes. I tweeted a bit more detail here.

sjadler10

What do you mean here by "does not mean anything"?

It seems clear to me that there's some notion of off-the-record that journalists understand.

The details might vary, and I agree it's probably not legally binding, but it does seem to mean something.

sjadler21

I appreciate the feedback. That’s interesting about the plane vs. car analogy - I tended to think about these analogies in terms of life/casualties, and for whatever reason, describing an internal test-flight didn’t rise to that level for me (and if it’s civilian passengers, that’s an external deployment). I also wanted to convey the idea not just that internal testing could cause external harm, but that you might irreparably breach containment. Anyway, appreciate the explanation, and I hope you enjoyed the post overall!

sjadler10

Scaffolding for sure matters, yup!

I think you're generally correct that the most-capable version hasn't been created, though there are times when AI companies do have specialized versions for a domain internally and don't seem to be testing those either. It's reasonable IMO to think that these might outperform the unspecialized versions.

sjadler30

Daniel said:

Thanks for doing this, I found the chart very helpful! I'm honestly a bit surprised and sad to see that task-specific fine-tuning is still not the norm. Back in 2022 when our team was getting the ball rolling on the whole dangerous capabilities testing / evals agenda, I was like "All of this will be worse than useless if they don't eventually make fine-tuning an important part of the evals" and everyone was like "yep of course we'll get there eventually, for now we will do the weaker elicitation techniques." It is now almost three years later...

sjadler10

I’ve only seen this excerpt, but it seems to me that Jack isn’t just arguing against regulation because it might slow progress, but rather something more like:

“there’s some optimal time to have a safety intervention, and if you do it too early because your timeline bet was wrong, you risk having worse practices at the actually critical time because of backlash”

This seems probably correct to me? I think ideally we’d be able to be cautious early and still win the arguments to be appropriately cautious later too. But empirically, I think it’s fair not to take that as a given?

sjadler21

You might find this post interesting and relevant if you haven’t seen it before: https://www.econlib.org/archives/2017/04/iq_with_conscie.html

sjadler20

I’d guess that was “I have a lecture series with her” :-)

sjadler41

I think they mean heuristics for who is ok to dehumanize / treat as “other” or harm
