LESSWRONG

2194
sjadler
4089870

Former safety researcher & TPM at OpenAI, 2020-24

https://www.linkedin.com/in/sjgadler
stevenadler.substack.com

Posts

Comments

2 sjadler's Shortform
9mo
26
No, That's Not What the Flight Costs
sjadler 5d 20

Thanks for writing this up! I really liked this related podcast episode with Patrick McKenzie: https://open.spotify.com/episode/1QqFw5hlHKRrjRUTVLfKRV?si=ptVmFvXQRKaPwRNTg1Ollg

I think the biggest update for me was how the rewards programs are inseparable in some sense from the airlines. I think your language of ordinary flights being a loss leader helps to describe it as well; the airlines couldn’t just have the valuable rewards program, because having the underlying less-profitable flights is what makes it possible!

Transgender Sticker Fallacy
sjadler 8d 62

Weight can be such an extreme determinative factor in combat sports that an untrained 250-pound couch potato could walk into any boxing gym and absolutely demolish a 100-pound opponent with decades of training.

I think this is kind of beside the point, but is this really true?

I buy that it conceptually could be the case for some small number of people, but I would have expected most 100-pound opponents with decades of training to beat untrained 250-lb couch potatoes (all it seems to take is one or two good punches against someone who doesn't know how to defend themself). Maybe I'm mistaken?

(PS - I laughed at the "Classic ostrich-and-egg problem" line!)

A non-review of "If Anyone Builds It, Everyone Dies"
sjadler 9d 10

(Appreciate the correction re my nit, edited mine as well)

A non-review of "If Anyone Builds It, Everyone Dies"
sjadler 9d* 72

Thanks for taking the time to write up your reflections. I agree that the before/after distinction seems especially important (‘only one shot to get it right’), and a crux that I expect many non-readers not to know about the EY/NS worldview.

I’m wondering about your take in this passage:

In the book they make an analogy to a ladder where every time you climb it you get more rewards but once you reach the top rung then the ladder explodes and kills everyone. However, our experience so far with AI does not suggest that this is a correct world view.

I’m curious what about the world’s experience with AI seems to falsify it, or at least cast doubt upon it, from your POV? Is it about believing that systems have become safer and more controlled over time?

(Nit, but the book doesn’t posit that the explosion happens at the top rung; in that case, we could just avoid ever reaching the top rung. It posits that the explosion happens at a not-yet-known rung, and so each successive rung climb carries some risk of blow-up. I don’t expect this distinction is load-bearing for you though)

(Edit: my nit is wrong as written! Thanks Boaz - he’s right that the book’s argument is actually about the top of the ladder, I was mistaken - though with the distinction I was trying to point at, of not knowing where the top is, so from a climber’s perspective there’s no way of just avoiding that particular rung)

leogao's Shortform
sjadler 11d 10

This was really interesting, thanks for putting yourself in that situation and for writing it up.

I was curious what the examples of therapy speak in the conversation were, if you’re down to elaborate.

lemonhope's Shortform
sjadler 13d 160

FWIW, my experience was that the utility of user data was always much higher in promise than in actual outcomes. This might have changed over time though.

Mikhail Samin's Shortform
sjadler 15d 10

An ask that works is, e.g., “tell the government they need to stop everyone, including us”.)

For sure, I think that would be a reasonable ask too. FWIW, if multiple leading AI companies did make a statement like the one outlined, I think that would increase the chance of non-complying ones being made to halt by the government, even though they hadn’t made a statement themselves. That is, even one prominent AI company making this statement starts to widen the Overton window.

The title is reasonable
sjadler 17d 30

Yeah fair, I think we just read that passage differently - I agree it’s a very important one though and quoted it in my own (favorable) review

But I read, e.g., the “because it would succeed” as a claim that they are arguing for, not something definitionally inseparable from superintelligence

Regardless, thanks for engaging on this, and hope it’s helped to clarify some of the objections EY/NS are hearing

The title is reasonable
sjadler 17d 41

FWIW that definition of “it” wasn’t clear to me from the book. I took IABIED as arguing that superintelligence is capable of killing everyone if it wants to, not taking “superintelligence can kill everyone if it wants to” as an assumption of its argument

That is, I’d have expected “superintelligence would not be capable enough to kill us all” to be a refutation of their argument, not to be sidestepping its conditional

The title is reasonable
sjadler 17d 42

Nit, but I think some safety-ish evals do run periodically in the training loop at some AI companies, and sometimes fuller sets of evals get run on checkpoints that are far along but not yet the version that’ll be shipped. I agree this isn’t sufficient of course

(I think it would be cool if someone wrote up a “how to evaluate your model a reasonable way during its training loop” piece, which accounted for the different types of safety evals people do. I also wish that task-specific fine-tuning were more of a thing for evals, because it seems like one way of perhaps reducing sandbagging)

14 Transcript: OpenAI's Chief Economist and COO interviewed about AI's economic impacts
2mo
0
6 Mythbusting the supposed "1,000+ AI state bills that would hobble innovation"
3mo
0
5 A crisis simulation changed how I think about AI risk
3mo
0
6 Contain and verify: The endgame of US-China AI competition
5mo
7
17 Is ChatGPT actually fixed now?
5mo
0
10 Don't rely on a "race to the top"
5mo
0
13 AI companies’ unmonitored internal AI use poses serious risks
6mo
2
17 AI companies should be safety-testing the most capable versions of their models
6mo
6