1 min read · 25th Jan 2023 · 13 comments
This is a special post for quick takes by RobertM. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

I am pretty concerned that most of the public discussion about risk from e.g. the practice of open sourcing frontier models is focused on misuse risk (particularly biorisk).  Misuse risk seems like it could be a real thing, but it's not where I see most of the negative EV when it comes to open sourcing frontier models.  I also suspect that many people doing comms work focused on misuse risk are emphasizing it in a way that's strongly disproportionate to how much of the total negative EV they themselves think it accounts for, relative to all sources.

I think someone should write a summary post covering "why open-sourcing frontier models and AI capabilities more generally is -EV".  Key points to hit:

  • (1st order) directly accelerating capabilities research progress
  • (1st order) we haven't totally ruled out the possibility of hitting "sufficiently capable systems" which are at least possible in principle to use in +EV ways, but which, if made public, would immediately have someone point them at improving themselves, and then we die.  (In fact, this is very approximately the mainline alignment plan of all 3 major AGI orgs.)
  • (2nd order) generic "draws in more money, more attention, more skilled talent, etc" which seems like it burns timelines

And, sure, misuse risks (which in practice might end up being a subset of the second bullet point, but not necessarily so).  But in reality, LLM-based misuse risks probably don't end up being x-risks, unless biology turns out to be so shockingly easy that a (relatively) dumb system can come up with something that gets ~everyone in one go.

Headline claim: time delay safes are probably much too expensive in human time costs to justify their benefits.

The largest pharmacy chains in the US, accounting for more than 50% of the prescription drug market[1][2], have been rolling out time delay safes (to prevent theft)[3].  Although I haven't confirmed that this is true across all chains and individual pharmacy locations, I believe these safes are used for all controlled substances.  These safes open ~5-10 minutes after being prompted.

There were >41 million prescriptions dispensed for Adderall in the US in 2021[4].  (Note that this likely means ~12x fewer people were prescribed Adderall that year, since prescriptions are typically filled monthly.)  Multiply that by 5 minutes and you get >200 million minutes, or >390 person-years, wasted.  Now, surely some of that time is partially recaptured by e.g. people doing their shopping while waiting, or by various other substitution effects.  But that's also just Adderall!
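As a rough back-of-the-envelope check on that arithmetic (the 41 million figure is from the Axios article footnoted below; treating each fill as ~5 minutes of waiting, the low end of the stated delay, is an assumption):

```python
# Back-of-the-envelope: time cost of time-delay safes for Adderall alone.
# ~41M Adderall prescriptions dispensed in the US in 2021 (Axios, footnote 4);
# assume ~5 minutes of waiting per fill (low end of the 5-10 minute delay).
prescriptions_per_year = 41_000_000
minutes_waited_per_fill = 5

total_minutes = prescriptions_per_year * minutes_waited_per_fill
person_years = total_minutes / (60 * 24 * 365)

print(f"{total_minutes:,} minutes ~ {person_years:.0f} person-years per year")
# -> 205,000,000 minutes ~ 390 person-years per year
```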

Seems quite unlikely that this is on the efficient frontier of crime-prevention mechanisms, but alas, the stores aren't the ones (mostly) paying the costs imposed by their choices, here.

  1. ^ https://www.mckinsey.com/industries/healthcare/our-insights/meeting-changing-consumer-needs-the-us-retail-pharmacy-of-the-future
  2. ^ https://www.statista.com/statistics/734171/pharmacies-ranked-by-rx-market-share-in-us/
  3. ^ https://www.cvshealth.com/news/pharmacy/cvs-health-completes-nationwide-rollout-of-time-delay-safes.html
  4. ^ https://www.axios.com/2022/11/15/adderall-shortage-adhd-diagnosis-prescriptions

It seems like the technology you would want is one where you can get one Adderall box immediately but not all the Adderall boxes that the store has on the premises.

Essentially, a big vending machine that might take 10 minutes to unlock for restocking, but that can only give out one Adderall box per five minutes in its vending machine mode.
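A minimal sketch of the rate-limiting logic such a machine might use, assuming the one-box-per-five-minutes and ten-minute-restock numbers above (the class and its interface are purely illustrative, not any real product's API):

```python
import time


class RateLimitedDispenser:
    """Toy model of a vending-machine-style safe: dispenses at most one box
    per cooldown period, while full access for restocking takes much longer."""

    def __init__(self, stock: int, dispense_cooldown_s: float = 5 * 60,
                 restock_delay_s: float = 10 * 60):
        self.stock = stock
        self.dispense_cooldown_s = dispense_cooldown_s
        self.restock_delay_s = restock_delay_s
        self._last_dispense = float("-inf")

    def dispense_one(self) -> bool:
        """Give out a single box if the cooldown has elapsed and stock remains."""
        now = time.monotonic()
        if self.stock > 0 and now - self._last_dispense >= self.dispense_cooldown_s:
            self.stock -= 1
            self._last_dispense = now
            return True
        return False

    def unlock_for_restock(self) -> None:
        """Full access (e.g. for restocking) only after the long time delay."""
        time.sleep(self.restock_delay_s)
```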

Now, surely some of that time is partially recaptured by e.g. people doing their shopping while waiting

That sounds like the technique might encourage customers to buy non-prescription medication in the pharmacy along with the prescription medicine they want to buy.

I think there might be many local improvements, but I'm pretty uncertain about important factors like the elasticity of "demand" (for robbery) with respect to how much of a medication is available on demand.  I.e., how many fewer robberies do you get if the most anyone can walk away with is a single prescription's worth of some kind of controlled substance (and not necessarily any specific one), compared to "none" (the current situation) or "whatever the pharmacy has in stock" (not actually sure if this was the previous situation - maybe they had time delay safes for storing medication that wasn't filling a prescription, and just didn't store the filled prescriptions in the safes as well)?

NDAs sure do seem extremely costly.  My current sense is that it's almost never worth signing one, or binding oneself to confidentiality in any similar way, for anything except narrowly-scoped technical domains (such as capabilities research).

Say more please.

As a recent example, from this article on the recent OpenAI kerfuffle:

Two people familiar with the board’s thinking say that the members felt bound to silence by confidentiality constraints.

If you don't have more examples, I think 

  1. it is too early to draw conclusions from OpenAI
  2. one special case doesn't invalidate the concept

Not saying your point is wrong, just that this is not convincing me.

I have more examples, but unfortunately some of them I can't talk about.  A few random things that come to mind:

  • OpenPhil routinely requests that grantees not disclose that they've received an OpenPhil grant until OpenPhil publishes it themselves, which usually happens many months after the grant is disbursed.
  • Nearly every instance that I know of where EA leadership refused to comment on anything publicly post-FTX due to advice from legal counsel.
  • So many things about the Nonlinear situation.
  • Coordination Forum requiring that attendees agree to confidentiality re: the attendance of, and content of any conversations with, people who wanted to attend without their attendance being known to the wider world - like SBF, and also people in the AI policy space.

That explains why the NDAs are costly. But if you don't sign one, you can't e.g. get the OpenPhil grant. So the examples don't explain how "it's almost never worth signing one".

Not all of these are NDAs; my understanding is that the OpenPhil request comes along with the news of the grant (and isn't a contract).  Really my original shortform should've been a broader point about confidentiality/secrecy norms, but...

Reducing costs equally across the board in some domain is bad news in any situation where offense is favored. Reducing costs equally-in-expectation (but unpredictably, with high variance) can be bad even if offense isn't favored, since you might get unlucky and the payoffs aren't symmetrical.
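A toy expected-value calculation illustrating the second claim (all payoff numbers here are made up for illustration): even when which side benefits is a fair coin flip, asymmetric payoffs can make the gamble negative EV.

```python
# Toy model: a capability lowers costs "equally in expectation", but which side
# actually benefits is unpredictable. Payoffs are asymmetric: the downside of
# empowering offense is much larger than the upside of empowering defense.
# All numbers are made up for illustration.
p_defense_benefits = 0.5      # offense is NOT favored: a fair coin flip
upside_if_defense = 10        # modest gain when defense gets the boost
downside_if_offense = -100    # large loss when offense gets the boost

expected_value = (p_defense_benefits * upside_if_defense
                  + (1 - p_defense_benefits) * downside_if_offense)

print(expected_value)  # -45.0: negative EV despite the 50/50 split
```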

(re: recent discourse on bio risks from misuse of future AI systems.  I don't know that I think those risks are particularly likely to materialize, and most of my expected disutility from AI progress doesn't come from that direction, but I've seen a bunch of arguments that seem to be skipping some steps when trying to argue that progress on ability to do generic biotech is positive EV.  To be fair, the arguments for why we should expect it to be negative EV are often also skipping those steps.  My point is that a convincing argument in either direction needs to justify its conclusion in more depth; the heuristics I reference above aren't strong enough to carry the argument.)

We have models that demonstrate superhuman performance in some domains without then taking over the world to optimize anything further. "When and why does this stop being safe" might be an interesting frame if you find yourself stuck.