A general guide for pursuing independent research, from conceptual questions like "how to figure out how to prioritize, learn, and think", to practical questions like "what sort of snacks should you buy to maximize productivity?"

dreeves
A couple days ago I wanted to paste a paragraph from Sarah Constantin's latest post on AGI into Discord and of course the italicizing disappeared, which drives me bananas, and I thought there must exist tools for solving that problem and there are but they're all abominations, so I said to ChatGPT (4o):

> can you build a simple html/javascript app with two text areas. the top text area is for rich text (rtf) and the bottom for plaintext markdown. whenever any text in either text area changes, the app updates the other text area. if the top one changes, it converts it to markdown and updates the bottom one. if the bottom one changes, it converts it to rich text and updates the top one.

aaaand it actually did it and I pasted it into Replit and... it didn't work, but I told it what errors I was seeing and continued going back and forth with it and ended up with the following tool without touching a single line of code: eat-the-richtext.dreev.es

PS: Ok, I ended up going back and forth with it *a lot* (12h45m now in total, according to TagTime) to get to the polished state it's in now, with tooltips and draggable divider and version number and other bells and whistles. But as of version 1.3.4 it's 100% ChatGPT's code with me guiding it in strictly natural language.
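(A minimal sketch of the general idea behind such a tool, not the actual eat-the-richtext code: a contenteditable div holds the rich text, a plain textarea holds the markdown, and each pane re-renders the other on input. It assumes the Turndown and Marked libraries for the HTML-to-Markdown and Markdown-to-HTML conversions; the real app may well do this differently.)

```html
<!-- Sketch only: two synced panes, assuming the Turndown and Marked libraries. -->
<!doctype html>
<html>
<body>
  <!-- Rich text goes in a contenteditable div (a plain <textarea> can't hold formatting). -->
  <div id="rich" contenteditable="true" style="border:1px solid #ccc; min-height:6em"></div>
  <textarea id="md" rows="10" style="width:100%"></textarea>

  <script src="https://unpkg.com/turndown/dist/turndown.js"></script>
  <script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
  <script>
    const rich = document.getElementById('rich');
    const md = document.getElementById('md');
    const turndown = new TurndownService(); // HTML -> Markdown converter

    // Typing or pasting into the rich pane updates the markdown pane.
    rich.addEventListener('input', () => {
      md.value = turndown.turndown(rich.innerHTML);
    });

    // Editing the markdown pane re-renders the rich pane.
    md.addEventListener('input', () => {
      rich.innerHTML = marked.parse(md.value);
    });
    // Programmatic updates don't fire 'input' events, so the two handlers can't loop.
  </script>
</body>
</html>
```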
Akash
I'm surprised that some people are so interested in the idea of liability for extreme harms. I understand that from a legal/philosophical perspective, there are some nice arguments about how companies should have to internalize the externalities of their actions, etc. But in practice, I'd be fairly surprised if liability approaches were actually able to provide a meaningful incentive shift for frontier AI developers.

My impression is that frontier AI developers already have fairly strong incentives to avoid catastrophes (e.g., it would be horrible for Microsoft if its AI model caused $1B in harms; it would be horrible for Meta and the entire OS movement if an OS model were able to cause $1B in damages). And my impression is that most forms of liability would not affect this cost-benefit tradeoff by very much. This is especially true if the liability is only implemented post-catastrophe. Extreme forms of liability could require insurance, but this essentially feels like a roundabout and less effective way of implementing some form of licensing (you have to convince us that risks are below an acceptable threshold to proceed).

I think liability also has the "added" problem of being quite unpopular, especially among Republicans. It is easy to attack liability regulations as anti-innovation, argue that they create a moat (only big companies can afford to comply), and argue that it's just not how America ends up regulating things (we don't hold Adobe accountable for someone doing something bad with Photoshop). To be clear, I don't think "something is politically unpopular" should be a full-stop argument against advocating for it. But I do think that "liability for AI companies" scores poorly both on "actual usefulness if implemented" and on "political popularity/feasibility."

I also think "liability for AI companies" advocacy often ends up getting into abstract philosophy land (to what extent should companies internalize externalities?) and ends up avoiding some of the "weirder" points (we expect AI has a considerable chance of posing extreme national security risks, which is why we need to treat AI differently than Photoshop). I would rather people just make the direct case that AI poses extreme risks and discuss the direct policy interventions that are warranted.

With this in mind, I'm not an expert in liability and admittedly haven't been following the discussion in great detail (partly because the little I have seen has not convinced me that this is an approach worth investing in). I'd be interested in hearing more from people who have thought about liability, particularly concrete stories for how liability would be expected to meaningfully shift the incentives of labs. (See also here.)

Stylistic note: I'd prefer replies along the lines of "here is the specific argument for why liability would significantly affect lab incentives and how it would work in concrete cases" rather than replies along the lines of "here is a thing you can read about the general legal/philosophical arguments about how liability is good."
keltan
From Newcastle, Australia to Berkeley, San Francisco. I arrived yesterday for Less.online. I’ve had a bit of culture shock, a big helping of being increasingly scared, and quite a few questions. I’ll start with those. Feel free to skip them.

These questions are based on warnings I’ve gotten from local non-rationalists. Idk if they’re scared because of the media they consume or because of actual stats. I’m asking these because they feel untrue.

1. Is it ok to be outside after dark?
2. Will I really get ‘rolled’ mid day in Oakland?
3. Are there gangs walking around Oakland looking to stab people?
4. Will all the streets fill up with homeless people at night?
5. Are they chill? In Aus they’re usually down to talk if you are.

Culture shocks for your enjoyment:

1. Why is everyone doing yoga?
2. To my Uber driver: “THAT TRAIN IS ON THE ROAD!?”
3. “I thought (X) was just in movies!”
4. Your billboards are about science instead of coal mining!
5. “Wait, you’re telling me everything is vegan?” Thank Bayes, this is the best. All our vegan restaurants went out of business.
6. People brag about things? And they do it openly? At least, I think that’s what’s happening?
7. “Silicon Valley is actually a valley?!” Should have predicted this one. I kinda knew, but I didn’t know like I do now.
8. “Wow! This shop is openly selling nangs!” (whippits) “And a jungle juice display!”
9. All your cars are so new and shiny. 60% of ours are second hand.
10. Most people I see in the streets look below 40. It’s like I’m walking around a university!
11. Wow. It’s really sunny.
12. American accents irl make me feel like I’m walking through a film.
13. “HOLY SHIT! A CYBER TRUCK?!”
14. Ok this is a big one. Apps I’ve had for 8+ years are suddenly different when I arrive here?
15. This is what Uber is meant to be. I will go back to Australia and cry. Your airport has custom instructions… in app! WHAT!? The car arrives in 2 minutes instead of 30 minutes. Also, the car arrives at all.
16. The Google app has a beaker for tests now?
17. Snap Maps has gifs in it.
18. Apple Maps lets you scan buildings? And has tips about good restaurants and events?
19. When I bet in the Manifold app, a real paper crane flies from the nearest tree, lands in front of me and unfolds. Written inside: “Will Eliezer Yudkowsky open a rationalist bakery?” I circle “Yes”. The paper meticulously folds itself back into a crane. It looks at me. Makes a little sound that doesn’t echo in the streets but in my head, and it burns. Every time this happens I save the ashes. Are Manifold creating new matter? How are they doing this?
20. That one was a lie.

Things that won’t kill me but scare me, rational/irrational:

1. What if I’ve been wrong? What if this is all a scam? A cult? What if Mum was right?
2. What if I show up to the location and there is no building there?
3. What if I make some terribly awkward cultural blunder for SF and everyone yells at me?
4. What if no one tells me?
5. I’m sure I’ll be at least in the bottom 5% for intelligence at Less Online. I won’t be surprised or hurt if I’ve got the least Gs of people there. But what if it all goes over my head? Maybe I can’t even communicate with smart people about the things I care about.
6. What if I can’t handle people telling me what they think of my arguments without kid gloves? What if I get angry and haven’t learnt to handle that?
7. I’m just a Drama teacher and Psych student. My head is filled with improv games and fun facts about Clever Hans!

‘Average’ Americans seem to achieve much higher than ‘average’ Australians. I’m scared of feeling underqualified.

Other things:

1. Can you think of something I should be worried about that I’ve not written here?
2. I’ve brought my copies of the Rationality A-Z books. I want to ask people I meet to sign their favourite post in the two books. Is that culturally acceptable? Feels kinda weird bc Yud is going to be there. But it would be a really warm/fuzzy item to me in the future.
3. I don’t actually know what a lot of the writers going look like. I hope this doesn’t result in a blunder. But might be funny, given that I expect rationalists to be pretty chill.
4. Are other people as excited about the Fooming Shoggoths as I am?
5. I’m 23. I have no idea if that is very old, very young, or about normal for a rationalist. I’d guess about normal, with a big spread across the right of the graph.

It feels super weird to be in the same town as a bunch of you guys now. I’ve never met a rationalist irl. I talked to Ruby over zoom once, who said to me “You know you don’t have to stay in Australia right?” I hope Ruby is a good baseline for niceness levels of you all.

If you’re going, I’ll see you at Less.Online. If you’re not, I’d still love to meet you. Feel free to DM me!
My mainline prediction scenario for the next decades. My mainline prediction*:

* LLMs will not scale to AGI. They will not spawn evil gremlins or mesa-optimizers. BUT scaling laws will continue to hold, and future LLMs will be very impressive and make a sizable impact on the real economy and science over the next decade.
* There is a single innovation left to make AGI-in-the-alex-sense work, i.e. coherent, long-term planning agents (LTPA) that are effective and efficient in data-sparse domains over long horizons.
* That innovation will be found within the next 10-15 years.
* It will be clear to the general public that these are dangerous.
* Governments will act quickly and (relatively) decisively to bring these agents under state control. National security concerns will dominate.
* Power will reside mostly with governments' AI safety institutes and national security agencies. Insofar as divisions of tech companies are able to create LTPAs, they will be effectively nationalized.
* International treaties will be made to constrain AI, outlawing the development of LTPAs by private companies. Great power competition will mean the US and China will continue developing LTPAs, possibly largely boxed. Treaties will try to constrain this development with only partial success (similar to nuclear treaties).
* LLMs will continue to exist and be used by the general public.
* Conditional on AI ruin, the closest analogy is probably something like the Cortez-Pizarro-Afonso takeovers. Unaligned AI will rely on human infrastructure and human allies for the earlier parts of takeover, but its inherent advantages in tech, coherence, decision-making and (artificial) plagues will be the deciding factor.
* The world may be mildly multi-polar.
  * This will involve conflict between AIs.
  * AIs may very possibly be able to cooperate in ways humans can't.
* The arrival of AGI will immediately inaugurate a scientific revolution. Sci-fi-sounding progress like advanced robotics, quantum magic, nanotech, life extension, laser weapons, large space engineering, and cures for many/most remaining diseases will become possible within two decades of AGI, possibly much faster.
* Military power will shift to automated manufacturing of drones & weaponized artificial plagues. Drones, mostly flying, will dominate the battlefield. Mass production of drones and their rapid and effective deployment in swarms will be key to victory.

Two points on which I differ with most commentators:

(i) I believe AGI is a real (mostly discrete) thing, not a vibe or a general increase of improved tools. I believe it is inherently agentic. I don't think spontaneous emergence of agents is impossible, but I think it is more plausible that agents will be built rather than grown.

(ii) I believe in general the EA/AI safety community is way overrating the importance of individual tech companies vis-a-vis broader trends and the power of governments. I strongly agree with Stefan Schubert's take here on the latent hidden power of government: https://stefanschubert.substack.com/p/crises-reveal-centralisation Consequently, the EA/AI safety community is often myopically focusing on boardroom politics that are relatively inconsequential in the grand scheme of things.

*Where by mainline prediction I mean the scenario that is the mode of what I expect. This is the single likeliest scenario. However, since it contains a large number of details, each of which could go differently, the probability on this specific scenario is still low.
Feels like FLI is a massively underrated org. Cos of the whole Vitalik donation thing they have like $300mn.

Recent Discussion

Helen Toner went on the TED AI podcast, giving us more color on what happened at OpenAI. These are important claims to get right.

I will start with my notes on the podcast, including the second part where she speaks about regulation in general. Then I will discuss some implications more broadly.

Notes on Helen Toner’s TED AI Show Podcast

This seems like it deserves the standard detailed podcast treatment. By default, each note’s main body is description; any second-level notes are me.

  1. (0:00) Introduction. The host talks about OpenAI’s transition from non-profit research organization to de facto for-profit company. He highlights the transition from ‘open’ AI to closed as indicative of the problem, whereas I see this as the biggest thing they got right. He also notes that he was
...
MichaelDickens
He was caught lying about the non-disparagement agreements, but I guess lying to the public is fine as long as you don't lie to the board? Taylor's and Summers' comments here are pretty disappointing—it seems that they have no issue with, and maybe even endorse, Sam's now-publicly-verified bad behavior.

> we have found Mr Altman highly forthcoming

That's exactly the line that made my heart sink.

I find it a weird thing to choose to say/emphasize.

The issue under discussion isn't whether Altman hid things from the new board; it's whether he hid things from the old board a long while ago.

Of course he's going to seem forthcoming towards the new board at first. So, the new board having the impression that he was forthcoming towards them? This isn't information that helps us much in assessing whether to side with Altman vs the old board. That makes me think: why repo... (read more)

PeterH
I think that Paul Graham’s remarks today—particularly the “we didn’t want him to leave” part—make it clear that Altman was not fired. In December 2023, Paul Graham gave a similar account to the Wall St Journal and said “it would be wrong to use the word ‘fired’”. Roon has a take.
Dana
These are the remarks Zvi was referring to in the post. Also worth noting Graham's consistent choice of the word 'agreed' rather than 'chose', and Altman's failed attempt to transition to chairman/advisor to YC. It sure doesn't sound like Altman was the one making the decisions here.

It seems like one of the biggest problems* in AI Safety is that it is ridiculously hard to get good training (i.e. MATS is ridiculously competitive now) and to get employed (samesies).

Has anyone looked across other categories (e.g. potentially other sciences) to see how this problem has been solved? I assume at the most macro level it is going to be "Industry" vs "Government" but I'm looking for more concrete interventions.

Thoughts?

*we're turning away very smart, motivated, well-meaning and skilled people. This is bad.

Dagon

This seems to be asking from the demand side ("we" being people with lots of money who want to hire trained people), but then switches to the supply side (people being turned away looking for training and employment).

I think that's a hint to your answer: other industries solve it by actually hiring lots of people, and offering training on the job or with regular programs.  Oh, and usually waiting for equilibrium to catch up, which is not comfortable for rapid-change requirements.

As we explained in our MIRI 2024 Mission and Strategy update, MIRI has pivoted to prioritize policy, communications, and technical governance research over technical alignment research. This follow-up post goes into detail about our communications strategy.

The Objective: Shut it Down[1]

Our objective is to convince major powers to shut down the development of frontier AI systems worldwide before it is too late. We believe that nothing less than this will prevent future misaligned smarter-than-human AI systems from destroying humanity. Persuading governments worldwide to take sufficiently drastic action will not be easy, but we believe this is the most viable path.

Policymakers deal mostly in compromise: they form coalitions by giving a little here to gain a little somewhere else. We are concerned that most legislation intended to keep humanity alive will go...

ryan_greenblatt
Hmm, I'm not sure I exactly buy this. I think you should probably follow something like onion honesty, which can involve intentionally simplifying your message to something you expect will give the audience more true views. I think you should lean toward stating things, but still, sometimes stating a true thing can be clearly distracting and confusing, and thus you shouldn't.
Raemon

Man I just want to say I appreciate you following up on each subthread and noting where you agree/disagree, it feels earnestly truthseeky to me.

Matthew Barnett
I appreciate the straightforward and honest nature of this communication strategy, in the sense of "telling it like it is" and not hiding behind obscure or vague language. In that same spirit, I'll provide my brief, yet similarly straightforward reaction to this announcement:

1. I think MIRI is incorrect in their assessment of the likelihood of human extinction from AI. As per their messaging, several people at MIRI seem to believe that doom is >80% likely in the 21st century (conditional on no global pause) whereas I think it's more like <20%.
2. MIRI's arguments for doom are often difficult to pin down, given the informal nature of their arguments, and in part due to their reliance on analogies, metaphors, and vague supporting claims instead of concrete empirically verifiable models. Consequently, I find it challenging to respond to MIRI's arguments precisely. The fact that they want to essentially shut down the field of AI based on these largely informal arguments seems premature to me.
3. MIRI researchers rarely provide any novel predictions about what will happen before AI doom, making their theories of doom appear unfalsifiable. This frustrates me. Given a low prior probability of doom as apparent from the empirical track record of technological progress, I think we should generally be skeptical of purely theoretical arguments, especially if they are vague and make no novel, verifiable predictions prior to doom.
4. Separately from the previous two points, MIRI's current most prominent arguments for doom seem very weak to me. Their broad model of doom appears to be something like the following (although they would almost certainly object to the minutiae of how I have written it here): (1) At some point in the future, a powerful AGI will be created. This AGI will be qualitatively distinct from previous, more narrow AIs. Unlike terms such as "the economy", "GPT-4", or "Microsoft", this AGI is not a mere collection of entities or tools integrated into
ryan_greenblatt
I basically agree with your overall comment, but I'd like to push back in one spot: from my understanding, Nate Soares at least claims his internal case for >80% doom is disjunctive and doesn't route entirely through 1, 2, 3, and 4. I don't really know exactly what the disjuncts are, so this doesn't really help, and I overall agree that MIRI does make "sweeping claims with high confidence".


You are invited to join Vision Weekend Europe, the annual festival of Foresight Institute at Bückeburg Castle in Germany from July 12 - 14. 

What’s this year’s theme? This year’s main conference track is dedicated to “Paths to Progress”, meaning you will hear 10+ invited presentations from Foresight’s core community highlighting paths to progress in the following areas:

  • Long-term History & Flourishing Futures
  • Longevity, Rejuvenation, Cryonics
  • Molecular Machines, Computing, APM
  • Neurotech, BCIs & WBEs
  • Cryptography, Security & AI
  • Energy, Space, Expansion
  • Funding, Innovation, Progress

Confirmed presenters include Jaan Tallinn (Future of Life Institute), Hendrik Dietz (Dietz Lab), Anders Sandberg (University of Oxford), Catalin Mitelut (NYU), Muriel Richard-Noca (ClearSpace), Nikolina Lauc (GlycanAge), Andrew Critch (Encultured), Joao Pedro De Magalhaes (University of Birmingham), Jeremy Barton, Toby Pilditch (Transformative Futures Institute), Matjaz Leonardis (Oxford University), Trent McConaghy (Ocean Protocol), Chiara...

Since at least 2017, OpenAI has asked departing employees to sign offboarding agreements which legally bind them to permanently—that is, for the rest of their lives—refrain from criticizing OpenAI, or from otherwise taking any actions which might damage its finances or reputation.[1]

If they refused to sign, OpenAI threatened to take back (or make unsellable) all of their already-vested equity—a huge portion of their overall compensation, which often amounted to millions of dollars. Given this immense pressure, it seems likely that most employees signed.

If they did sign, they became personally liable forevermore for any financial or reputational harm they later caused. This liability was unbounded, so had the potential to be financially ruinous—if, say, they later wrote a blog post critical of OpenAI, they might in principle be...

Geoffrey Irving (Research Director, AI Safety Institute)

Given the tweet thread Geoffrey wrote during the board drama, it seems pretty clear that he's willing to publicly disparage OpenAI. (I used to work with Geoffrey, but have no private info here)

Garrett Baker
A market on the subject: https://manifold.markets/GarrettBaker/which-of-the-names-below-will-i-rec?r=R2FycmV0dEJha2Vy
quila
I'm a different person, but I would support contracts which disallow spread of capabilities insights, but not contracts which disallow criticism of AI orgs (and especially not surprise ones). IIUC the latter is what the OAI-NonDisparagement controversy has been about. I'm not confident the following is true, but it seems to me that your first question was written under a belief that the controversy was about both of those at once. It seems like it was trying (under that world model) to 'axiomatically' elicit a belief in disagreement with an ongoing controversy, which would be non-truthseeking.
the gears to ascension
That seems like a misgeneralization, and I'd like to hear what thoughts you'd have depending on the various answers that could be given in the framework you raise. I'd imagine that there are a wide variety of possible ways a person could be limited in what they choose to say, and being threatened if they say things is a different situation than if they voluntarily do not: for example, the latter allows them to criticize.

Labs should give deeper model access to independent safety researchers (to boost their research)

Sharing deeper access helps safety researchers who work with frontier models, obviously.

Some kinds of deep model access:

  1. Helpful-only version
  2. Fine-tuning permission
  3. Activations and logits access
  4. [speculative] Interpretability researchers send code to the lab; the lab runs the code on the model; the lab sends back the results

See Shevlane 2022 and Bucknall and Trager 2023.

A lab is disincentivized from sharing deep model access because it doesn't want headlines about h... (read more)

List of 27 papers (supposedly) given to John Carmack by Ilya Sutskever: "If you really learn all of these, you’ll know 90% of what matters today." 
The list has been floating around for a few weeks on Twitter/LinkedIn. I figure some might have missed it, so here you go.
Regardless of the veracity of the tale, I am still finding it valuable.

https://punkx.org/jackdoe/30.html

  1. The Annotated Transformer (nlp.seas.harvard.edu)
  2. The First Law of Complexodynamics (scottaaronson.blog)
  3. The Unreasonable Effectiveness of RNNs (karpathy.github.io)
  4. Understanding LSTM Networks (colah.github.io)
  5. Recurrent Neural Network Regularization (arxiv.org)
  6. Keeping Neural Networks Simple by Minimizing the Description Length of the Weights (cs.toronto.edu)
  7. Pointer Networks (arxiv.org)
  8. ImageNet Classification with Deep CNNs (proceedings.neurips.cc)
  9. Order Matters: Sequence to sequence for sets (arxiv.org)
  10. GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism (arxiv.org)
  11. Deep Residual Learning for Image Recognition (arxiv.org)
  12. Multi-Scale Context Aggregation by Dilated Convolutions
...

I like this format and framing of "90% of what matters" and someone should try doing it with other subjects.

Amalthea
Might be good to estimate the date of the recommendation - as the interview where Carmack mentioned this was in 2023, a rough guess might be 2021/22?

LessOnline Festival

May 31st to June 2nd, Berkeley CA