This is a linkpost for https://scottaaronson.blog/?p=6524

Before June 2022 was the month of the possible start of the Second American Civil War, it was the month of a lively debate between Scott Alexander and Gary Marcus about the scaling of large language models, such as GPT-3.  Will GPT-n be able to do all the intellectual work that humans do, in the limit of large n?  If so, should we be impressed?  Terrified?  Should we dismiss these language models as mere “stochastic parrots”?

I was privileged to be part of various email exchanges about those same questions with Steven Pinker, Ernest Davis, Gary Marcus, Douglas Hofstadter, and Scott Alexander.  It’s fair to say that, overall, Pinker, Davis, Marcus, and Hofstadter were more impressed by GPT-3’s blunders, while we Scotts were more impressed by its abilities.  (On the other hand, Hofstadter, more so than Pinker, Davis, or Marcus, said that he’s terrified about how powerful GPT-like systems will become in the future.)

Anyway, at some point Pinker produced an essay setting out his thoughts, and asked whether “either of the Scotts” wanted to share it on our blogs.  Knowing an intellectual scoop when I see one, I answered that I’d be honored to host Steve’s essay—along with my response, along with Steve’s response to that.  To my delight, Steve immediately agreed.  Enjoy!  –SA


I'm a huge fan of Pinker. How The Mind Works and The Language Instinct are two of my all-time favorite books. So I'm surprised and saddened to see him engaging in this debate for years without showing a familiarity with many of the core AI concepts, such as instrumental convergence and corrigibility.

I love his books too. It's a real shame.

"...such as imagining that an intelligent tool will develop an alpha-male lust for domination."

It seems like he really hasn't understood the argument the other side is making here.

It's possible he simply hasn't read about instrumental convergence and the orthogonality thesis. What high-quality, widely shared introductory resources do we have on those, after all? There's Robert Miles, but you could easily miss him.

In the FLI podcast debate, Stuart Russell outlined things like instrumental convergence and corrigibility, though they took a backseat to his own standard/nonstandard model approach. He challenged Pinker to publish, in a journal, his reasons for not being compelled to panic, but warned him that many people would emerge to tinker with and poke holes in his models.

The main thing I remember from that debate is that Pinker thinks the AI xrisk community is needlessly projecting "will to power" (in the Nietzschean sense) onto software artifacts.

"core AI concepts, such as instrumental convergence and corrigibility."

Are those concepts core to AI in general or just to the LessWrong+ version of AI?


They're core to the agent model of AI, with coherent preferences. Once you get coherent preferences you get utility maximization, which gets you instrumental convergence, incorrigibility, self-preservation and so on.

Questions like:

  • how likely is it that we build agent AI, either explicitly or indirectly?
  • how likely is it that our agent AI will have coherent preferences?

are still open, and different researchers will have different opinions.

I get that. But there are lots of AI researchers who know little or nothing of discussions here. What's the likelihood that they know or care about things like instrumental convergence and corrigibility?


If you're studying state-of-the-art AI you don't need to know any of these topics.

Phrases like "AI safety" and "AI ethics" probably conjure up ideas closer to machine learning models with socially biased behavior, stock trading bot fiascos, and such. The Yudkowskian paradigm only applies to human-level AGI and above, which few researchers are pursuing explicitly.

Can you quote something specific where he makes a mistake?


Is there a longform discussion anywhere of the pros and cons of nuking Russia after WW2 and establishing a world government?

The correct answer isn't as obvious to me as it seems to be to people in this discussion.

Also this seems like a central clash of opinion that will resurface in the AI race, or any attempts today to reduce nuclear risk.
