In no particular order, here's a collection of Twitter screenshots of people attacking AI Safety. A lot of them are poorly reasoned, and some of them are simply ad hominem attacks. Still, these types of tweets are influential and are widely circulated among AI capabilities researchers.






(That one wasn't actually a critique, but it did convey useful information about the state of AI Safety's optics.)

















I originally intended to end this post with a call to action, but we mustn't propose solutions immediately. In lieu of a specific proposal, I ask you: can the optics of AI safety be improved?

10 comments

These... don't seem that bad? I mean, given that they were selected for both (a) criticism and (b) being on Twitter. Like, by Twitter standards, if these examples are the typical case, it seems indicative of unusually good PR for alignment.

It’s reassuring to see we’re somewhere between the “then they laugh at you” and “then they fight you” stage. I thought we were still mostly at “first they ignore you.”

I agree with many of those tweets. Many of them had actual good points.

I just learned I'm a based effective accelerationist.

I'm not sure it's productive to engage with this stuff. Taking a GRAND STAND may feel good, but in many cases people end up becoming useful foils. Block liberally, don't engage, focus on what actually matters. 

I'm not necessarily advocating for direct engagement! If engagement with this stuff won't decrease AI risk, then I don't want to engage. If it does, then I do. Some of these people/orgs are influential (Venkatesh Rao, HuggingFace), so unfortunately, their opinions do actually matter. As nice as it would feel to ignore the haters, public opinion is in fact a strategic asset when it comes to actually implementing AI safety proposals at major labs.

"Some of these people/orgs are influential (Venkatesh Rao, HuggingFace), so unfortunately, their opinions do actually matter."

Do you have any evidence that Venkatesh Rao is influential? I've never seen him quoted by anyone outside the rationality community.

I would expect you to be able to find these tweets, and hundreds more like them, no matter how good alignment optics are. A lot of people use Twitter, and I could probably find similar tweets about Mother Teresa or Princess Diana. As such, showing this doesn't actually tell us all that much TBH.

Practice rationalism on this. What predictions do you make, and what conditional predictions do you make on whatever actions you're advocating? It feels a little like you're getting sucked into a status game by caring very much about who's saying what, rather than steelmanning the critiques and deciding if members of the EA community (disclosure: I am not one. I'm not part of sneerclub, but I do see the cult-like aspects of the Bay Area subculture) should do anything differently. As in, should you behave differently, separately from whether you should participate in the signaling and public conversations around this kind of thing for status purposes?

Note also that the criticism is not purely wrong. "Revealed preferences say a lot" is a pretty compelling point.

I wonder what I expected to get out of this post - after all, I already don't use Twitter.