I don't know the methodology behind how your statement was drafted (I don't think you even mention the statement directly in this post), but I will say that I think the correct methodology here is not to first come up with a statement and then email it out asking a bunch of researchers whether they'd sign it. That is a recipe for exactly what you're encountering here. The researchers you're reaching out to have a very different perspective on public communication, and on how they'd like to represent their beliefs, than you do in your role as de facto public relations managers for this project.
That is a good thing! We would like our researchers to primarily be thinking on simulacra level 1, and that means they will sign very different statements than what may seem optimal from the media's perspective.
However, as you point out, it can also be a bad thing, and decrease the PR manager's ability to... well... manage.
That is why I believe the solution is to first email the main researchers you'd like to sign the statement and ask what sorts of statements they would be willing to sign: what properties must a statement have for them to be comfortable putting their name below it? Then you create a statement that you know will have at least some base of support among researchers. You should expect a reasonable amount of iteration and compromise here, and not to produce, on your first try, a statement that everyone you want as a signatory will sign.
I will also say that it seems likely this is how the CAIS statement was drafted. They spent (if I remember right) quite a while workshopping it. That statement (to my knowledge) did not just appear out of the aether in its current state. It took work and compromise to get it.
Again, I don't know whether you actually fell for the trap I mention, but it seems likely given the pushback you're getting.
As I understand it, this is how scientific bodies' position statements get written. Scientists do not universally agree about the facts in their field, but they iterate on the statement until none of the signatories have any major objections.
I wouldn't support a "permanent ban" / no such thing as a permanent ban
This calls for allying with the people actually endorsing a permanent ban (who do exist), even if you consider a permanent ban impossible. Many people consider a temporary ban similarly impossible.
It does seem inconvenient to add some sort of "until it's actually safe to proceed" clause if it's a single sentence. There's also the option of framing this as a Pause rather than a ban, though in that case it needs to be made clear that the Pause is not tied to a fixed length of time.
Lucius Bushnaq's "some form of global ban or pause" seems to work for both purposes: the ambiguity between a ban and a pause clarifies both that the pause could be indefinite and that the ban could be temporary.
[Context: This post is aimed at all readers[1] who broadly agree that the current race toward superintelligence is bad, that stopping would be good, and that the technical pathways to a solution are too unpromising and hard to coordinate on to justify going ahead.]
TL;DR: We address objections to a statement supporting a ban on superintelligence, raised by people who agree that such a ban would be desirable.
Quoting Lucius Bushnaq:
I support some form of global ban or pause on AGI/ASI development. I think the current AI R&D regime is completely insane, and if it continues as it is, we will probably create an unaligned superintelligence that kills everyone.
We have been circulating a statement expressing ~this view, targeted at people who have done AI alignment/technical AI x-safety research (mostly outside frontier labs). Some people declined to sign even though they agreed with the expressed view. In this post, we want to address their objections (including those we agree with or think are reasonable).
But first, some context/preamble.
We wish you'd sign a short statement that roughly expresses a view that you share with many other people concerned with AI X-risk.
Why?
The primary reason is to raise awareness among the general population. The fact that many experts believe this (i.e. something akin to what Lucius stated) is a big deal. It would be completely astonishing to most people. The statement is aimed at these "most people". Its overarching objective is to be easy for the general population to read and understand, not to contain all the nuance of every relevant signatory's views.
In particular, there are certain groups/kinds of people whom we would especially want to be cognizant of this fact, such as policymakers and members of the general public who have not gone down the AI rabbit hole. Most people would bounce off text that is too dense/complicated/jargony.
We want to reduce the cost of spreading knowledge of this expert concern amongst the target audience[2] essentially as far as possible (within reasonable ethical boundaries). We want to create a short sentence in the language of (or understandable to, with little inferential distance) the vast majority of people, one that points out this fact in a way that allows the knowledge of it to be transmitted.
Such a concise summary would be extremely useful for communicating legibly and efficiently with other people. Put simply, its existence would contribute to society's making sense of this fact right now.
Once you understand this goal and perspective, please visit aistatement.com and see the structure and function that enable normal people to quickly realize and transmit the knowledge of the astonishing fact that many experts, luminaries, etc., think artificial intelligence poses significant risks of literal extinction (per the CAIS statement, at least on par with the threat of nuclear war).
Importantly, every bit of silence on the issue contributes, on the margin, to reinforcing society's belief that the fact is not true (especially given the various people shouting that AI is mere snake oil and all that matters is NVIDIA’s next quarter stock movement and other such nonsense[3]). Absence of Evidence Is Evidence of Absence, etc.
Even if the statement is only a flawed approximation of your view of the matter, as long as it doesn't say something that you think is false[4], you can give your additional caveats publicly[5]. I (Ishual) would be willing to spend some effort making it easier/more effective for your and other people's caveats to be processed by at least part of the general population's collective mind, so as to give a more accurate view than the legible statement alone can give (to say nothing of the incorrect default view).
However, we want to be extra clear given our previous post: Making an important fact legible is pure bonus points, and the reason to do it has nothing to do with any of the norms mentioned in that post. First, legibility requires cooperation amongst many people, and it is easier to state your exact position with your exact emphasis on your own than to cooperate despite having significant differences from "the centroid of the opinions of the cooperators." Second, it is likely that these bonus points are up for grabs by lots of people who are not laboring within "the belly of the beast" (if I had to guess, more outsiders than insiders would want to grab the bonus points).
Regardless of whether you are already taking a public stance, agreeing to sign a single sentence is a powerful contribution to spreading the awareness that such a sentence is a fair (for some purpose, such as guiding collective action) approximation of the belief of a large group of people. The contribution is not only momentary (one more signature under the statement right now) but also cumulative: the more names under the statement, the more people who agree with it but are hesitant to sign for social reasons will eventually be inclined to sign. Similarly, as the number of signatures grows, new people are attracted to the issue and engage with the topic, some of them eventually becoming signatories of this or a similar statement. Given all of that, it is likely quite an effective use of your attention[6].
Admittedly, a single sentence with perhaps ambiguous words is not a good tool for thinking. However, its purpose is not to (directly) guide anybody's thinking. Rather, its purpose is to draw attention to an important issue, so that people can engage with other sentences on the issue that are better at guiding thinking. Obviously, a single sentence will not replace your actual position; you can (and perhaps should) always express it elsewhere. Indeed, the core of agreement among many perspectives is the point of such statements, and that's why coordinating to make them legible is worth our time. We think there is a place for experts to assess the situation in a decision-relevant way without entering the political ring as much as some people advocating for a position might; in a sense, your "expert power" only applies to a small part of statement space, and it makes sense to focus such a statement around our collective "expert power".
We aren’t qualified to perfectly time a pause, and such a statement isn’t meant to be our opinion on when a pause ought to happen. We can assess a technical situation in a decision-relevant manner in a way that can be leveraged in conversations at any point in the future, and we can (if careful) contribute to sense-making without spending political capital. Sure, the experts don’t have the authority to time a pause, and they will never gain this authority, even if a big catastrophe happens. But they have the power to contribute to sense-making, which seems a necessary part of society metabolising a catastrophe into a net-positive change in the Overton window if and when it comes. Sense-making isn’t something you strategically turn on after a crisis; it is a necessary prerequisite for leveraging a crisis.
Finally, an expert doesn't actually need to believe that sense-making will be effective; an expert doesn't need to have an opinion about this at all (this is, in fact, not where their "expert power" lies). Refusing to speak one's mind because one thinks one would be the only one is a very human tendency, but it is sometimes inappropriate. One doesn't need to believe that most people will agree to get out of the burning house to decide to vote in favor of us getting out of the house[7].
Here is a striking example of how common knowledge dynamics can decide important group decisions, exemplifying both the danger of remaining silent and the power of speaking out: https://thezvi.substack.com/p/ai-moratorium-stripped-from-bbb
With the context/preamble/motivation out of the way, let's get to the actual objections raised to our statement. Some of them have already been answered in this section, but, for the sake of completeness, let's address them all one by one.
First, thank you. Being public is genuinely valuable and much better than remaining silent. However, adding your signature to a collective statement could greatly amplify the impact you're already having, with relatively little additional effort.[8][9]
(I.e., see the beginning of the previous section.)
You can always take a more nuanced public stance and also endorse a one-sentence summary. Alternatively, you could just endorse a one-sentence summary and let it be an improvement over silence, despite it not being literally the best possible thing you could do; it also requires far less effort from you than that best possible thing.
Creating a short statement that captures a large coalition's views inevitably involves trade-offs. It's genuinely difficult to craft language that includes everyone's preferred nuances while remaining brief enough for public communication.
Moreover, in practice, even asking a large group of people to sign something is quite costly,[10] and changing anything about the statement would require confirming, for each signatory of the previous version, that their endorsement carries over to the new version.
This is an extremely common feeling, perhaps universal, among potential signatories. Nearly everyone has their own preferred framing, which is natural given the complexity of the issue and the diversity of perspectives. You can always express your opinion publicly in the way you'd most prefer. But to make an important fact legible, many people must cooperate. You still want the statement to be sufficiently concise and simply stated that a normal person will be able to cache it and recall it when needed.
Your most preferred wording is likely unique to you or shared by only a small group. More generally, there are two objectives at play here that we should think about separately. The first objective is for experts to develop a good understanding of the issue and bring this understanding with them when they consider signing a statement. The second objective is for other people to understand the distribution of understandings among experts and to make some important aspects of it legible to normal people while accounting for all the constraints: understandability by most people, enough context, actual agreement, emphasis on the right things, acceptable ambiguity, memorability, etc.
We think this is a good reason not to sign.[11] However, to achieve the same level of common knowledge that you'd achieve by signing this flawed sentence[12], you may need to literally write a bestselling book, go on many podcasts, and acquire a public profile. This seems hard, to say the least. We hope we can develop better technology for establishing and refining common knowledge. If not, hopefully there are at least enough people who share your disagreement with the sentence that you'd be able to find a less costly way to establish that common knowledge.
We'd encourage reflecting on whether the specific disagreements genuinely outweigh the value of establishing common knowledge on the core point. Let us know what the issue is, in case I figure out a way to make your idea legible in the future.[13]
This is a common desire, and understandably so, as your particular concern likely feels central to you. However, including it would likely make the statement less acceptable to others with different priorities. The legible thing you endorse need not contain the sum of all your important thoughts. Public perception matters. A statement signed by many people carries weight, while a more detailed statement with few signatures (however thoughtful) doesn't create the same common knowledge. See above on the two distinct objectives: crafting and legibilizing a better understanding of an issue on the one hand, and broadcasting the action-relevant intersection of the field's views on the other.
This is a fair concern. Policy advocacy does feel different from technical assessment. However, policymakers consistently report that they need clear signals from experts to justify making difficult decisions[14]. And I don't think inaction is a neutral choice here[15]. Given your role, your public stance may have an outsized impact.
People understand that there is no button politicians can push that would make a ban impossible to overturn. There is no such thing as a permanent ban, and the statement does not contain a reference to permanence. The statement's intent is to put in place a ban at least as hard to overturn as the bans on CFCs, human cloning, or nuclear proliferation. Not impossibly hard.
Consider: Would you require a ban to be more easily reversed than existing precedents like the cloning ban, or is the standard precedent acceptable?
Every ban comes with a default mechanism for lifting it: an actual scientific consensus and grassroots support for unbanning, leading to future people making a case for unbanning that succeeds. If you're concerned we'll never reach sufficient confidence to safely lift a ban, we'd argue that's actually a reason to support a ban. It suggests the problem is too hard for humanity to solve, or at least too hard to be the best path toward a good future. It might even suggest that the difficulty would never be surmounted[16].
Instead, you probably think we will eventually have enough understanding that building ASI would at least benefit its developers/creators/growers, and the worry is that we'd keep it banned somewhat (or far) past the point of actual scientific consensus and grassroots support for unbanning. Examples: nuclear power, the FDA. We agree that the default mechanism comes with added risks of delay, though the magnitude is uncertain.[17]
My (Ishual's) understanding of how this typically goes is that experts signal that a ban is a good idea, and then some political leaders willing to advance legislation make something happen. The more permissive the signaling from the experts, the more likely it is that politicians succeed (this is just one of many factors at play).
It seems to me that the statement as written is compatible with this position (because you are not committing to supporting every conceivable ban). I think that it is good to make some nuance legible, but it would be better said in an accompanying white paper/long video.
This way, you can push for your favorite kind of ban from a position where some ban is seriously considered or actually in force.
You could say that a ban that lasts (much) longer than you'd like (in your expectation) has a mixed effect on the odds of "beneficial ASI," and it isn't super clear that the net effect is to decrease them.[18]
Specifically, the claim is that people in power / relevant decision-makers have a stake in the game and therefore will be unwilling to coordinate on an effective ban or even on an intervention that has a credible effect of decreasing the probability of building ASI too early.
However, one could have made the same argument about a bunch of "too good to pass up" technologies in the past: nuclear weapons, bio/chemical weapons, human cloning, human genome engineering, nuclear energy[19].
As faul_sname brought up recently:
Von Neumann was, at the time, a strong supporter of "preventive war." Confident even during World War II that the Russian spy network had obtained many of the details of the atom bomb design, Von Neumann knew that it was only a matter of time before the Soviet Union became a nuclear power. He predicted that were Russia allowed to build a nuclear arsenal, a war against the U.S. would be inevitable. He therefore recommended that the U.S. launch a nuclear strike at Moscow, destroying its enemy and becoming a dominant world power, so as to avoid a more destructive nuclear war later on. "With the Russians it is not a question of whether but of when," he would say. An oft-quoted remark of his is, "If you say why not bomb them tomorrow, I say why not today? If you say today at 5 o'clock, I say why not one o'clock?"
… and yet, we did not.
Funny but ultimately irrelevant thought experiment warning
Similarly, isn't cloning "too good to pass up" as well? Imagine 1M clones of Von Neumann raised to deeply care about the national interests of the USA and to cooperate with each other. Clearly, there are versions of this that could either have lost control of the world or put the USA (or whoever gets the clones' loyalty) in a position to dominate the rest of the world economically (a lot more than it already does). Indeed, one might call this a form of mild superintelligence, and yet humanity took a hard pass on it (at least for now).
We agree that for an international ban to make sense, it has to be enforced on everyone, including people who keep insisting that superintelligence is just too good to pass up, never mind the risks.
This is a fair preference, and there are better and worse ways to accommodate it. For instance, maybe you'd prefer not to be out there alone. In that case, it would help to specify what conditions would make you comfortable: particular individuals signing alongside you, or a certain threshold number of signatures. If public visibility is a concern, consider conditional commitments (e.g. sign after ≥N peers sign).
This is a fair point. Presenting statements probably requires thinking carefully about how the information is shown. Pretending to be an expert if you are not one would be bad, so if the statement actually aims to signal something, it should aim to signal something legible, e.g., "person so-and-so has held such-and-such a role in this space." You may be underestimating the credentialing value of whatever role you've held.
This is likely true, but our unique strength lies in very strong arguments. We think the kind of fact we suggest coordinating around making legible (namely, "we need to legislatively prevent the development of ASI/AGI/existentially risky sorts of AI") is overdetermined, such that collectively we know of a stronger case for it than any one of us could articulate individually.
We don't think that signing a statement now would diminish our political capital. Some worry about reputational[20] attacks from signing this statement. Do people imagine that there would be an attack on our good name that leverages the signing of this statement? If more (hostile) attention is paid to the statement, we have strong reasons and arguments for holding our position, and we have some eloquent signatories to provide those arguments. We want to push back on the idea that silence preserves capital for later. Inaction has costs: there are well-resourced actors actively working to muddy public understanding of AI risks through repetition, misrepresentation, and flooding the discourse with noise[21][22], all of it aiming to bias society more strongly towards inaction and confusion on the issue of AI X-risk. If you remain silent, the world will get more confused, and it will become harder for the world to react well[23], even if at some point we encounter some "catastrophe" or "crisis".
Moreover, some people who share your view are speaking out already, whether you join them or not. However, if you join them, they will be more likely to pierce through the noise and confusion. Whether people like you choose to speak out is one of the things that determines whether they break through or whether the debate gets anchored by more salient, repeated messages. We are in a stag hunt situation. It might feel safe to save yourself for later, but it isn't. If others speak out without you and fail to break through, the window may close. Waiting for a "better moment" may mean missing the moment entirely. If you are worried about this, there are solutions that let you speak out iff sufficiently many others also would (e.g. a conditional commitment to speak out).
Essentially, I'd want to remind people that others will lie about the situation and say that the Ban Superintelligence statement is a fringe view, even in worlds where the majority of experts with even a small amount of freedom from incentives would endorse something quite similar, unless that majority creates a one-sentence counter that people can use to expose the lie.
Suppose that it is true that the pivotal point of action would occur just after a "catastrophe" or during some "crisis". For example, the Milton Friedman model of policy change says that "most policy change outside a prior Overton Window comes about by policy advocates skillfully exploiting a crisis".
However, effective crisis response requires groundwork. If we haven't built common knowledge beforehand, we should be skeptical that the world will react well even to a clear catastrophe in a domain as unprecedented/strange as AI risk. Also, I (Ishual) note that the CAIS statement is quite useful in conversations even 2 years after it was signed. There is no need to “perfectly time” this.
In short: To skillfully exploit a crisis, one needs to do a lot of prep. One important aspect of such prep is building common knowledge. A simple statement with many important signatories is an effective tool to build common knowledge.
If we missed your actual objection, then please help us make the list more useful by commenting.
If your objections have been addressed, but you still don't feel energized to do something about it, then please consider the following positive vision.
Many of us hope for a better world. I think even in the mundane realms of a world technologically like ours, so much more is possible. Functional institutions are possible[24]. Incremental progress on this is possible. We can, in fact, unilaterally (as a loose set of people who at least occasionally visit Less Wrong) make a part of the system work better (by creating this short sentence in the language of humanity for pointing at an important fact, as a first step). We can then either hope or optimize for other parts of the system to leverage this tool to have better conversations. But regardless of how long it takes for other parts of the system to do their parts, we will have enabled their success. Even if you think that there is a non-negligible chance that a functional room containing 10 people is sufficient to save the world, surely you agree it would be better, and more likely to succeed, if a greater part of the world were functional.
If you want to take this first step towards a better world, my (Ishual's) DMs are open :)
And if somehow you glimpse the greater project I am extremely vaguely gesturing at, and you'd maybe want to take a couple more steps beyond the first, my DMs are also open :D
Researchers who have enough experience with the various problems of making AI safe are our primary audience here, but despite our previous post, we are now addressing all such researchers, whatever they happen to be doing now, and whether or not they have already taken a public stance.
Policymakers and the public won't reconstruct expert sentiment from forum posts. And even if one/some of them did, they wouldn't have a concise summary of the action-relevant intersection of beliefs of a plurality of relevant researchers.
Mechanistically, they mostly just don't think about this fact explicitly, and they have various heuristics in play that result in not engaging with the broader discussion, or in not feeling like they have to do anything outside their autopilot on the issue. If they get concerned, they might want to spend a bit more money on some charismatic expert working on a technical solution. Since there is limited time to cross the gap between most people's intuitions and a sane view of the race toward superintelligence, the sheer implausibility of the true fact that more than a few cranks believe this ends up quite costly. We could (sort of) model this cost (either the time to actually make the fact seem less implausible with some tiny bits of public evidence, or the drag on the whole discussion and on the person having it) as "this person taking your failure to make this fact legible as evidence of the fact not being true."
The sentence "the sky is blue" is not strictly true, but it does stand in for a statement such as "the sky isn't more green than blue." If we lived in a world where many powerful and wealthy forces were trying to prevent the person on the street from understanding that the sky was blue (by claiming it was green), then I'd say the sentence is "true enough," and being silent about it (or just not achieving coordination among experts to say a simple sentence that makes sense to people) is "false enough".
Unless some force is somehow preventing you from doing so.
It is even somewhat competitive with much more costly efforts to make your opinion public, and in expectation definitely better than going on a podcast that 10K people will hear.
If you are so worried about us finding out that no one wants to exit the house (despite you and many others actually secretly wanting to get out), you can just say you don't think we should make the vote public unless the result is non-embarrassing (or you can just vote and not care much about the outcome). Pluralistic ignorance is real: a majority can believe X while mistakenly thinking everyone else believes "not X," leading to collective silence. You don't need to assess this alone; that's precisely what collective action helps reveal. Even if support seems limited now, coordination can change the landscape.
Unless you are really loud and constantly repeating your central position in a way that actually reaches lots of people, so many that your signature on the statement would not have significant additional impact, in which case carry on, you are outside the target audience :)
Moreover, consider a reversal test: if someone compiled a list of people who've expressed support for a ban based on public statements, would you want your name removed? If not, then making that support explicit seems consistent with your actual position.
In terms of time and effort, not necessarily monetary costs.
Likewise, this is a good reason to keep our statement relatively simple.
Let's assume that there is some chance that 100-300 people sign. This would have a very large impact. The impact is not confined to the moment one signs or to the moment when the statement goes public. Once the statement becomes a short conversational move, it will be used in conversations about AI-caused X-risks many times, and these signatures will boost the effectiveness of those conversations. The CAIS statement is likely the single most effective sentence when I (Ishual) speak to people about AI-caused X-risks.
[Made-up numbers warning:] To compare the impact of a single signature to some other intervention, we naively just divide by 100-300 and still get quite a large impact. Even if there isn't much difference in effectiveness between 100 and 300 signatures, there is still plenty of marginal impact from a single signature (if we want the sum of the marginal expected utils to equal the expected utils of the whole), because early signatures also make it easier for others to sign. If we kept insisting, we might naively expect the distribution of outcomes to be "bimodal" between fewer than ~100 and more than ~300 signatures, given the proportion of "agree and will sign" to "agree and won't sign," and naively assuming there are only really about 300 serious experts.
An alternative way to do it would be major public outreach (books, podcasts, etc.), though that's far costlier.
A policymaker once noted to me (Ishual) that the CAIS statement could be read ambiguously, as it didn't clearly signal that experts favor international cooperation to prevent rogue actors from building extinction-causing superintelligence. Indeed, one plausible reading of the CAIS statement is that the experts want more funding to work on their thing, or simply that this is why they do what they do, which is totally gonna mitigate those risks.
It would be strange to "blame" a class of people whose only membership criterion is being considered experts on a topic. Nevertheless, regarding that class of people, we think silence is quite bad, a public-only stance is good, and the effort you put into making important stuff legible is very good (bonus points). Assuming you already take a public stance, you are (merely) leaving lots of extra value on the table if you don't make sufficient efforts to achieve legibility.
either because the problem would remain intractable forever or because we'd be wiser to first achieve a good future some other way and then revisit the problem
It seems like there is a small but extremely passionate group of people who really want this tech even now, when it would be foolish to build it.
Maybe. Maybe bans are super sticky even in futures that get their shit together enough to "solve alignment," in which case my "true objection" would be that, on the margin, you'd have to delay ASI a lot for the delay to be worth even 1% more risk of actual extinction (and of squandering the lightcone).
This list was deliberately made so as to evoke negative, mixed, and positive feelings in a large fraction of the article's intended audience.
Either attacking their own reputation, or that of the whole "safety community".
Or in some cases, that the danger is non-existent.
Or, in some cases, a picture in opposition to reality.
We encounter various reactions that amount to doing nothing. You might not see a catastrophe now, but many people do see e.g. Trump shenanigans as a clear sign that AI will not be handled well. But then they just convince themselves that lying down and dying is the totality of their options.
A humanity that works a lot more for the benefit of humans is possible. Indeed, actually making huge progress here seems much easier than creating, on the first try, a superintelligence that deeply cares for us the way we'd want it to. So much needs to be said about this, and yet it will have to wait for another post.