
posted on 2023-12-12 — also cross-posted on lesswrong, see there for comments

Some biases and selection effects in AI risk discourse

These are some selection effects that shape which ideas people get exposed to and what they end up believing, in ways that make the overall epistemics worse. I've mostly noticed them in AI discourse (alignment research, governance, etc.), mostly on LessWrong. (They might not be exclusive to discourse on AI risk.)

(EDIT: I've reordered the sections in this post so that fewer people get stuck on what was the first section and so they have a better chance of reading the other two sections.)

Outside-view is overrated

In AI discourse, outside-view (basing one's opinion on other people's opinions and on things that seem like precedents), as opposed to inside-view (having an actual gears-level understanding of how things work), is quite overrated, for a variety of reasons.

Arguments about P(doom) are filtered for nonhazardousness

Some of the best arguments for high P(doom) / short timelines that someone could make would look like this:

It's not that hard to build an AI that kills everyone: you just need to solve [some problems] and combine the solutions. Considering how easy it is compared to what you thought, you should increase your P(doom) / shorten your timelines.

But obviously, if people had arguments of this shape, they wouldn't mention them, because such arguments make it easier for someone to build an AI that kills everyone. This is great! Carefulness about exfohazards is better than the alternative here.

But people who strongly rely on outside-view for their P(doom) / timelines should be aware that the arguments they're exposed to are being filtered for nonhazardousness. Note that this plausibly applies to topics other than P(doom) / timelines.
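To make the filtering effect concrete, here's a minimal toy simulation (with made-up numbers and a made-up model of "arguments as log-odds updates"; nothing here is an estimate of actual P(doom)). It just shows that if the arguments pushing P(doom) up the most are systematically withheld, someone aggregating only the published arguments ends up biased downward.

```python
# Toy illustration of the filtering effect (made-up numbers, not real estimates).
# Model each argument as a log-odds update on P(doom); arguments that would push
# P(doom) up by more than some threshold are treated as exfohazards and withheld.
import math
import random

random.seed(0)

N_ARGS = 100_000
HAZARD_THRESHOLD = 1.0  # updates above this are never published (assumption)

all_updates = [random.gauss(0.0, 1.0) for _ in range(N_ARGS)]
published = [u for u in all_updates if u <= HAZARD_THRESHOLD]

def mean(xs):
    return sum(xs) / len(xs)

def posterior(prior_logodds, update):
    # convert accumulated log-odds back into a probability
    return 1.0 / (1.0 + math.exp(-(prior_logodds + update)))

prior_logodds = 0.0  # prior P(doom) = 0.5, purely for illustration

print(f"mean update over all arguments:       {mean(all_updates):+.3f}")
print(f"mean update over published arguments: {mean(published):+.3f}")
print(f"P(doom) after the average published update: "
      f"{posterior(prior_logodds, mean(published)):.3f}")
```

The point isn't the specific numbers; it's that an outside-view reader who averages over what gets said, without modeling what doesn't get said, systematically lands below where the full set of arguments would put them.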

Note that beyond not being mentioned, such arguments are also anthropically filtered against: in worlds where such arguments have been out there for longer, we died a lot quicker, so we're not around to observe those arguments having been made.
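Here's a similarly toy sketch of that anthropic filter (again, entirely made-up probabilities): worlds where a hazardous argument got published early are more likely to already be dead, so surviving observers find the argument absent from their corpus more often than its base rate would suggest.

```python
# Toy survivorship simulation (made-up probabilities, purely illustrative).
import random

random.seed(0)

N_WORLDS = 100_000
P_PUBLISHED_EARLY = 0.5    # chance the hazardous argument is published early (assumption)
P_DOOM_IF_PUBLISHED = 0.9  # doom is likelier once the recipe-shaped argument is out (assumption)
P_DOOM_OTHERWISE = 0.3

survivors_with_argument = 0
survivors_without_argument = 0

for _ in range(N_WORLDS):
    published = random.random() < P_PUBLISHED_EARLY
    p_doom = P_DOOM_IF_PUBLISHED if published else P_DOOM_OTHERWISE
    survived = random.random() >= p_doom  # only surviving worlds contain observers
    if survived:
        if published:
            survivors_with_argument += 1
        else:
            survivors_without_argument += 1

total = survivors_with_argument + survivors_without_argument
print(f"base rate of the argument being published:    {P_PUBLISHED_EARLY:.3f}")
print(f"fraction of surviving worlds that contain it: {survivors_with_argument / total:.3f}")
# Surviving observers see the argument far less often than its base rate;
# its absence is partly a survivorship artifact, not evidence that building
# such an AI is hard.
```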

Confusion about the problem often leads to useless research

People enter AI risk discourse with various confusions, such as:

Those questions about the problem do not particularly need fancy research to be resolved; they're either already solved or there's a good reason why thinking about them is not useful to the solution. For these examples:

These answers (or reasons-why-answering-is-not-useful) usually make sense if you're familiar with rationality and alignment, but some people are still missing a lot of those basics. By repeatedly voicing these confusions, they cause others to think that the confusions are relevant and should be researched, causing lots of wasted time.

It should also be noted that some things are correct to be confused about. If you're researching a correlation or concept-generalization which doesn't actually exist in the territory, you're bound to get pretty confused! If you notice you're confused, ask yourself whether the question is even coherent/true, and ask yourself whether figuring it out helps save the world.

