Navigating public AI x-risk hype while pursuing technical solutions

Dan Braun

Public attention on AI x-risk has skyrocketed. I don’t expect this to wane anytime soon. This has a few potentially negative implications for those of us pursuing technical solutions to the problem:

Research Direction Incentives

Since those in the field will no doubt read more x-risk related media and talk more about it with non-technical people (e.g. family, friends), we are more likely to be swayed into particular research directions that may not be as important. Some mechanisms for this:

There may be natural incentives to work on nearer-term AI harms when those harms may be more likely to occur to people you care about. E.g. "how can I prevent the newly deployed AI system from manipulating my friends?"
“Public person X that I respect thinks that Y is a really important problem, I will spend effort working to address Y to impress them and make a name for myself.”

Added Pressure

It’s natural for one’s internal mental model to shift from “this might be something that wipes out everyone sometime in the future” to “this might be something that wipes out everyone, including all of the identifiable people that are starting to get very concerned and talking about the risk directly”. That is, the potential suffering is becoming much more salient.
Working on technical alignment will shift (has shifted?) from being something that people don’t understand, or think you’re weird for working on, to something that attracts a lot of external interest and expectation.

While some may thrive under this added pressure, for many it may become an extra mental burden that can hurt productivity and mental health.

Other Negative Implications

The raw time and energy that AI media cycles (which will be naturally interesting to those in the field) take away from reading and focusing on technical content.
Typical incentive issues that come when a field grows and there is more status and money on the line.

Positive Implications

That’s not to say that someone working on technical solutions should shut out mainstream media completely. There are some upsides:

There may be insights into the real problems that are gleaned from mainstream discourse.
It is important to know how your proposed technical solutions will interact with reality if successful. E.g. if I’m working on something that might solve a good chunk of the problem but has zero chance of being implemented by those building/managing AGI, then this direction should be downweighted. That said, I would be very careful not to underestimate how quickly the sociopolitical environment could change in the near future.

Conclusion

What I’m advocating for is merely to be extra mindful of how the increase in public attention on AI x-risk manifests in your life, and to adjust accordingly.

For what it's worth, I plan to make the following adjustments:

Read less mainstream news about AI, e.g. by not reading as many articles that friends/colleagues link. (I've managed to avoid using Twitter for years, and I plan to continue avoiding it.)
Customise my LessWrong newsfeed, and generally swap out LessWrong time for Alignment Forum time. Prior to ChatGPT, I felt the balance of content on LessWrong was perfect for me without any tag customisation. This balance has begun to tip more towards community and coordination pieces, and may tip further in that direction in the future. I don’t think this is necessarily a bad thing; I think coordination is a crucial element of reducing x-risk. It’s just not an area that I think should currently consume a lot of my time.^[1]
Ensure that I unplug from AI-risk spaces regularly. This will be more difficult when a larger proportion of the people I talk to want to talk about AI risk, but I can at least be mindful of when it’s happening and perhaps steer conversations elsewhere.

^[1]
I actually never realised that you could upweight how likely posts are to appear on “Latest posts” based on tags. This is a really nice feature, props to LW team. Still, I do have an inclination to read the title of every post and then read the posts that I’m immediately attracted to (not necessarily the ones that I’d endorse reading after careful reflection).
