Foresight for AGI Safety Strategy: Mitigating Risks and Identifying Golden Opportunities

[-]Marius Hobbhahn3y6-1

I'm not sure I actually understand the distinction between forecasting and foresight. For me, most of the problems you describe sound either like forecasting questions or AI strategy questions that rely on some forecast.

Your two arguments for why foresight is different from forecasting are:
a) some people think forecasting means only long-term predictions and some people think it means only short-term predictions.
My understanding of forecasting is that it is not time-dependent, e.g. I can make forecasts about an hour from now or about a million years from now. This is also how I perceive the EA/AGI-risk community to use that term.

b) foresight looks at the cone of outcomes, not just one.
My understanding of forecasting is that you would optimally want to predict a distribution of outcomes, i.e. the cone but weighted with probabilities. This seems strictly better than predicting the cone without probabilities since probabilities allow you to prioritize between scenarios.
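To make the prioritization point concrete, here is a minimal sketch in Python. The scenario names, probabilities, and impact scores are all made-up placeholders, and "probability times impact" is only one assumed way to prioritize; the comment itself does not specify a scoring rule.

```python
# Hypothetical scenarios: (name, probability, impact if it happens).
# All values are illustrative placeholders, not forecasts.
scenarios = [
    ("scenario A", 0.60, 5.0),
    ("scenario B", 0.30, 4.0),
    ("scenario C", 0.10, 20.0),
]


def check_distribution(scenarios, tol=1e-9):
    """Raise if the scenario probabilities do not sum to 1."""
    total = sum(p for _, p, _ in scenarios)
    if abs(total - 1.0) > tol:
        raise ValueError(f"probabilities sum to {total}, expected 1.0")


def rank_by_impact(scenarios):
    """The unweighted cone: order scenarios by impact alone."""
    return sorted(scenarios, key=lambda s: s[2], reverse=True)


def rank_by_expected_impact(scenarios):
    """The weighted cone: order scenarios by probability * impact."""
    return sorted(scenarios, key=lambda s: s[1] * s[2], reverse=True)


check_distribution(scenarios)
print([name for name, _, _ in rank_by_impact(scenarios)])
print([name for name, _, _ in rank_by_expected_impact(scenarios)])
```

With these placeholder numbers the two rankings disagree: impact alone puts scenario C first, while the probability-weighted ranking puts scenario A first. That difference is the extra information the probabilities add.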

I understand some of the problems you describe, e.g. that people might be missing parts of the distribution when they make predictions and they should spread them wider but I think you can describe these problems entirely within the forecasting language and there is no need to introduce a new term.

LMK if I understood the article and your concerns correctly :)

[-]jacquesthibs3y71

So, my difficulty is that my experience in government and my experience in EA-adjacent spaces have totally confused my understanding of the jargon. I'll try to clarify:

In the context of my government experience, forecasting is explicitly trying to predict what will happen based on past data. It does not fully account for fundamental assumptions that might break due to advances in a field, changes in geopolitics, etc. Forecasts are typically used to inform one decision. It does not focus on being robust across potential futures or try to identify opportunities we can take to change the future.
In EA / AGI Risk, it seems that people are using "forecasting" to mean something somewhat like foresight, but not really? Like, if you go on Metaculus, they are making long-term forecasts in a superforecaster mindset, but are perhaps expecting their long-term forecasts to be as good as their short-term forecasts. I don't mean to sound harsh; what they are doing is useful and can still feed into a robust plan for different scenarios. However, I'd say what is mentioned in reports does sometimes lean a bit more into (what I'd consider) foresight territory.
My hope: instead of only using "forecasts/foresight" to figure out when AGI will happen, we use it to identify risks for the community, potential yellow/red light signals, and golden opportunities where we can effectively implement policies/regulations. In my opinion, using a "strategic foresight" approach enables us to be a lot more prepared for different scenarios (and might even have identified a risk like SBF much sooner).

My understanding of forecasting is that you would optimally want to predict a distribution of outcomes, i.e. the cone but weighted with probabilities. This seems strictly better than predicting the cone without probabilities since probabilities allow you to prioritize between scenarios.

Yes, in the end, we still need to prioritize based on the plausibility of a scenario.

I understand some of the problems you describe, e.g. that people might be missing parts of the distribution when they make predictions and they should spread them wider but I think you can describe these problems entirely within the forecasting language and there is no need to introduce a new term.

Yeah, I care much less about the term/jargon than the approach. In other words, what I'm hoping to see more of is to come up with a set of scenarios and forecasting across the cone of plausibility (weighted by probability, impact, etc) so that we can create a robust plan and identify opportunities that improve our odds of success.

[-]Marius Hobbhahn3y20

thanks for clarifying

[-]Jelle Donders2y30

Thanks for this post! I've been thinking a lot about AI governance strategies and their robustness/tractability lately, much of which feels like a close match to what you've written here.

For many AI governance strategies, I think we are more clueless than many seem to assume about whether a strategy ends up positively shaping the development of AI or backfiring in some foreseen or unforeseen way. There are many crucial considerations for AI governance strategies; miss one or get one wrong and the whole strategy can fall apart or become actively counterproductive. What I've been trying to do is:

Draft a list of trajectories for how the development and governance of AI could unfold up until we get to TAI, estimating the likelihood and associated xrisk from AI for each trajectory.
- e.g. "There ends up being no meaningful international agreements or harsh regulation and labs race each other until TAI. Probability of trajectory: 10%, Xrisk from AI for scenario: 20%."
Draft a list of AI governance strategies that can be pursued.
- e.g. "push for slowing down frontier AI development by licensing the development of large models above a compute threshold and putting significant regulatory burden on them".
For each combination of trajectory and strategy, assess whether we are clueless about what the sign of the impact of said strategy would be, or whether the strategy would be robustly good (~predictably lowers xrisk from AI in expectation), at least for this trajectory. A third option would of course be robustly bad.
- e.g. "Clueless, it's not clear which consideration should have more weight, and backfiring could be as bad as success is good.
  - + This strategy would make this trajectory less likely and possibly shift it to a trajectory with lower xrisk from AI.
  - - Getting proper international agreement seems unlikely for this pessimistic trajectory. Partial regulation could disproportionately slow down good actors, or lead to open source proliferation and increase misuse risk."
Try to identify strategies that are robust across a wide array of trajectories.
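The steps above can be sketched as a small trajectory-by-strategy table. This is a rough illustration under stated assumptions, not the commenter's actual method: only the first trajectory's numbers (10% probability, 20% xrisk) and the licensing strategy come from the examples above. The other trajectories, the second strategy, the verdicts, and the `min_coverage` threshold are hypothetical placeholders.

```python
from dataclasses import dataclass

ROBUSTLY_GOOD = "robustly good"
CLUELESS = "clueless"
ROBUSTLY_BAD = "robustly bad"


@dataclass(frozen=True)
class Trajectory:
    name: str
    probability: float  # estimated likelihood of this trajectory
    xrisk: float        # estimated xrisk from AI, conditional on it


def baseline_xrisk(trajectories):
    """Probability-weighted xrisk across all listed trajectories."""
    return sum(t.probability * t.xrisk for t in trajectories)


def good_coverage(strategy, trajectories, verdicts):
    """Probability mass of trajectories where the strategy is robustly good."""
    return sum(
        t.probability
        for t in trajectories
        if verdicts.get((strategy, t.name), CLUELESS) == ROBUSTLY_GOOD
    )


def robust_strategies(strategies, trajectories, verdicts, min_coverage=0.6):
    """Strategies that are never robustly bad and are robustly good on
    trajectories holding at least `min_coverage` of the probability mass.
    Unassessed combinations default to "clueless"."""
    selected = []
    for s in strategies:
        labels = [verdicts.get((s, t.name), CLUELESS) for t in trajectories]
        if ROBUSTLY_BAD in labels:
            continue
        if good_coverage(s, trajectories, verdicts) >= min_coverage:
            selected.append(s)
    return selected


# Placeholder inputs. Only the first trajectory's numbers come from the
# example above; everything else is made up for illustration.
trajectories = [
    Trajectory("no agreements, labs race", 0.10, 0.20),
    Trajectory("hypothetical trajectory B", 0.50, 0.10),
    Trajectory("hypothetical trajectory C", 0.40, 0.05),
]
strategies = ["compute-threshold licensing", "hypothetical strategy Y"]
verdicts = {
    ("compute-threshold licensing", "no agreements, labs race"): CLUELESS,
    ("compute-threshold licensing", "hypothetical trajectory B"): ROBUSTLY_GOOD,
    ("compute-threshold licensing", "hypothetical trajectory C"): ROBUSTLY_GOOD,
    ("hypothetical strategy Y", "no agreements, labs race"): ROBUSTLY_GOOD,
    ("hypothetical strategy Y", "hypothetical trajectory B"): ROBUSTLY_BAD,
    ("hypothetical strategy Y", "hypothetical trajectory C"): CLUELESS,
}

print(baseline_xrisk(trajectories))
print(robust_strategies(strategies, trajectories, verdicts))
```

The selection rule here is deliberately conservative: a single "robustly bad" verdict disqualifies a strategy regardless of how unlikely that trajectory is. A softer variant could weight bad verdicts by trajectory probability instead.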

I'm just winging it without much background in how such foresight-related work is normally done, so any thoughts or feedback on how to approach this kind of investigation, or what existing foresight frameworks you think would be particularly helpful here are very much appreciated!

[-]jacquesthibs2y20

Any thoughts or feedback on how to approach this kind of investigation, or what existing foresight frameworks you think would be particularly helpful here are very much appreciated!

As I mentioned in the post, I think the Canadian and Singaporean governments are the two best in this space, to my knowledge.

Fortunately, some organizations have created rigorous foresight methods. The top contenders I came across were Policy Horizons Canada within the Canadian Federal Government and the Centre for Strategic Futures within the Singaporean Government.

As part of this kind of work, you want to be doing scenario planning multiple levels down. How does AI interact with VR? Once you have that, how does it interact with security and defence? How does this impact offensive work? What are the geopolitical factors that work their way in? Does public sentiment about job loss impact the development of these technologies in some specific ways? For example, you might see more powerful pushback from more distinguished, intelligent, heavily regulated industries with strong union support.

Aside from that, you might want to reach out to the Foresight Institute, though I'm a bit more skeptical that their methodology will help here (though I'm less familiar with it and like the organizers overall).

I also think that looking at the Malicious AI Report from a few years ago for some inspiration would be helpful, particularly because they held a workshop with people of different backgrounds. There might be some better, more recent work I'm unaware of.

Additionally, I'd like to believe that this post was a precursor to Vitalik's post on d/acc (defensive accelerationism), so I'd encourage you to look at that.

Another thing to look into is companies in the cybersecurity space. I think we'll be getting more AI-safety-pilled orgs in this area soon. Lekara is an example of this: I met two employees, and they essentially told me that the vision is to embed themselves into companies and then continue to figure out how to make AI safer and the world more robust once they are in that position.

There are also more organizations popping up, like the Center for AI Policy, and my understanding is that Cate Hall is starting an org that focuses on sensemaking (and grantmaking) for AI Safety.

If you or anyone is interested in continuing this kind of work, send me a DM. I'd be happy to help provide guidance in the best way I can.

Lastly, I will note that I think people have generally avoided this kind of work because "if you have a misaligned AGI, well, you are dead no matter how robust you make the world or whatever you plan around it." I think this view is misguided, and I think you can potentially make our situation a lot better by doing this kind of work. I think recent discussions on AI Control (rather than Alignment) are useful in questioning previous assumptions.

[-]Darren McKee3y32

Great post! I definitely think that the use of strategic foresight is one of the many tools we should be applying to the problem.
