Imagine the set of decisions which impact TAI outcomes.
Some of the decisions are more important than others: they have a larger impact on TAI outcomes. We care more about a decision the more important it is.
Some of the decisions are also easier to influence than others, and we care more about a decision the easier it is to influence.[1]
Decisions in the past are un-influenceable
Probably some decisions in the future are too, but it may be hard to identify which ones
How easy it is to influence a decision depends on who you are
People have access to different sets of people and resources
Influence can be more or less direct (e.g. how many steps removed you are from the US President)
The influenceable decisions are distributed over time, but we don’t know the distribution:
Decisions might happen earlier or later (timing/timelines)
There might be more or fewer decisions in the set
Decisions might be more or less concentrated over time (urgency/duration)
There might be discontinuities (e.g. only one decision matters, or the distribution is suddenly cut off on the right because takeover has happened and there are no more decisions…)
It’s possible that in the future there might be a particularly concentrated period of important decisions, for example like this:
People refer to this concentrated period of important decisions as ‘crunch time’. The distribution might or might not end up actually containing a concentrated period of important decisions - or in other words, crunch time may or may not happen.
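One way to make this picture concrete is a toy model. The sketch below is purely illustrative: the `Decision` class, the choice to weight each decision by importance times influenceability, and the `densest_window` helper are all assumptions introduced here, not something the post specifies. It asks what share of the total weight falls inside the single densest time window of a given width. A high share corresponds to a distribution containing something like a crunch time, and a low share to a spread-out one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    time: float              # when the decision happens (e.g. years from now)
    importance: float        # size of impact on TAI outcomes (arbitrary units)
    influenceability: float  # 0.0 (un-influenceable) to 1.0 (fully influenceable)

    @property
    def weight(self) -> float:
        # Assumption: how much we care is importance times influenceability.
        return self.importance * self.influenceability


def densest_window(decisions, width):
    """Return (start, share) for the window [start, start + width) that
    holds the largest share of total weight. Returns (None, 0.0) if there
    is no weight at all."""
    total = sum(d.weight for d in decisions)
    if total == 0:
        return None, 0.0
    best_start, best_share = None, 0.0
    # Some best window always starts at a decision time, so only those are tried.
    for start in sorted({d.time for d in decisions}):
        inside = sum(d.weight for d in decisions
                     if start <= d.time < start + width)
        share = inside / total
        if share > best_share:
            best_start, best_share = start, share
    return best_start, best_share


# Hypothetical data: ten evenly spread decisions, then the same ten plus
# five much more important decisions bunched together around year 6.
spread = [Decision(float(t), 1.0, 0.5) for t in range(10)]
crunch = spread + [Decision(6.0 + 0.1 * i, 5.0, 0.5) for i in range(5)]

spread_start, spread_share = densest_window(spread, width=1.0)
crunch_start, crunch_share = densest_window(crunch, width=1.0)
print(f"spread: densest year holds {spread_share:.0%} of total weight")
print(f"crunch: densest year starts at {crunch_start} "
      f"and holds {crunch_share:.0%} of total weight")
```

In the evenly spread case no single year holds much of the weight, while in the bunched case one year holds most of it. The numbers themselves are made up; the point is only that "is there a crunch time?" becomes a question about the shape of this distribution.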
Where the distribution does contain a concentrated period of important decisions (or at least is sufficiently likely to in expectation), crunch time might be a useful concept to have:
It’s a flag to switch modes (e.g. aiming for direct effects rather than getting ourselves into a better position in future)[2]
It’s a way of compressing lots of information about the world (e.g. these 15 things have happened with various levels of confidence, which makes us think that these 50 things will happen soon with various probabilities)
It’s a concept which can help to coordinate lots of people, which in some crunch time scenarios could be very important
But we don’t know what the underlying distribution of decisions is. Here are just some of the ways the distribution might look:
The fact that the underlying distribution is uncertain means that there are several ways in which crunch time might be an unhelpful concept:
Different sorts of concentrated periods of important decisions might differ so much that it’s better to think about them separately.
A crunch time lasting a day seems pretty different to one lasting 5 years
A crunch time where one actor makes one decision seems pretty different to one where many thousands of actors make many thousands of decisions
The eventual distribution may not contain a single concentrated period of important influenceable decisions.
Maybe the most important decisions are in the past
Maybe there will be a gradual decline in the importance of influenceable decisions as humanity becomes increasingly disempowered
Maybe the distribution is lumpy, and there will be many discrete concentrated periods of important decisions
Distributions in different domains, or even for different individuals, might diverge sufficiently that there’s no meaningful single period of important decisions
It seems unlikely that distributions diverge wildly at the individual level, but I think it’s possible that there might be reasonably different distributions for people who work in government versus in AI labs, for example.
It might still be useful to have crunch time as a concept for a particularly concentrated period of important decisions, but only apply it in specific domains: crunch time for DC AI governance people, crunch time for people working on AI safety agenda X…
We may remain highly uncertain about the distribution of important decisions throughout the relevant period.
Worlds where your P(crunch time is happening now) jumps discretely from 5% to 65% seem pretty different from worlds where your P(crunch time is happening now) hovers at 48% indefinitely
If nobody knows with sufficient confidence that a concentrated period of important decisions is happening, it doesn’t really matter whether there is one
Some reasons to think we might remain highly uncertain:
Poor access to information flows (e.g. key information is private to an AI lab)
Epistemic distortions from incentives, crazy things AI is doing…
Not being clear ahead of time on what concrete signs might indicate that we’re entering a concentrated period of important decisions
People’s beliefs about the distribution of important decisions may continue to vary widely.
People have access to different information
A crunch time where only one lab knows that it’s crunch time seems pretty different to one where 5% of the educated public knows or one where the vast majority of the globe knows
Presumably people with very short timelines believe that crunch time is happening now, or has already happened
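To put rough numbers on the earlier contrast between a belief that jumps from 5% to 65% and one that hovers at 48%, here is a small sketch using Bayes' rule in odds form. The function names are made up for illustration. Treating "crunch time is happening now" as a single binary hypothesis updated on one piece of evidence is a simplifying assumption, not something the post commits to.

```python
def to_odds(p):
    """Convert a probability strictly between 0 and 1 into odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return p / (1.0 - p)


def update(prior, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.

    likelihood_ratio is P(evidence | crunch time) / P(evidence | no crunch time).
    """
    posterior_odds = to_odds(prior) * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def required_likelihood_ratio(prior, target):
    """How strong must the evidence be to move a belief from prior to target?"""
    return to_odds(target) / to_odds(prior)


# The two worlds from the text: a discrete jump versus an unchanged belief.
jump_lr = required_likelihood_ratio(0.05, 0.65)
print(f"likelihood ratio needed for 5% -> 65%: {jump_lr:.1f}")
print(f"48% after uninformative evidence (ratio 1.0): {update(0.48, 1.0):.2f}")
```

Under these assumptions, the jump from 5% to 65% needs evidence roughly 35 times more likely if crunch time is happening than if it is not. Evidence with a likelihood ratio near 1 leaves a 48% belief where it was. This is one way to see why unclear signs and poor information flows could keep people uncertain throughout.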
What does all of this imply?
I think it’s helpful to imagine the distribution of decisions which impact TAI outcomes, and think about crunch time in that wider context. This helps to keep in frame that a) crunch time may not happen in any sense, and b) there are lots of different forms that crunch time could take.
Some other thoughts:
It would be valuable to reduce uncertainty about the distribution of important decisions which impact TAI
There’s a way in which a lot of work on AI timelines and take-off speeds is aiming to do this
I haven’t seen any work on what concrete signs might indicate that we’re entering a concentrated period of important decisions, either generally or in a given domain, and would be excited for this work to happen
Some of the things it might be important to do in preparation for a concentrated period of important decisions are pretty transferable to other distributions of decisions
I’m excited about scenario planning, and about actors having plans in drawers for what they will do if various high impact things happen. Some of these scenarios might end up playing out during ‘crunch time’, or they might happen along the course of some other trajectory
Some of the things Eli suggests here seem pretty good with or without a concentrated period of important decisions, like cultivating virtue, improving your productivity, picking up new tools which are relevant to your work…
Thanks to Michael Aird, Adam Bales, Daniel Kokotajlo, Jan Kulveit, Chana Messinger and Nicole Ross for variously helping me to think about this.
[1] Another thing which matters about the decisions is how predictable they are. I haven’t gone into that in this post, so predictability only features here at all to the extent that it’s hard to influence unpredictable things.
In this post you avoided giving any concrete examples, but I wanted to brainstorm what some of the major decisions are.
The decision to run or deploy a particular model, on a particular day, in a particular way. (e.g. OpenAI released ChatGPT to the public on Nov 30, 2022.) This decision might be made by a single engineer, a team of engineers, or management.
The decision to pursue, or not pursue, a particular technology, idea or method. This is a decision made by engineers, researchers, management and grant-makers.
The decision to reveal, or not reveal, certain information to the public. The information could be source code, model weights, a whitepaper, or even the existence of a particular capability. This is a decision made by engineers, researchers and management.
The choice of a particular reward function or loss function, if that is part of the model. This is a decision made by engineers and researchers.
The choice of particular hardware, such as CPUs vs GPUs vs TPUs vs future neuromorphic hardware. Depending on your perspective, this is a decision made by researchers, by chip companies like NVIDIA and TSMC, by market forces (gaming & crypto), or by mother nature (some technologies are just more practical than others).
The tastes of the public & the market. For example, the public has responded strongly to AI art and chatbots in the last year, but in years past the public was not impressed enough by either technology to use them on a daily basis or consider them impactful. This is a kind of collective decision we all make, and it impacts how management makes choices. For another example, if the public strongly wanted ChatGPT to be completely uncensored and offensive, OpenAI would have made different choices when building their RLHF system.
The setting of laws and regulations related to AI. This is a decision made by politicians, clerks, lobbyists and activists.
The setting of business policy on what the AI is "allowed" to do. (eg ChatGPT will refuse to engage on certain topics, although this is highly hackable.) This is a decision made by management, under pressure from the public, politicians, activists, investors etc.
The setting of business policy related to who is allowed to access the AI (eg the general public) and what they are allowed to do with it.
The choice of reaction in the event that an AI is behaving badly - ie do they intervene, modify the model, shut it down, etc. This is a decision made by engineers and management, but their choices will likely be highly influenced by the alignment community, especially if there are prepared plans.
The decision to prepare a plan. Someone, perhaps from this community, might make plans for certain circumstances, such as an AI that is clearly behaving badly. These plans might be helpful in an emergency. This is a decision made by the alignment community, management and researchers.
The decision by the alignment community to communicate in particular ways with particular people, eg having a long private conversation with Sam Altman, or publicly appearing on a podcast to discuss alignment. This decision will influence people's thinking, especially that of the most important decision makers.
The decision to research particular alignment concepts, eg Agent Foundations, Shard Theory, the stop button problem, etc. This is a decision made by the alignment community and researchers.
Also on EAForum here.
[2] I like this post, which defines crunch time as “The period where it's relatively more important to optimize for direct effects rather than P2B.”