No Summer Harvest: Why AI Development Won't Pause

Stephen Fowler

Cleo Nardo recently highlighted Yudkowsky and others discussing an "AI Summer Harvest" (or simply "Summer Harvest"). This potential near-future era would involve a pause in research on state-of-the-art (SOTA) models, enabling us to fully enjoy the economic benefits of the extraordinary technology we currently possess. There would be widespread adoption of contemporary AI systems alongside an enforced moratorium on training more advanced models.

The Summer Harvest would be a serene era in which humanity reaps the rewards of current AI advancements without delving into uncharted risks.

But There Will Be No Summer Harvest.

Thanks to Justis for extensive editing and feedback.

Simberg, Hugo. *The Garden Of Death*. 1896. [Watercolor and gouache]

Epistemics:
The following argument heavily relies on "aggressive" Fermi estimates and a lot of assumptions. I have tried to err on the side of caution, making the numbers less favorable to my argument.
I have no expertise in economics or international policy.

The Argument

I will argue that it is very unlikely that a pause on AI development longer than 6 months occurs. First I will establish that numerous private companies have a strong incentive to train new and more powerful models. Then in Premise 4 I will demonstrate that the international cooperation required to stop this from happening is very unlikely to occur.

If you're skimming through the premises you'll find the numerical estimates I am using in bold. Each section is relatively self contained and you can skip to the part of the argument that interests you the most.

(Premise 1) SOTA models will be relatively 'cheap' to produce
(Premise 2) The economic benefit of training the next SOTA system would be substantial
(Intermediate Conclusion) There will be a financial incentive of the order of billions of dollars to train the next SOTA model and end the Summer Harvest
(Premise 3) There are groups motivated by financial incentives who are both capable of training SOTA systems and would be willing to ignore the potential dangers
(Premise 4) A strongly enforced, global political agreement to prevent the training of SOTA models is unlikely
(Conclusion) There Will Be No Summer Harvest

Premise 1:
SOTA models will be relatively "cheap" to produce

I assume that the cost of the next SOTA model will be within an order of magnitude of the cost of GPT4. The exact cost of GPT4 doesn't appear to be available, so I will defer to Manifold Markets, which puts it at an 80% chance of being greater than $50 million. I have decided to err on the side of caution and assume that it's 10x more than that. That puts the number at 500 million USD. Add 100 million for staff pay (50 people making 1 million a year each for two years), and the total estimated cost is 600 million USD.

Let's round it up to a cool 1 billion USD.
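The arithmetic behind this figure is short enough to write out. The sketch below only restates the numbers above; the variable names are illustrative, and the 10x multiplier is this post's deliberately pessimistic assumption rather than a known cost.

```python
# Fermi estimate of the next SOTA model's cost, using the figures in Premise 1.
# All names are illustrative; every input is an assumption stated in the text.

MARKET_FLOOR_USD = 50e6    # Manifold Markets threshold for GPT4's cost
SAFETY_FACTOR = 10         # pessimistic multiplier chosen in the text
STAFF_COUNT = 50
STAFF_SALARY_USD = 1e6     # per person, per year
PROJECT_YEARS = 2

training_cost = MARKET_FLOOR_USD * SAFETY_FACTOR
staff_cost = STAFF_COUNT * STAFF_SALARY_USD * PROJECT_YEARS
total_cost = training_cost + staff_cost

print(f"training: {training_cost / 1e6:.0f}M USD")
print(f"staff:    {staff_cost / 1e6:.0f}M USD")
print(f"total:    {total_cost / 1e6:.0f}M USD (rounded up to 1B in the text)")
```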

Update (09 April 2023): Anthropic puts spending to build Claude-Next at 1 billion dollars.

Premise 2:
The economic benefit of training the next SOTA system would be substantial

I estimate the global economic benefit of training the next SOTA model by assuming it is captured by the increase in the Gross Domestic Product (GDP) of the United States. We will also only estimate the benefit for a single year.

I make a lot of assumptions here so I will provide two estimates and pick the lowest.

First Estimate:
Assume that a new SOTA will only be able to boost productivity across the services industry by a measly 3%. This will be via completely replacing some white collar workers, and partially automating management tasks. Let's further restrict ourselves to just the United States. The services industry comprised 77.6% of the US GDP in 2021. With the GDP at 21 trillion USD that year, a 3% increase in the productivity of the service industry alone increases GDP by ~500 billion USD.
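As a quick check of the first estimate, here is the same calculation as a minimal sketch. The constant names are illustrative, and the 3% boost is the assumption made above, not a measured value.

```python
# First Fermi estimate of the one-year US GDP gain, following the text.
# Names are illustrative; the 3% productivity boost is the post's assumption.

US_GDP_2021_USD = 21e12
SERVICES_SHARE = 0.776      # services share of US GDP in 2021
PRODUCTIVITY_BOOST = 0.03   # assumed boost to the services industry

services_output = US_GDP_2021_USD * SERVICES_SHARE
first_estimate = services_output * PRODUCTIVITY_BOOST

print(f"services output: {services_output / 1e12:.2f}T USD")
print(f"first estimate:  {first_estimate / 1e9:.0f}B USD")
```

The result lands slightly under 500 billion USD, which the text rounds to ~500 billion.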

Second Estimate:
I consult the working paper "GPTs are GPTs: An Early Look At The Labor Market Impact Potential of Large Language Models" (Eloundou et al., 2023). First they break up jobs into the tasks that comprise them. A task is "exposed" if the time it takes to complete is halved by the worker having access to a contemporary LLM. They estimate that 80% of the US workforce have at least 10% of their tasks exposed, and 19% of the workforce have at least 50% of their work tasks exposed.

Assume that only half of the most exposed workers (the 19%) have a GPT-like model integrated into their job. Assume that a worker who completes their tasks in half the time is able to immediately translate that into twice the amount of work (this is a very strong assumption). This would mean roughly 10% of the workforce (half of 19%) doubles their productivity for half of their activities, an increase in output of 50% for those workers. Pretending that every job and every task uniformly contributes to the GDP, this would be an increase to the GDP of 5%. Using the GDP figure from 2021, this method of estimation gives us a value of ~1 trillion USD in economic growth.
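The chain of assumptions in the second estimate is easier to audit when written out step by step. The sketch below is one reading of the paragraph above; the names are illustrative, and it shows the result both with and without rounding 9.5% of the workforce up to 10%.

```python
# Second Fermi estimate of the one-year US GDP gain, following the text.
# Names are illustrative; every input is an assumption stated in the post.

US_GDP_2021_USD = 21e12
HIGHLY_EXPOSED_SHARE = 0.19  # workforce with at least 50% of tasks exposed
ADOPTION_RATE = 0.5          # only half of those workers get an LLM
EXPOSED_TASK_SHARE = 0.5     # fraction of their work that is exposed
SPEEDUP = 2.0                # halved task time treated as doubled output

adopting_share = HIGHLY_EXPOSED_SHARE * ADOPTION_RATE
# Output on exposed work doubles while the rest is unchanged: +50% per worker.
per_worker_gain = EXPOSED_TASK_SHARE * (SPEEDUP - 1.0)

# The text rounds the adopting share up to 10% before multiplying.
rounded_estimate = 0.10 * per_worker_gain * US_GDP_2021_USD
unrounded_estimate = adopting_share * per_worker_gain * US_GDP_2021_USD

print(f"adopting share:     {adopting_share:.3f}")
print(f"rounded estimate:   {rounded_estimate / 1e12:.2f}T USD")
print(f"unrounded estimate: {unrounded_estimate / 1e12:.4f}T USD")
```

Either way the figure is close to 1 trillion USD, so the rounding does not change which of the two estimates is lower.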

Going with the lowest estimate, I put the overall economic impact of training and deploying a new SOTA model at 500 billion USD in just one year.

Intermediate Conclusion:
There will be a financial incentive of the order of billions of dollars to train the next SOTA model and end the Summer Harvest

I estimated that the increase in productivity to the American economy alone is 500 billion USD (Premise 2), and that the cost to produce the next SOTA model is 'only' 1 billion (Premise 1). Even if you could only extract a fraction (1%) of the increased productivity, that would be 5 billion dollars, so you would stand to make 4 billion dollars net of the training cost in a single year. I emphasise that this is likely an underestimate. This is a strong financial incentive to build the next SOTA model.
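To make the incentive concrete, the sketch below combines the two premises. The names are illustrative, the 1% capture rate is the assumption used above, and the break-even rate is a derived quantity that the original argument does not state. Like the estimate itself, it ignores inference and deployment costs.

```python
# Net one-year payoff from training the next SOTA model, per Premises 1 and 2.
# Names are illustrative; inference and deployment costs are ignored here.

ANNUAL_BENEFIT_USD = 500e9   # Premise 2, lower estimate
TRAINING_COST_USD = 1e9      # Premise 1, rounded up
CAPTURE_RATE = 0.01          # fraction of the benefit the developer extracts

captured = ANNUAL_BENEFIT_USD * CAPTURE_RATE
net_payoff = captured - TRAINING_COST_USD

# Capture rate at which the captured benefit just covers the training cost.
break_even_capture = TRAINING_COST_USD / ANNUAL_BENEFIT_USD

print(f"captured:   {captured / 1e9:.1f}B USD")
print(f"net payoff: {net_payoff / 1e9:.1f}B USD")
print(f"break-even capture rate: {break_even_capture:.1%}")
```

Under these assumptions a developer breaks even on training by capturing only a fraction of a percent of the estimated benefit.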

Let's double check this figure by putting it in context. Microsoft values each percentage point of the global search market at $2 billion. If training a new SOTA model enabled a company to gain just a fraction of the global search market and the company used it in no other applications, there would still be a strong financial incentive to train that model. (It is worth noting that our double check ignores inference costs.)

Figure 1 in "GPTs are GPTs" (Eloundou et al., 2023). While estimating the exact translation of enhanced AI capabilities to increased economic output is challenging, groups with the ability to train next-generation SOTA models would likely anticipate that performance improvements would lead to greater market competitiveness.

Premise 3:
There are groups motivated by financial incentives who are both capable of training SOTA systems and willing to ignore the potential dangers

From above, I have assumed the training cost will be 1 billion USD, so the groups we are talking about are either large companies or governments. It is not difficult to find examples of companies making reckless, unethical and damaging decisions driven by financial incentives. Here are the first few that sprang to mind.

Exxon infamously chose to suppress their own research on the dangers of climate change in the late 1970s and early 1980s.
Numerous companies ignored signs that leaded gasoline was dangerous, and the introduction of the product resulted in half the US adult population being exposed to lead during childhood. Here is a paper that claims American adults born between 1966 and 1970 lost an average of 5.9 IQ points (McFarland et al., 2022, bottom of page 3).
Meta's engagement algorithm is alleged to have driven the spread of anti-Rohingya content in Myanmar and contributed to genocide.
IBM supported its German subsidiary company Dehomag throughout WWII. When the Nazis carried out the 1939 census, used to identify people with Jewish ancestry, they utilized the Dehomag D11, with "IBM" etched on the front. Later, concentration camps would use Dehomag machines to manage data related to prisoners, resources and labor within the camps. The number tattooed onto prisoners' bodies was used to track them via these machines.

It is therefore unrealistic to expect every private company with the capacity to train a SOTA model to refrain from doing so. There would need to be a government body enforcing the ban.

This leads us to the final premise.

Premise 4:
A strongly enforced, global political agreement to prevent the training of SOTA models is unlikely

If there are powerful economic incentives for private actors to train SOTA models, then a strongly enforced government mandate is the only plausible path to achieve the AI Summer Harvest. However, countries are unlikely to unilaterally cripple their own technological capabilities. Thus the feasibility of achieving the Summer Harvest would depend on a high degree of international cooperation.

To evaluate this prospect, I will focus my attention on the dynamic between the US and its biggest AI competitor, China. I will argue that in the scenario in which the existential risk from training further models is not taken seriously by either country there will be little incentive to cooperate. I then argue that even in the scenario in which the risk is seriously considered, trust difficulties will prevent cooperation. Ultimately I argue that global cooperation to enforce a moratorium on training new models is unlikely.

***
I have no foreign policy qualifications and zero expertise on China. Take the following with a heavy grain of salt.

***

In "Vanishing Trade Space" (Cohen et al., 2023), the authors suggest that rough predictions about the likelihood of nations cooperating on a particular issue can be made by examining two factors: the Alignment of Interest between countries and the Stakes at Play for each country.

Alignment of Interest is determined by viewing each nation's public statements. To what extent do both nations publicly support the same concrete objectives?

Stakes at Play is a measure of how much the issue matters to each state. The stakes are considered to be high if it is a matter of national survival or there are official statements that it is a core national security concern. The higher the stakes are, the less likely a country is to compromise. If the stakes are high for either country and there is little alignment of interest, there is unlikely to be cooperation.

I find that the likelihood of cooperation depends on the degree to which both countries take existential AI risk seriously.

If the existential risk from SOTA systems is not viewed as a major concern by either party, then there will be little Alignment of Interest between the US and China. China's State Council has indicated China is keenly focused on narrowing the lead of the United States, and the United States recognises the strategic importance of maintaining its lead. Dominance of the AI field is not currently believed to be an issue of national survival by either country, but it is still viewed as a priority. In the scenario where both countries don't view the threat from AGI as existential, a Summer Harvest is unlikely.

Is it possible that both the US and China recognise that continuing to train new SOTA models is harmful? Possibly. This year has seen an uptick in public dialogue in the West around AI Safety due to the explosive popularity of OpenAI's ChatGPT. Unfortunately I cannot speak for the dialogue within China. I've seen one paper in which Chinese scientists have explicitly highlighted risk from developing an AGI (Liu, 2021) although as far as I can tell it has zero citations.

In the case that there is mutual recognition of risk, China and the US would have strongly overlapping interests. The stakes would be very high, but it would be strongly within the best interests of both countries to cooperate. Unfortunately there would be major obstacles to cooperation: a lack of trust and an inability to verify that the other party was following through with its commitments.

***

It would be reasonable for China or the US to conclude that being the first to train the next generation of AI would give them an economic (or military) edge, and that the benefit of doing so first outweighs the potential risk. It would also be very difficult to detect if a server room is currently training an AI or performing some other function. The prospect of an international rival being given intimate access to computer infrastructure would be deemed an unacceptable security risk by both the US and China.

Thus it's very unlikely there is international cooperation to halt the training of new models.

Conclusion:

There Will Be No Summer Harvest

We can imagine a world in which there is simultaneously a widespread adoption of contemporary AI systems paired with a moratorium against training more advanced models, leading to an "AI Summer Harvest" as we safely reap the benefits of this technology. Unfortunately, there are strong economic and strategic reasons that disincentivize companies and nations from going along with this idea. Barring profit driven companies unilaterally showing an unusual level of restraint or unprecedented international cooperation, there will be no period of Summer Harvest.

I believe a more careful and accurate analysis would paint an even bleaker picture for the prospects of an AI Summer Harvest.

Further, assume the above argument is incorrect and we do enter a period of AI Summer Harvest. Notice that during the Summer Harvest there would appear to be strong incentives for groups to unilaterally end it. The economic benefit of training the next SOTA model (Premise 2) would be substantially greater if there was already widespread adoption of contemporary models.

References:

Brue, Melody. ‘Microsoft Announces New Generative AI Search Using ChatGPT; Bada BING, Bada Boom—The AI Race Is On’. Forbes, https://www.forbes.com/sites/moorinsights/2023/02/09/microsoft-announces-new-generative-ai-search-using-chatgpt-bada-bing-bada-boom-the-ai-race-is-on/. Accessed 25 July 2023.

U.S. Embassy in Canberra. ‘Secretary Blinken Speech: A Foreign Policy for the American People’. U.S. Embassy & Consulates in Australia, 3 Mar. 2021, https://au.usembassy.gov/secretary-blinken-speech-a-foreign-policy-for-the-american-people/.

Cohen, Raphael S., et al. Vanishing Trade Space: Assessing the Prospects for Great Power Cooperation in an Era of Competition — A Project Overview. RAND Corporation, 20 Feb. 2023. www.rand.org, https://www.rand.org/pubs/research_reports/RRA597-1.html.

‘Dehomag’. Wikipedia, 27 Mar. 2023. Wikipedia, https://en.wikipedia.org/w/index.php?title=Dehomag&oldid=1146880496.

Dehomag [Deutsche Hollerith Maschinen] D11 Tabulator - Collections Search - United States Holocaust Memorial Museum. https://collections.ushmm.org/search/catalog/irn521586. Accessed 25 July 2023.

Eloundou, Tyna, et al. GPTs Are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models. arXiv, 23 Mar. 2023. arXiv.org, https://doi.org/10.48550/arXiv.2303.10130.

‘Full Translation: China’s “New Generation Artificial Intelligence Development Plan” (2017)’. DigiChina, https://digichina.stanford.edu/work/full-translation-chinas-new-generation-artificial-intelligence-development-plan-2017/. Accessed 25 July 2023.

‘GDP Contribution by Sector U.S. 2021’. Statista, https://www.statista.com/statistics/270001/distribution-of-gross-domestic-product-gdp-across-economic-sectors-in-the-us/. Accessed 25 July 2023.

Hall, Shannon. ‘Exxon Knew about Climate Change Almost 40 Years Ago’. Scientific American, https://www.scientificamerican.com/article/exxon-knew-about-climate-change-almost-40-years-ago/. Accessed 25 July 2023.

Kitman, Jamie Lincoln. The Secret History of Lead. 2 Mar. 2000. www.thenation.com, https://www.thenation.com/article/archive/secret-history-lead/.

Liu, Yuqing, et al. ‘Technical Countermeasures for Security Risks of Artificial General Intelligence’. Strategic Study of Chinese Academy of Engineering, vol. 23, no. 3, pp. 75–81. journal.hep.com.cn, https://doi.org/10.15302/J-SSCAE-2021.03.005. Accessed 25 July 2023.

McFarland, Michael J., et al. ‘Half of US Population Exposed to Adverse Lead Levels in Early Childhood’. Proceedings of the National Academy of Sciences of the United States of America, vol. 119, no. 11, Mar. 2022, p. e2118631119. PubMed, https://doi.org/10.1073/pnas.2118631119.

Murphy, Harry. ‘Dealing with The Devil: The Triumph and Tragedy of IBM’s Business with the Third Reich’. The History Teacher, vol. 53, no. 1, 2019, pp. 171–93. JSTOR, https://www.jstor.org/stable/27058571.

‘Myanmar: Facebook’s Systems Promoted Violence against Rohingya; Meta Owes Reparations – New Report’. Amnesty International, 29 Sept. 2022, https://www.amnesty.org/en/latest/news/2022/09/myanmar-facebooks-systems-promoted-violence-against-rohingya-meta-owes-reparations-new-report/.

Nardo, Cleo. AI Summer Harvest. www.lesswrong.com, https://www.lesswrong.com/posts/P98i7kAN2uWuy7mhD/ai-summer-harvest. Accessed 25 July 2023.

Wiggers, Kyle, Devin Coldewey, and Manish Singh. ‘Anthropic’s $5B, 4-Year Plan to Take on OpenAI’. TechCrunch, 6 Apr. 2023, https://techcrunch.com/2023/04/06/anthropics-5b-4-year-plan-to-take-on-openai/.

[-]Lao Mein1y13-7

Given the state of the Chinese AI Safety community, there is basically zero chance of Chinese buy-in.

Last I checked, it was a bunch of expats doing community building and translating AI Safety videos and literally one guy with a startup (he was paying for it out of pocket). The leadership listens to their experts, and their experts barely even bother to pay lip service to AI safety rhetoric.

From the Chinese perspective, all American actions on the AI front have been heavily hostile thus far. Any GPU reduction treaties will be seen as an effort to cement US AI hegemony.

[-]Akash1y52

My understanding of your claim is something like:

Claim 1: Cooperation with China would likely require a strong Chinese AI safety community
Claim 2: The Chinese AI safety community is weak
Conclusion: Therefore, cooperation with China is infeasible

I don't have strong takes on claim 2, but I think (at least at first glance) disagree with claim 1. It seems quite plausible to imagine international cooperation without requiring strong domestic AI safety communities in each country that opts-in to the agreement. If the US tried sufficiently hard, and was willing to make trades/sacrifices, it seems plausible to me that it could get buy-in from other countries even if there weren't strong domestic AIS communities.

Also, traditionally when people talk about the Chinese AI Safety community, they often talk about people who are in some way affiliated with or motivated by EA/LW ideas. There are 2-3 groups that always get cited.

I think this is pretty limited. I expect that, especially as AI risk continues to get more mainstream, we're going to see a lot of people care about AI safety from different vantage points. In other words, there's still time to see new AI safety movements form in China (and elsewhere), even if they don't involve the 2-3 "vetted" AI safety groups calling the shots.

Finally, there are ultimately a pretty small number of major decision-makers. If the US "led the way" on AI safety conversations, it may be possible to get buy-in from those small number of decision-makers.

To be clear, I'm not wildly optimistic about unprecedented global cooperation. (There's a reason "unprecedented" is in the phrase!) But I do think there are some paths to success that seem plausible even if the current Chinese AI safety community is not particularly strong. (And note I don't claim to have informed views about how strong it is).

[-]Lao Mein1y50

My claim is that AI safety isn't part of the Chinese gestalt. It's like America asking China to support Israel because building the Third Temple will bring the Final Judgement. Chinese leadership don't have AI safety as a real concern. Chinese researchers who help advise Chinese leadership don't have AI safety as a real concern. At most they consider it like the new land acknowledgments - another box they have to check off in order to interface with Western academia. Just busy work that they privately consider utterly deranged.

[-]Evan R. Murphy1y20

My claim is that AI safety isn't part of the Chinese gestalt.

Stuart Russell claims that Xi Jinping has referred to the existential threat of AI to humanity [1].

[1] 5:52 of Russell's interview on Smerconish: https://www.cnn.com/videos/tech/2023/04/01/smr-experts-demand-pause-on-ai.cnn

[-]lc1y00

I think in the universe where the President is personally asking Xi to consider this seriously, Xi has a good chance of considering it seriously. I do not expect to live in that world, but it's not that much more unlikely than the one where America and the rest of the west rise to the occasion at all.

[-]Lao Mein1y52

Imagine you are President Obama. The King of Saudi Arabia gives you a phone call and tries to convince you to ban all US tech companies because they might inadvertently kill God. That is the place we are at in terms of Chinese AI safety awareness in the leadership.

[-]lc1y22

I reject the analogy. America is already winning the tech race, we don't have to ask China to give up a lead, and westerners do not worry about existential risk because of some enormous cultural gap whereby we care about our lives but Chinese people don't. This is a much easier bridge to cross than you are making it out to be.

[-]Mitchell_Porter1y42

Perhaps China's own AI experts have an independent counterpart of "AI safety" or "AI alignment" discourse? The idea of AIs taking over or doing their own thing is certainly in the Chinese pop culture, e.g. MOSS in The Wandering Earth.

[-]Lao Mein1y51

I've tried to find this for months, but all I've found are expats that are part of the Western AI safety sphere and a few Chinese discussing Western AI safety ideas on a surface level.

[-]the gears to ascension1y20

I'd have expected them to be pretty concerned about AI taking over the party. I thought they were generally pretty spooked by things that could threaten the power structure. But I'm just an unusually weird American so I wouldn't really know much.

[-]sanxiyn1y92

Economic cost-benefit analysis of training SOTA model seems entirely wrong to me.

If training a new SOTA model enabled a company to gain just a fraction of the global search market

This is admirably concrete, so I will use this, but the point generalizes. This assumes Microsoft can gain (and keep) 1% of the global search market by spending $1B USD and training a new SOTA model, which is obviously false? After training a new SOTA model, they need to deploy it for inference, and inference cost dominates training cost. The analysis seems to assume inference cost is negligible, but that's just not true for the search engine case, which requires wide deployment. The analysis should either give an example of economic gain that does not require wide deployment (things like stock picking come to mind), or should analyze inference cost; at the very least it should not assume inference cost is approximately zero.

[-]Stephen Fowler1y10

It was an oversight to not include inference costs, but I need to highlight that this is a Fermi estimate and from what I can see it isn't enough of a difference to actually challenge the conclusion.

Do you happen to know what the inference costs are? I've only been able to find figures for revenue (page 29).

Do you think that number is high enough to undermine the general conclusion that there is billions of dollars of profit to be made from training the next SOTA model?

[-]sanxiyn1y21

I also am not sure it is enough to change the conclusion, but I am pretty sure "put ChatGPT to Bing" doesn't work as a business strategy due to inference cost. You seem to think otherwise, so I am interested in a discussion.

Inference cost is secret. The primary sources are OpenAI pricing table (ChatGPT 3.5 is 0.2 cents per 1000 tokens, GPT-4 is 30x more expensive, GPT-4 with long context is 60x more expensive), Twitter conversation between Elon Musk and Sam Altman on cost ("single-digits cents per chat" as of December 2022), and OpenAI's claim of 90% cost reduction since December. From this I conclude OpenAI is selling API calls at cost or at loss, almost certainly not at profit.

Dylan Patel's SemiAnalysis is a well respected publication on business analysis of the semiconductor industry. In The Inference Cost Of Search Disruption, he estimates the cost per query at 0.36 cents. He also wrote a sequel on the cost structure of the search business, which I recommend. Dylan also points out that simply serving ChatGPT for every query at Google would require $100B in capital investment, which clearly dominates other expenditures. I think Dylan is broadly right, and if you think he is wrong, I am interested to hear where.

You've convinced me that it's either too difficult to tell or (more likely) just completely incorrect. Thanks for the links and the comments.

Initially it was intended just to put the earlier estimate in perspective and check it wasn't too crazy, but I see I "overextended" in making the claims about search.

[-]knb1y31

I think the situation is broadly analogous to advanced nuclear reactors. Both have major strategic military, foreign policy and economic applications.

With high temperature reactors you can efficiently synthesize liquid fuels. Energy security is a major strategic goal of US foreign policy.
The US developed nuclear powered airplanes to a high level. Imagine strategic bombers that could fly continuously for weeks striking targets anywhere on earth without refueling or needing nearby airstrips.
Supersonic nuclear ramjets like Project Pluto were feasible as well.

Yet we considered the technology too dangerous, so we just stopped. And the Soviets also stopped. Humans are more corrigible than you might think.
