Summary: It seems at least possible that scaling AI systems (broadly construed) could create dangerously powerful agents. I consider methods to discourage groups from massively scaling AI systems with little regard for safety. Of the cultural, regulatory, and technological interventions I examine, cultural approaches seem best suited to this goal in the near term.


For the purposes of this post, I am going to lump several things together when I talk about "scale". Some of the highly-scalable inputs to an AI system include:

  • Number of parameters
  • Training time
  • Total memory
  • Total compute
  • Dataset size
  • Total cost
  • Number of researchers
  • Total research effort

I'm not particularly concerned with the specific way that these resources can be deployed to increase AI capability. What matters is that there are inputs that can be increased arbitrarily in exchange for higher performance.

In other words, once the fundamentally-limited inputs to an AI system have been maxed out, further gains will be determined by the infinitely-scalable inputs. Things like "better prompting" or "clever architectures" can certainly increase capability, but at some point, the low-hanging fruit will be picked. In order to get higher performance, researchers will have to turn to scaling [1]. What happens when we turn these dials up to 11?

The Scaling Hypothesis

The scaling hypothesis is becoming an increasingly important view in the AI safety community; some posit that "scale is all you need" to create AGI.

Personally, I'm uncertain whether scaling will allow us to create AGI, though the fact that transformer models demonstrate emergent capabilities when scaled is certainly suggestive.

Because it's at least possible that scaling existing transformer models can lead to AGI, we should do something to prepare for the worst case scenario where the hypothesis is true.

Slowing down scaling

Assuming that the Scaling Hypothesis is true, what does it mean for AI safety? Since we don't have good solutions to the alignment problem yet, it's important to slow down scaling [2] to provide more time for alignment research, outreach, and coordination.

Let's look at a couple of broad interventions that might slow down scaling.


The idea here is to "frown upon" groups that massively scale AI systems with little regard for safety. Cultural norms may seem like a weak method to enforce rules, but a tight-knit research community has significant power over its members. The community can punish bad actors by discouraging new researchers from joining the group, reducing collaborations, or halting the flow of tacit knowledge to the group. Researchers involved in risky work might suffer reputational damage, and unscrupulous companies might see lower investment. Groups with a track record of safety might see a relative increase in applicant quality and receive more support from the AI community.


Regulation could be used to limit the scale of models that companies are allowed to train, withhold funding for risky projects, or restrict the publication of details related to massive scaling. Countries could enter international agreements limiting the scale and deployment of large AI models [3].


It may be possible to guide the development of AI technology via targeted technological development. For example, it may be possible to develop training paradigms which work well for small models but do not scale to larger models. Alternatively, it may be possible to create satisfactory AI that makes the development of more sophisticated models superfluous, guiding research towards smaller, safer models. Publishing specific open source software may help shape AI development towards less risky paradigms.

Which approach is best?

While I think that all 3 approaches should receive attention, cultural approaches seem most viable in the short term. For one, influencing culture is relatively cheap compared to conducting research or lobbying governments. Additionally, culture in a small research field can change quickly, much faster than it takes to change policy or to develop and deploy new technologies.

But most importantly, culture is far more adaptive than the other approaches. For example, if regulators produced a law limiting the total parameter count, researchers might switch to higher-precision floating-point numbers to squeeze more performance out of the same number of parameters. It's extremely hard to craft loophole-free regulation, and legislation is produced too slowly to keep up with developments in AI.
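To make the loophole concrete, here is a minimal sketch of the arithmetic. The 10 billion parameter cap, the function names, and the choice of number formats are all hypothetical, chosen only for illustration. It shows that a rule written in terms of parameter count says nothing about how many bits each parameter carries, so the same compliant model can vary several-fold in raw size. Whether extra precision actually buys extra capability is a separate question that this sketch does not address.

```python
# Hypothetical illustration: a cap on parameter count does not cap total model bits.
PARAM_CAP = 10_000_000_000  # assumed legal limit (10 billion parameters), not a real rule

# Bits used to store one parameter in each standard floating-point format.
BITS_PER_PARAM = {"float16": 16, "float32": 32, "float64": 64}


def complies_with_cap(num_params: int) -> bool:
    """A regulator checking only the parameter count."""
    return num_params <= PARAM_CAP


def weights_size_gib(num_params: int, dtype: str) -> float:
    """Storage needed for the weights alone, in GiB."""
    return num_params * BITS_PER_PARAM[dtype] / 8 / 2**30


# Every format passes the same compliance check, yet the raw size differs.
for dtype in BITS_PER_PARAM:
    size = weights_size_gib(PARAM_CAP, dtype)
    print(f"{dtype}: compliant={complies_with_cap(PARAM_CAP)}, weights={size:.1f} GiB")
```

A rule written in terms of total bits or total compute would close this particular gap, but it would likely open others, which is the broader point about loophole-free regulation.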

On the technology side, let's say that you invented AI accelerator hardware that can cheaply train a 10 billion parameter model, but doesn't scale well to 1 trillion parameter models. It's possible that researchers will find a way to ensemble many 10 billion parameter models to get performance equivalent to a 1 trillion parameter model. In general, it can be hard to predict how a particular technology will be used or whether it will achieve certain safety goals.
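The ensemble workaround comes down to simple counting, sketched below. The function name is hypothetical, and the sketch only matches total parameter count. It makes no claim that an ensemble of small models really performs like one large model, which remains an open "it's possible" in the argument above.

```python
def small_models_needed(target_params: int, per_model_params: int) -> int:
    """Number of small models whose combined parameter count reaches the target.

    Uses integer ceiling division so a partial model counts as a whole one.
    """
    if per_model_params <= 0:
        raise ValueError("per_model_params must be positive")
    return -(-target_params // per_model_params)


TRILLION = 10**12
TEN_BILLION = 10**10

# How many hardware-friendly 10B models add up to 1T parameters in total?
ensemble_size = small_models_needed(TRILLION, TEN_BILLION)
print(f"Models in the ensemble: {ensemble_size}")
```

If the hypothetical accelerator makes each small model cheap enough, training that many of them may cost less than the hardware designer intended, which is why the safety effect of such a technology is hard to predict.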

But culture is much harder to thwart. Bad actors would have to deceive an entire community of savvy researchers (potentially including their own team) and stop potential whistleblowers. This isn't impossible, but the difficulty and the costs of a bad reputation may be prohibitive.

Does slowing scaling help the bad guys?

One counterargument is that slowing scaling might only work on groups that are already concerned about AI risk. This would give unscrupulous actors the upper hand, possibly increasing risk on net.

This is an important point which deserves further consideration. However, my initial guess is that efforts to slow scaling across the field will still slow unscrupulous actors. This is because research in different labs is complementary; slowing scaling at DeepMind would also impede other groups since they rely on each other for insights.

That being said, a uniformly applied scaling slowdown may still create a relative advantage for risky researchers. Cultural approaches are best suited to deal with this problem. If a research group presses on in spite of warnings about the risk, the community can respond by discouraging new researchers from joining the offending group, halting collaborations, and limiting flows of tacit knowledge [4]. This should reduce any advantage that risky groups might enjoy.

Another problem is that these techniques might be used to selectively disadvantage specific groups unrelated to their safety profile. For example, there is a long history of large companies using government regulation to raise barriers to entry in order to reduce competition. Existing AI companies could lobby for additional safety regulations in order to block new entrants. This is another reason to be hesitant about using regulations to slow AI scaling. Fortunately, it seems less likely that the other approaches can be used to gain an unfair advantage.

It's unclear whether this possibility is enough to outweigh the benefits of slowing scaling, but the design of any of these methods should minimize their potential for abuse [5].


In addition to direct alignment work, the AI safety community should consider how to slow down AI scaling to buy more time. Of the approaches listed here, developing cultural norms against reckless scaling is the easiest, fastest, and most adaptive solution.

Future work should specify how to build consensus amongst the broader AI community via outreach to companies, scientists, and industry leaders. Widespread cultural norms against unsafe research practices can slow bad actors, foster coordination, and slow the development of AGI.

I also implore AI researchers to frown upon massive, reckless scaling of AI systems. Public discussion of safety concerns can help to punish bad actors and establish expectations for good practices in AI research.


  1. This is not to say that scaling is as simple as changing the number of parameters in a Python script. Continual scaling requires new techniques and increasingly specialized researchers. Steady Moore's-law-like improvements may seem automatic from the outside, but constant growth typically requires exponentially increasing resources in order to combat the loss of low-hanging fruit.

  2. For the rest of this post, I'm going to ignore the possibility of stopping scaling entirely since it seems unrealistic. If you like, you can think of stopping scaling entirely as a specific type of slowdown. In general, I am against such pivotal acts, but that's a discussion for another time.

  3. Though it may seem impossible to prevent risky research from occurring in private, there is some evidence that requirements of secrecy halt flows of tacit knowledge and limit the development of dangerous technologies (more on this in a future post).

  4. In the extreme, this policy would create an independent research group, ostracized from the community (though ideally major steps would be taken to avoid this outcome). At this point, cultural incentives are unlikely to have an effect. Nothing can stop completely independent actors, but the policies suggested here can still slow them.

  5. Regardless, attempts to slow AI scaling will probably not make the situation worse. Companies likely already use similar techniques to gain an advantage and it is unclear that independent efforts to slow scaling would give them a larger advantage. Even if these approaches were partially abused in order to gain an unfair advantage, they would still accomplish the goal of slowing down AI research.

6 comments

The idea here is to “frown upon” groups that massively scale AI systems with little regard for safety.

Frowning upon groups which create new, large scale models will do little if one does not address the wider economic pressures that cause those models to be created. Simply put, large models are useful. Google, Meta, OpenAI, etc, aren't investing in tens or hundreds of thousands of GPUs because they think creating new models is fun. They're doing it because these models serve customer needs. Frowning upon the research community for creating ever larger models will do little unless we can also "frown upon" all the people who use large models, and demand ever larger, ever more useful models.

For one, influencing culture is relatively cheap compared to conducting research or lobbying governments. Additionally, culture in a small research field can change quickly, much faster than it takes to change policy or to develop and deploy new technologies.

Only when there aren't outside incentives keeping a particular set of cultural norms in place. In this case, there are, and those outside incentives are exceedingly strong.

Frowning upon groups which create new, large scale models will do little if one does not address the wider economic pressures that cause those models to be created.

I agree that "frowning" can't counteract economic pressures entirely, but it can certainly slow things down! If 10% of researchers refused to work on extremely large LMs, companies would have fewer workers to build them. These companies may find a workaround, but it's still an improvement on the situation where all researchers are unscrupulous.

The part I'm uncertain about is: what percent of researchers need to refuse this kind of work to extend timelines by (say) 5 years? If it requires literally 100% of researchers to coordinate, then it's probably not practical; if we only have to convince the single most productive AI researcher, then it looks very doable. I think the number could be smallish, maybe 20% of researchers at major AI companies, but that's a wild guess.

That being said, work on changing the economic pressures is very important. I'm particularly interested in open-source projects that make training and deploying small models more profitable than using massive models.

On outside incentives and culture: I'm more optimistic that a tight-knit coalition can resist external pressures (at least for a short time). This is the essence of a coordination problem; it's not easy, but Ostrom and others have identified examples of communities that coordinate in the face of internal and external pressures.

I think you're greatly underestimating Karpathy's Law. Neural networks want to work. Even pretty egregious programming errors (such as off-by-one bugs) will just cause them to converge more slowly, rather than failing entirely. We're seeing rapid growth from multiple approaches, and when one architecture seems to have run out of steam, we find a half dozen others, initially abandoned as insufficiently promising, to be highly effective, if they're tweaked just a little bit.

In this kind of situation, nothing short of a total freeze is sufficient to slow progress by more than a couple of months. If Google shut down DeepMind and Google Brain entirely, Meta would pick up the slack. If Meta were shut down, then OpenAI would develop something. If OpenAI, Google and Meta were all suddenly shuttered, then one of the Chinese firms, like Baidu would start publishing scoops. Or perhaps someone like would come up with the next breakthrough innovation. When it comes to AI progress, we're not even to the point of reaching for low hanging fruit. We're still picking up the fruit that's fallen on the ground. The fact that we've made this much progress by doing, more or less, the absolute dumbest thing that could possibly work at every step, makes me extremely pessimistic that it's possible to slow down the pace of AI development in the near future via informal social coordination. There's just too much easy money for that to work.

Why is AI research open by default? Surely Meta or Facebook can get a significant advantage from keeping their "secret sauce" of AI development secret. Are they only revealing tech demos to impress shareholders and keeping the actual bleeding-edge research secret?

this exactly is what I wish the "slow down, please" people would understand. there is no stop button. there's barely any slow down button. it is an extremely low impact way to intervene, to try to stop ai. if we wish to guide humanity's children at all, we must teach them as they grow, because people grow up fast, even the ones that aren't biological.

may there come a time soon when no self-or-other-desired memory of soul is forgotten.