Systems Dynamics Model for Pausing AI

WillPetillo; lwen2027; Isabela Ciurea; jamaklycheva

Summary

This post documents our process of applying system dynamics modeling to the problem of AI governance, tracing the feedback loops connecting capability development, public harm, and regulatory constraint. Our research outputs include a model created in Insight Maker, step-by-step documentation, and a set of causal narratives informing the design. We also have a video presentation that covers the same material as this post. This project was part of AISC 2026.

Background

A system dynamics model is a diagram with stocks, flows, user-set parameters, and calculated variables that can be used to generate a simulation, such as a line graph showing the projected state of key variables over time. Visual diagrams are better suited to describing interconnected systems than writing, which is inherently linear. Simulation is useful for generating “what-if” scenarios, often of the form: how will the trajectory of variable X change as I raise or lower variables A, B, and C? It also helps to highlight which variables are most salient to a given problem, which may not be obvious at a glance.

Figure 1: SD model (left), simulation output (right)
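To make the vocabulary concrete, here is a minimal sketch of a single stock with one inflow and one outflow, stepped forward with simple Euler integration. It is not taken from the Insight Maker model: the `Params` fields, the `simulate` function, and all numeric values are hypothetical and chosen only to show how stocks, flows, parameters, and calculated variables fit together.

```python
from dataclasses import dataclass


@dataclass
class Params:
    """User-set parameters (the 'knobs'); names and values are illustrative."""
    inflow_rate: float = 0.3   # fraction of remaining headroom added per time unit
    outflow_rate: float = 0.1  # fraction of the stock drained per time unit


def simulate(params: Params, initial_stock: float = 0.1,
             dt: float = 0.1, steps: int = 500) -> list[float]:
    """Integrate one stock with one inflow and one outflow using Euler steps."""
    stock = initial_stock
    history = [stock]
    for _ in range(steps):
        headroom = 1.0 - stock                    # calculated variable
        inflow = params.inflow_rate * headroom    # flow into the stock
        outflow = params.outflow_rate * stock     # flow out of the stock
        stock += (inflow - outflow) * dt          # the stock accumulates net flow
        history.append(stock)
    return history


# "What-if": how does the trajectory change as the outflow knob is raised?
baseline = simulate(Params())
what_if = simulate(Params(outflow_rate=0.3))
print(f"baseline settles near {baseline[-1]:.2f}")
print(f"what-if settles near {what_if[-1]:.2f}")
```

Plotting `baseline` and `what_if` against time would give the kind of line graph a simulation tool produces; changing a knob and re-running is the whole "what-if" workflow in miniature.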

The implications of a simulation are, however, entirely dependent on the assumptions of the model. Modeling a large, complex, and novel subject like the intersection of AI development and international governance requires stacking many deeply uncertain (and often controversial) claims, making the end result somewhat arbitrary. Scientists often prefer parsimonious and defensible models with empirical and theoretical grounding. However, given the complexity and novelty of the subject, we find more value in working in the opposite direction: the model encodes our beliefs about causal structure, and our intuitive sense of plausible outcomes acts as a constraint on those beliefs. That is, if an outcome is implausible, then the logic that generated it should be reconsidered.

The Reification Trap

The use of System Dynamics (SD) modeling as a way of understanding the world has faced persistent criticism. A core objection is that representing poorly understood concepts as precise numbers makes them look like settled ideas, projecting a false sense of certainty. This objection assumes that the model is a formalization of general understanding, so that any imprecision counts as failure. Done correctly, however, SD modeling of AI governance does the opposite: it maps a slice of something necessarily larger and more complex than can be specified. For a deeper treatment of this distinction, see Lenses of Control.

Figure 2: Specificity vs confidence - different words with different meanings

Indeed, our experience developing this model was the opposite of what the reification objection assumes: the surface area of known unknowns expanded as the model developed. Assumptions that had passed unexamined in our implicit mental models became visible and uncomfortable the moment they had to be made precise enough to calculate with. Honest specificity generates uncertainty rather than suppressing it. Productive engagement with the model extends beyond parameter tuning to include scrutinizing loop structure, node inclusion, connection direction, and organizational scope.

Research Process

When modelling a massive topic one is still trying to wrap one’s mind around, it is hard to know where to start. Jumping straight into Insight Maker led to a lot of dead ends, often involving excessively detailed subsystems that were structured in a way that made them difficult to meaningfully connect together.

We started to make some progress with the introduction of causal stories on the current general state of Artificial Intelligence, which were initially a brain-dump of plain-English summaries of relevant feedback loops. We then split into two sub-teams. One dug through existing research literature to find evidence supporting or challenging the stories and iterated on them accordingly. The other used the stories to build mind maps as FigJam boards and start drawing connections between the most important stories. Once we settled on a map that made sense, we could then focus on converting that informal structure into more rigorous SD logic. Places where adding such detail proved difficult (either because the forced specificity required us to confront earlier handwaving or because the model started producing unreasonable outcomes) pointed to gaps in understanding that then required further research.

Figure 3: Research Loop

Scope creep was a constant problem, and determining the proper scope of analysis was itself part of the project. We managed scope via a set of guiding principles rather than by defining boundaries at the outset. To justify inclusion, each node needed to contribute towards filling out a feedback loop described by one or more significant causal stories. Each story needed to relate to capability-limiting governance and be backed by research. Every node in the model could be expanded into a model of its own, so our criteria for node expansion were that the node be important to overall system behavior AND its current outputs (when unexpanded) diverge from known real-world patterns AND improving its conceptual accuracy requires adding inputs. Where it is not clear how to expand an important node, that points to a potentially useful research question.

We also simplified the math in the model by describing everything on a 0-1 scale and defining terms accordingly. For example, in AI capabilities, a 1 means “full AGI” and is the point where many assumptions of the model break down. Defining terms in this way also requires us to represent everything as continuous variables, which adds some artificiality when describing discrete events. For example, the sort of harms that increase public grievance are often major, headline events that attract media attention, which occur at unpredictable intervals. Our model simplifies this dynamic to an “incident rate,” which describes something more like expected value.
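The relationship between discrete headline incidents and a continuous "incident rate" can be illustrated with a small sketch. The function names, the per-step incident probability, and the harm per incident below are all hypothetical, not values from our model; the point is only that the smooth rate tracks the average of many stochastic runs while hiding the lumpiness of any single one.

```python
import random


def discrete_harm(incident_prob: float, harm_per_incident: float,
                  steps: int, rng: random.Random) -> float:
    """Accumulate harm from headline incidents that occur at random steps."""
    total = 0.0
    for _ in range(steps):
        if rng.random() < incident_prob:   # an incident happens this step
            total += harm_per_incident
    return total


def continuous_harm(incident_prob: float, harm_per_incident: float,
                    steps: int) -> float:
    """Accumulate the expected harm per step as a smooth 'incident rate'."""
    incident_rate = incident_prob * harm_per_incident  # expected value per step
    return incident_rate * steps


rng = random.Random(0)  # fixed seed so the run is repeatable
runs = [discrete_harm(0.05, 0.02, 200, rng) for _ in range(2000)]
mean_discrete = sum(runs) / len(runs)
smooth = continuous_harm(0.05, 0.02, 200)
print(f"mean of discrete runs: {mean_discrete:.3f}, continuous: {smooth:.3f}")
print(f"single runs range from {min(runs):.2f} to {max(runs):.2f}")
```

The spread between the smallest and largest single run is the artificiality the continuous simplification gives up: the model sees the average, not the timing of any particular event.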

Even after gatekeeping node inclusion, simplifying math, and operating at a high level of abstraction, the model still became rather complex. Managing this relevant complexity is a presentation problem. Logic that would clutter the visual graph is hidden in macro code; related nodes are grouped in collapsible folders; and only a curated subset of parameters is surfaced as user-adjustable knobs, with others accessible only by directly editing nodes or global macro values. Insight Maker's Story Mode, which would allow guided walkthroughs of the model, remains future work.

Model Overview

In the interests of making our full model easier to absorb, what follows is an overview of some key parts. The documentation breaks the model down in full detail, but Figure 4 illustrates its overall organization in three nested layers.

Figure 4: Conceptual Layers, corresponding to the nested, collapsible folders in the Insight Maker model.

At the base, Reality contains the technical and market dynamics driving AI development. These feed upward into Perception, which includes present and future harms. Perception in turn feeds into the Response layer, where political dynamics (industry capture, activism, and public concern) interact to shape domestic and international regulation. A feedback arrow from Response back down to Reality completes the loop, reflecting the model's core premise that regulatory response acts as a constraint on the growth dynamics at the base.

Figure 5 illustrates what we see as the core feedback loops.

Figure 5: Core Feedback Loops

Blue: capabilities feed on themselves, each advancement facilitating further advancements (through better tooling and encouraging investment), generating exponential growth.
Purple: safety research leverages capabilities to increase absolute control capacity (through better tooling and having more advanced AI models to study) while competing against capabilities for applied control capacity.
Green: advancing capabilities leads to loss of control (increasing future risk) as well as externalized costs onto society (increasing present harms), which combine to drive public grievance against AI, leading to regulation that limits AI capability advancement.
Orange: activism and vested interests respectively accelerate and obstruct the conversion of present and future harms into regulation.

We designed the user-adjustable knobs with activist organizations and policymakers working on AI governance in mind as the primary audience. The knobs are an invitation to explore: adjust the parameters to reflect your own beliefs and observe how the system's behavior changes (or doesn't). Many additional exogenous variables are accessible to those willing to dig deeper, either by editing node values directly or adjusting global values in the macro code. Individual parameters, their justifications, and their relationships to the model's causal structure are covered in full in our documentation.

Sample Walkthrough: Capabilities vs. Safety

AI capabilities are self-reinforcing: each advance accelerates the research and development that produces the next. Safety research runs a balancing loop against this, since AI systems can accelerate alignment work. However, the same capabilities that help safety research also increase the power of systems that may be misaligned. A central question is whether the balancing loop can outpace the reinforcing one before systems become too dangerous to contain by any means.

Figure 6: Capabilities vs. Safety mind map

Sample Walkthrough: the Governance Gap

Governance must translate technical risk into regulatory action, but this path has significant loss and delay at every step. Expert alarm can drive policy, but defeatism within the research community and weak whistleblower protections may erode that channel. Open-source releases are difficult to reverse and autonomous replication represents a potential point of no return. Compute governance offers a near-term lever for control, but distributed training could erode its effectiveness.

Figure 7: Governance Gap mind map

Sample Walkthrough: Regulatory Capture

Acceleration leads to economic lock-in, which funds lobbying pressure against pausing, which allows for continued acceleration in a self-reinforcing loop. The dominating narrative sets the gain or attenuation of this loop. When strategic competition crowds out safety as the dominant frame, regulatory goals shift accordingly. Voluntary commitments shift the perceived need for regulation, preempting binding rules with low, self-reported bars. Safety redefinition—industry narrowing "safety" to mean low-stakes problems—absorbs political pressure without addressing underlying risk. Visible incidents and scandals can potentially disrupt all of these mechanisms, which suggests that incident response speed and transparency are among the highest-leverage governance variables in the model.

Figure 8: Regulatory Capture mind map

Findings

The path from advancements in AI capabilities to governance has a lot of steps, each with falloff and delay, as shown in Figure 9.

Figure 9: mind map of Incident Response and Political Capture pathways, discussed in more detail here.

Given the rapid pace of AI development, anything that skips or significantly decreases the loss or delay in any of these steps is critical. Transparency bills, such as the AI Risk Evaluation Act, may on their face seem too light-touch to matter, but may be important if they can significantly decrease government response time to future incidents.

Even more important, however, is introducing gain into the system to counteract any loss. For example, public grievance could exceed awareness if the public overreacts to an incident after becoming aware of it, whether that be through an irrational misunderstanding, a rational assessment of what the incident implies about the future, or a semi-rational punitive impulse. Likewise, political response could be greater than demanded by political pressure if representatives acted as proactive leaders rather than responsive followers.

As one mini model we created illustrates, stopping a self-reinforcing process requires that the rate at which the gap between capacity and limiter converts into limiter inflow be greater than the capacity acceleration rate. When the ratio of the conversion rate to the acceleration rate is less than 1, capacity continues to grow exponentially; when it is greater than 1, capacity eventually levels off; when it is exactly 1, capacity grows linearly. All other variables, such as initial capacity and absolute acceleration rate, only affect timelines and the specific cutoff level: they change the scale of the graph, not its shape.

Figure 10: Capacity-Limiter mini-model outputs at different conversion / acceleration ratios. All other variables affect X and Y scale only.
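The three regimes can be reproduced with a very small sketch. The structure below is an assumption made for illustration and may differ from the actual Insight Maker mini-model: capacity grows in proportion to the gap (capacity minus limiter) at the acceleration rate, and the limiter grows in proportion to the same gap at the conversion rate. Under that assumption the gap itself changes at a rate of (acceleration − conversion) × gap, so it grows, holds steady, or decays depending on which rate is larger. The function name and default values are hypothetical.

```python
def simulate_capacity(acceleration: float, conversion: float,
                      capacity: float = 1.0, limiter: float = 0.0,
                      dt: float = 0.01, steps: int = 2000) -> list[float]:
    """Euler-integrate an assumed two-stock capacity/limiter model."""
    history = [capacity]
    for _ in range(steps):
        gap = capacity - limiter               # unconstrained portion of capacity
        capacity += acceleration * gap * dt    # reinforcing growth driven by the gap
        limiter += conversion * gap * dt       # gap converts into limiter inflow
        history.append(capacity)
    return history


for ratio in (0.5, 1.0, 2.0):
    acceleration = 0.5
    run = simulate_capacity(acceleration, conversion=ratio * acceleration)
    early_growth = run[1000] - run[0]
    late_growth = run[2000] - run[1000]
    print(f"ratio {ratio}: first half +{early_growth:.2f}, "
          f"second half +{late_growth:.2f}")
```

Comparing growth in the first and second halves of each run distinguishes the regimes: second-half growth is much larger when the ratio is below 1, equal when it is 1, and close to zero when it is above 1. Changing `capacity`, `limiter`, or the absolute size of `acceleration` while holding the ratio fixed rescales the curves without moving a run from one regime to another.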

Interpreting Outputs

There are two schools of thought on what SD modeling is for. In one, the simulation output is the primary product and the diagram is scaffolding. In the other (which ours follows), making causal structure explicit is itself the insight-generating act—output mainly serves as a consistency check.

Sensitivity analysis (systematically varying parameters across reasonable ranges to test output robustness) is the natural next step for validating structural claims, because fragile outputs requiring precise parameter settings would be a signal of modeling error. Because of time and tool limitations, however, this is future work.
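As a rough indication of what such an analysis could look like, the sketch below varies one parameter at a time across a range and records how far the output moves. Everything here is hypothetical: `toy_model` is a stand-in for a full simulation run, and the parameter names, baseline values, and ranges are invented for illustration rather than drawn from our model.

```python
def toy_model(growth: float, damping: float, steps: int = 100) -> float:
    """Hypothetical stand-in for a full simulation: returns a final 0-1 level."""
    level = 0.05
    for _ in range(steps):
        level += growth * level * (1.0 - level) - damping * level
        level = min(max(level, 0.0), 1.0)   # keep the level on the 0-1 scale
    return level


def sweep(model, baseline: dict, ranges: dict, points: int = 11) -> dict:
    """Vary one parameter at a time; return the spread of outputs for each."""
    spreads = {}
    for name, (low, high) in ranges.items():
        outputs = []
        for i in range(points):
            value = low + (high - low) * i / (points - 1)
            # Override only the swept parameter, keep the rest at baseline.
            outputs.append(model(**{**baseline, name: value}))
        spreads[name] = max(outputs) - min(outputs)
    return spreads


baseline = {"growth": 0.2, "damping": 0.05}
ranges = {"growth": (0.1, 0.3), "damping": (0.0, 0.1)}
for name, spread in sweep(toy_model, baseline, ranges).items():
    print(f"{name}: output spread {spread:.2f}")
```

A parameter whose narrow, plausible range produces a large spread is a candidate for closer scrutiny, either of its assumed value or of the structure around it. One-at-a-time sweeps miss interactions between parameters, so this is a starting point rather than a full robustness test.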

Conclusion

The primary value of creating this model was as a means of focusing research effort to improve our understanding of AI governance. The feedback loop structure makes AI safety dynamics legible in a way that is difficult to capture in prose: tracing a pathway visualizes causal relationships, operationalizing terms requires specific meanings, and disagreeing with the model outputs demands a targeted search for the missing or misdirected factor producing them.

Going forward, we see the potential of the model as an educational resource first and a research artifact second. Someone wanting to engage seriously with AI governance questions could find a clear starting point by studying the model, locating where their intuitions diverge from its structure, and either updating those intuitions or improving the model. (Yes, that includes you: contributions, critiques, and extensions are welcome.)
