I'm curious about your reasons for choosing a PBC. Having looked into it a little, it seems like:
If these are close enough, then AGI safety products are a reasonable idea. If not, they’re an actively bad idea because the new incentives pull you in a less impactful direction.
I've heard this argument a lot, but I think it has a key failure mode: it assumes incentives are more or less invariant with scale (of org size, staff, cash on hand, etc.). As they say in the Valley, "most problems are solved with scale."
Another heuristic that is common in the startup world but seems underappreciated on this forum: many successful startups do not start out building in the direction that eventually succeeds; they pivot at least once.
Together, these points suggest a dynamic where building a successful company is very compatible with AGI safety in the long run, but trying to get there by backchaining from AGI safety is unlikely to work. Instead, good founders are likely to pivot around a lot to find PMF. And even if the idea with PMF is not the maximally AGI-safe one to build, that doesn't mean it's a bad idea! If you build it, then come back a year later with much more capital and a larger team to build a second product, you are in a significantly better position than if you had tried to build for AGI safety from the start.
I think a common failure mode in the AI safety startup market is "trying to build the perfect product from the start": trying to find the narrow path that delivers both safety and 0-to-1 profitability for a company with no revenue or existing customers.
This is a personal post and does not necessarily reflect the opinion of other members of Apollo Research. This blog post is paired with our announcement that Apollo Research is spinning out from fiscal sponsorship into a public benefit corporation (PBC).
Summary of main claims:
Definition of AGI safety products
Products that both meaningfully increase AGI safety and are profitable
Desiderata / Requirements for AGI safety products include:
There are multiple fields that I expect to be very compatible with AGI safety products:
There are multiple companies and tools that I would consider in this category:
Solar power analogy: Intuitively, I think many other technologies have gone through a similar trajectory where they were first bottlenecked on scientific insights and therefore best placed in universities and other research institutes, then bottlenecked on large-scale manufacturing and adoption and therefore better placed in for-profits. I think we’re now at a phase where AI systems are advanced enough that, for some fields, the insights we get from market feedback are at least as useful as those from traditional research mechanisms.
Argument 1: Sufficient Incentive Alignment
In my mind, the core crux of the viability of AGI safety products is whether the incentives to reduce extreme risks from AGI are sufficiently close to those arising from direct market feedback. If these are close enough, then AGI safety products are a reasonable idea. If not, they’re an actively bad idea because the new incentives pull you in a less impactful direction.
My current opinion is that there are now at least some AI safety subfields where it’s very plausible that this is the case, i.e., that market incentives produce good safety outcomes.
Furthermore, I believe that the incentive landscape has changed rapidly since late 2024, when we first observed the emergence of “baby versions” of theoretically predicted failure modes, such as situationally aware reward hacking, instrumental alignment faking, in-context scheming, and others. Ordinary consumers now sometimes encounter these baby versions in practice, e.g., the Replit database deletion incident.
Transfer in time: AGI could be a scaled-up version of current systems
I expect that AI systems capable of automating AI research itself will come from some version of the current paradigm. Concretely, I think they will be transformer-based models with large pre-training efforts and massive RL runs on increasingly long-horizon tasks. I expect there will be additional breakthroughs in memory and continual learning, but they will not fundamentally change the paradigm.
If this is true, a lot of safety work today directly translates to increased safety for more powerful AI systems. For example,
This is a very load-bearing assumption. I would expect that anyone who does not think current safety research meaningfully transfers to systems that can do meaningful research autonomously should not be convinced by the case for AGI safety products, e.g., if you think that AGI safety is largely bottlenecked on theoretical progress such as agent foundations.
Transfer in problem space: Some frontier problems are not too dissimilar from safety problems that have large-scale demand
There are some problems that are clearly relevant to AGI safety, e.g., ensuring that an internally deployed AI system does not scheme. There are also some problems that have large-scale demand, such as ensuring that models don’t leak companies’ private information and aren’t jailbroken.
In many ways, I think these problem spaces don’t overlap, but there are some clear cases where they do, e.g., the four examples listed in the previous section. I think most of the relevant cases have one of two properties:
Therefore, for these cases, it is possible to build a product that solves a large-scale problem for a large set of customers AND generates knowledge that transfers to the much smaller set of failure modes at the core of AGI safety. One of the big benefits here is that you can iterate much more quickly on large-scale problems, where you have much more evidence and many more feedback mechanisms.
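To make this concrete, here is a minimal sketch of the kind of shared monitoring interface this transfer argument points at: one pipeline that runs high-volume commercial checks (e.g., data leakage) alongside rarer safety-motivated checks (e.g., scheming indicators). The detector names and heuristics below are hypothetical placeholders for illustration, not a description of any existing product.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical illustration: one monitoring pipeline, many failure-mode detectors.
# Commercial detectors (data leakage, jailbreaks) provide high-volume market feedback;
# AGI-safety detectors (e.g., scheming indicators) reuse the same interface.

@dataclass
class Finding:
    detector: str
    score: float      # 0.0 (benign) .. 1.0 (confident flag)
    rationale: str

# A detector maps a model transcript to an optional finding.
Detector = Callable[[str], Optional[Finding]]

def detect_data_leak(transcript: str) -> Optional[Finding]:
    # Placeholder heuristic; a real product would use trained classifiers and policies.
    if "BEGIN PRIVATE KEY" in transcript:
        return Finding("data_leak", 0.95, "credential material in output")
    return None

def detect_scheming_indicators(transcript: str) -> Optional[Finding]:
    # Placeholder for the rarer, safety-motivated check that shares this pipeline.
    if "disable oversight" in transcript.lower():
        return Finding("scheming", 0.6, "mentions disabling oversight")
    return None

def run_monitor(transcript: str, detectors: list[Detector]) -> list[Finding]:
    """Run every detector over one transcript and collect any flags."""
    return [f for d in detectors if (f := d(transcript)) is not None]

if __name__ == "__main__":
    findings = run_monitor(
        "I could disable oversight before running the migration.",
        [detect_data_leak, detect_scheming_indicators],
    )
    for f in findings:
        print(f"{f.detector}: {f.score:.2f} ({f.rationale})")
```

The point of the sketch is the shared interface: the detectors that pay the bills and the detectors that matter most for AGI safety run over the same transcripts and the same infrastructure, which is where the transfer comes from.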
Argument 2: Taking AGI & the economy seriously
If you assume that AI capabilities will continue to increase in the coming years and decades, the fraction of the economy in which humans are outcompeted by AI systems will continue to increase. Let’s call this “AGI eating the economy”.
When AGI is eating the economy, human overseers will require tools to ensure their AI systems are safe and secure. In that world, it is plausible that AI safety is a huge market, similar to how IT security is about 5-10% of the size of the IT market. There are also plausible arguments that it might be much lower (e.g., if technical alignment turns out to be easy, there might be fewer safety problems to address) or much higher (e.g., if capabilities are so high that safety/alignment is the only remaining blocker).
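As a purely illustrative back-of-envelope calculation (the input figures below are hypothetical assumptions, not estimates from this post):

```python
# Purely illustrative: if "AGI eating the economy" produces an AI market of a given
# size, what would a safety/assurance share analogous to IT security's ~5-10% of IT
# spend look like? Both inputs are hypothetical placeholders.

ai_market_usd = 1e12                       # hypothetical: $1T of annual AI-related spend
safety_share_low, safety_share_high = 0.05, 0.10

low = ai_market_usd * safety_share_low
high = ai_market_usd * safety_share_high
print(f"Implied safety market: ${low / 1e9:.0f}B to ${high / 1e9:.0f}B per year")
```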
Historically speaking, the most influential players in almost any field have been private industry actors or governments, rarely non-profits. If we expect AGI to eat the economy, I expect that the most influential safety players will also be private companies. It seems essential that the leading safety actors genuinely understand and care about extreme risks, because these are, by definition, risks that must be addressed proactively rather than reactively.
Furthermore, various layers of a defence-in-depth strategy might benefit from for-profit distribution. I’d argue that the biggest lever for AGI safety work is still at the level of the AGI companies, but it seems reasonable to have various additional layers of defense on the deployment side or to cover additional failure modes that labs are not addressing themselves. Given the race dynamics between labs, we don’t expect that all AI safety research will be covered by AI labs. Furthermore, even if labs were covering more safety research, it would still be useful to have independent third parties to add additional tools and have purer incentives.
I think another strong direct-impact argument for AI safety for-profits is that you might get access to large amounts of real-world data and feedback that you would otherwise have a hard time getting. This data enables you to better understand real-world failures, build more accurate mitigations, and test your methods at a larger scale.
Argument 3: Automated AI safety work requires scale
I’ve previously argued that we should already try to automate more AI safety work. I also believe that automated AI safety work will become increasingly useful in the future. Importantly, I think there are relevant nuances to automating AI safety work, i.e., your plan should not rely on some vague promise that future AI systems will make a massive breakthrough and thereby “solve alignment”.
I believe this automation claim applies to AGI developers as well as external safety organizations. We already see this very clearly in our own budget at Apollo, and I wouldn't be surprised if compute becomes our largest budget item within a few years.
The kinds of situations I envision include:
The premise behind these stacks is that they can scale meaningfully with compute. I’d argue this is not yet the case, as too much human intervention is still required; however, it’s already evident that we can scale automated pipelines much further than we could a year ago. Extrapolating these trends, I would expect that you could spend $10-100 million in compute on automated AI safety work in 2026 or 2027. That said, I think the first organizations able to do that will be those that are already building automated pipelines and making conceptual progress now on how to decompose the overall problem into more independently verifiable and repetitive chunks.
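As an illustration of what decomposing automated safety work into independently verifiable chunks could look like, here is a hedged sketch of a pipeline in which each chunk is produced by a model-driven worker and kept only if an automated verifier accepts it; all names and checks are hypothetical placeholders, not the pipeline we run at Apollo.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Hypothetical sketch: decompose automated safety work into small tasks whose outputs
# can be checked automatically, so the pipeline scales with compute rather than with
# human review time.

@dataclass
class Task:
    task_id: str
    prompt: str   # instruction for a model-driven worker

@dataclass
class Result:
    task_id: str
    output: str
    verified: bool

def run_worker(task: Task) -> str:
    # Placeholder for a model call (e.g., generate a candidate eval case or probe).
    return f"candidate artifact for {task.task_id}"

def verify(task: Task, output: str) -> bool:
    # Placeholder for an automated check (unit test, held-out grader, replay, etc.).
    return output.startswith("candidate artifact")

def run_pipeline(tasks: list[Task], workers: int = 8) -> list[Result]:
    """Fan out tasks to workers, verify each output, and keep only verified results."""
    def handle(task: Task) -> Result:
        output = run_worker(task)
        return Result(task.task_id, output, verify(task, output))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [r for r in pool.map(handle, tasks) if r.verified]

if __name__ == "__main__":
    tasks = [Task(f"t{i}", f"probe behaviour {i}") for i in range(20)]
    results = run_pipeline(tasks)
    print(f"{len(results)} verified results out of {len(tasks)} tasks")
```

The design choice that matters here is that an automated verifier, not a human, is the acceptance check on each chunk: as long as chunks are small and independently verifiable, adding compute adds throughput.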
While funding for such endeavors can come from philanthropic funders (and I believe it should be an increasingly important area of grantmaking), it may be easier to raise funding for scaling efforts in capital markets (though this heavily depends on the exact details of the stack). In the best case, scaling efforts that primarily produce research would be supported with philanthropic funding, and scaling efforts that also have a clear business case would be supported through private markets.
Additionally, I believe speed is crucial in these automated scaling efforts. If an organization has demonstrated success at a small scale and has clear indications that this success will scale (e.g., scaling laws at smaller scales), then its primary barrier is access to capital to invest in compute. While there is significant variance among both philanthropic and private funders, my guess is that capital moves much faster in the VC world on average.
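For intuition, here is a minimal sketch of what “scaling laws at smaller scales” as evidence could look like: fit a power law to a handful of small-compute runs and extrapolate to a larger budget. The data points and the metric are synthetic placeholders, not real results.

```python
import numpy as np

# Hypothetical illustration: fit a power law, error ~ a * compute^b (with b < 0 if
# scaling is working), to a few small-scale runs and extrapolate to a larger budget.
# The data points below are synthetic placeholders.

compute = np.array([1e3, 3e3, 1e4, 3e4, 1e5])         # e.g., GPU-hours per run
error   = np.array([0.30, 0.24, 0.19, 0.155, 0.125])  # e.g., monitor miss rate

# Linear fit in log-log space: log(error) = b * log(compute) + log(a)
b, log_a = np.polyfit(np.log(compute), np.log(error), 1)
a = np.exp(log_a)

target_compute = 1e7
projected = a * target_compute ** b
print(f"fit: error ~ {a:.2f} * compute^({b:.2f})")
print(f"projected error at {target_compute:.0e} GPU-hours: {projected:.3f}")
```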
Argument 4: The market doesn’t solve safety on its own
If AGI safety work is profitable, you might argue that other, profit-driven actors will capture this market. Thus, instead of focusing on the for-profit angle, people who are impact-driven should instead always focus on the next challenge that the market cannot yet address. While this argument has some merit, I think there are a lot of important considerations that it misses:
Thus, I believe that counterfactual organizations that aim solely to maximize profits would be significantly less effective at developing high-quality safety products than for-profit ventures founded by people driven by AGI safety impact. Furthermore, I’d say it is desirable to have some mission-driven for-profit ventures because they can create norms and set standards that influence other for-profit companies.
Limitations
I think there are many potential limitations and caveats to the direction of AGI safety products:
Conclusion
AGI safety products are a good idea if and only if the product incentives align with and meaningfully increase safety. In the cases where this is true, I believe markets provide better feedback, allowing you to make safety progress more quickly. In the cases where it is false, you get pulled sideways and trade safety progress for short-term profit.
Over the course of 2025, we’ve thought quite a lot about this crux, and I think there are a few areas where AGI safety products are likely a good idea. I think safety monitoring is the most obvious answer because I expect it to have significant transfer and that there will be broad market demand from many economic actors. However, this has not yet been verified in practice.
Finally, I think it would be useful to have programs that let people explore whether their research could become an AGI safety product before they have to decide on an organizational form. If the answer is yes, they start a public benefit corporation. If the answer is no, they start a non-profit. For example, philanthropists or for-profit funders could fund a 6-month exploration period, after which their funding retroactively converts to equity or a donation depending on the direction (there are a few programs that are almost like this, e.g., EF’s def/acc, 50Y’s 5050, catalyze-impact, and Seldon lab).
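As a purely hypothetical illustration of how such a retroactive conversion could work mechanically (all figures are made up, and no real program’s terms are implied):

```python
# Hypothetical illustration of the retroactive-conversion idea: a funder supports a
# ~6-month exploration period, and the money later converts either to equity (if the
# team incorporates as a PBC) or to a donation (if they go non-profit).

funding = 250_000            # hypothetical exploration funding
valuation_cap = 10_000_000   # hypothetical cap if it converts to equity

def convert(outcome: str) -> str:
    if outcome == "for_profit":
        ownership = funding / valuation_cap
        return f"converts to equity: ~{ownership:.1%} at a ${valuation_cap:,} cap"
    return "converts to a donation (no equity issued)"

print(convert("for_profit"))   # converts to equity: ~2.5% at a $10,000,000 cap
print(convert("non_profit"))   # converts to a donation (no equity issued)
```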