Startups can become integrated into the AI supply chain, giving them first-hand information about which safety interventions are valuable. Safety becomes a feature shipped directly to users by virtue of this market position.
For-profits have better access to capital, talent, and ecosystem support than non-profits. VC funding dwarfs philanthropic funding, and there is little reason to believe that profitable safety-focused businesses are impossible.
Joining a frontier lab is a clear alternative, but most AI deployment happens outside labs, and your marginal impact inside a large organisation is often smaller than your impact founding something new. Equally, profitable safety businesses are not inevitable; someone has to build them. You should seriously consider working for or founding an AI safety startup.
Introduction
Markets are terrible at pricing safety. In the absence of regulation, companies cut corners and externalise risks to society. And yet, for-profits may be the most effective vehicle we have for deploying safety at scale. Not because the incentives of capitalism align by chance with broader human values, but because the alternatives lack the resources, feedback loops, and distribution channels to turn safety insights into safer outcomes. For-profits are far from perfect, but have many advantages and a latent potential we should not ignore.
Information, Integration, and Safety as a Product
For advanced AI, the attack surface is phenomenally broad. It makes existing code easier to crack. Propaganda becomes cheaper to produce and more effective to distribute. As jailbreaking AI recruiters becomes possible, so does the data-poisoning of entire companies.
Information about new threats and evolving issues isn’t broadcast to the world. Understanding where risk is most severe and how it can be mitigated is an empirical question. We need entities embedded all across the stack, from model development to deployment to evaluation. We need visibility over how this technology is used and misused, and enough presence to intervene when needed. ‘AI safety central command’ cannot provide all these insights. Researchers acting without direct and constant experience with AI deployment cannot identify the relevant details.
Revenue is a reality check. If your product is being bought, people want it. If it isn’t, they either don’t know about it, don’t think it’s worth it, or don’t want it at all. For-profits learn what matters in an industry directly from the people they serve, giving the best insights money could buy.
This is not to say that AI safety non-profits aren’t valuable. Many do critical work which is difficult to support commercially. But by focusing entirely on research or advocacy and ignoring the commercial potential of their work, organisations cut themselves off from a powerful source of feedback. Research directions, careers, and even whole organisations can be sustained for years by persuading grantmakers and fellow researchers of a thesis, rather than proving value to people who would actually use the work. Without this corrective pressure, even well-intentioned research may drift from what the field actually needs. Commercialisation should not be seen as a distraction or a response to limited funding, but as a tool for staying at the bleeding edge of what is useful for the world.
Productification
Turning research into a product people can buy is extremely powerful for distribution. You are no longer hoping that executives, engineers, and politicians see value in work they do not understand tackling risks they may not believe in. It becomes a purchase. A budget decision. A risk-reward tradeoff that large organisations are very well suited to engage with.
There are clear gaps in securing AI infrastructure which can be filled today. If you’re wondering what an AI safety startup might actually do, here are some suggestions for commercial interventions targeting different parts of the stack.
Frontier Models: Interpretability tooling, evaluations infrastructure, and formal verification environments. Tools which might be implemented by labs and companies with direct access to frontier models to understand and control them better.
Applications: Content screening, red-teaming as a service, and monitoring for misuse. Helping startups building on frontier models catch accidental or deliberate misuse of their platforms.
Enterprise Deployments: Observability platforms, run-time guardrails (a minimal guardrail sketch follows this list), and hallucination detection. Enterprises and governments using AI to automate critical work should be able to catch issues early and reliably.
Market Incentives: Model audit and certification, and safety-linked insurance. Creating market incentives which reward safer models when they’re released into the world.
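To make the run-time guardrail idea concrete, here is a minimal sketch of what an input/output guardrail wrapper might look like. Everything in it (the Verdict type, the check functions, the thresholds) is invented for illustration; a real product would use learned classifiers and configurable policy engines rather than toy keyword and digit checks.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Verdict:
    allowed: bool
    reason: str = ""

def pii_filter(text: str) -> Verdict:
    # Toy check: flag long runs of digits that could be payment-card data.
    if sum(c.isdigit() for c in text) >= 13:
        return Verdict(False, "possible payment-card data")
    return Verdict(True)

def policy_filter(text: str) -> Verdict:
    # Toy keyword screen standing in for a learned harm classifier.
    banned = ("how to build a bomb",)
    lowered = text.lower()
    for phrase in banned:
        if phrase in lowered:
            return Verdict(False, f"matched banned phrase: {phrase!r}")
    return Verdict(True)

def guarded_call(prompt: str, model_call: Callable[[str], str]) -> str:
    """Screen the input, call the model, then screen the output."""
    for check in (pii_filter, policy_filter):
        verdict = check(prompt)
        if not verdict.allowed:
            return f"[blocked on input: {verdict.reason}]"
    response = model_call(prompt)
    verdict = policy_filter(response)
    if not verdict.allowed:
        return f"[blocked on output: {verdict.reason}]"
    return response

if __name__ == "__main__":
    echo_model = lambda p: f"(model answer to: {p})"
    print(guarded_call("What is the capital of France?", echo_model))
    print(guarded_call("My card is 4111 1111 1111 1111", echo_model))
```

The design point is simply that the wrapper sits between the user and the model, so safety checks ship as part of the product rather than depending on whoever built `model_call`.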
None of these require waiting for frontier labs to solve alignment, or hoping that someone else finds your work and decides to implement it. Instead of writing white papers hoping governments will regulate or frontier labs will dutifully listen, you build safety directly into products that customers come to rely on. One path hopes someone will do the work, whereas the other is the work.
Safety Across The Stack
When you tap your card at a shop to make a purchase, a network of financial institutions plays a role in processing your transaction. The point-of-sale system reads your card information and sends it on to a payment processor, which forwards the request to the appropriate card network. The issuing bank for your card authorises the transaction, the money is sent to clearing systems, and settlement is performed, often through a central bank settlement system.
There is a sense in which all fraud happens at a bank. They have to release the fraudulent funds, after all. But the declaration that all fraud prevention initiatives should be focused on banks and banks alone comes across as fundamentally confused. Fraud prevention might be easier at other layers, and refusing to take those opportunities simply because it is in principle preventable at some more central stage would not lead to the best allocation of resources.
Similarly, when a user prompts an AI application, they are not simply submitting an instruction directly to a frontier model company. Just as tapping your card does more than instruct your bank, such a message goes through guardrails, model routing, observability layers, and finally frontier model safety measures. Every step of this process is an opportunity for robustness we should not let go to waste.
This becomes even more critical as AI agents begin acting autonomously in the world, doing everything from browsing and transacting to writing and executing sophisticated code. When an agent’s action passes through multiple services before having an effect, every link in that chain is both a potential failure point and an opportunity for a safety check.
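As a toy illustration of this defence-in-depth structure, the sketch below chains independent safety layers, each of which sees a proposed agent action and may veto it. All layer names and checks are hypothetical; the point is the shape, not the specific rules: any single layer can stop a bad action, and every layer observes it.

```python
from typing import Callable, List, Optional

# A layer inspects a proposed agent action and returns a reason to
# block it, or None to let the action continue down the chain.
SafetyLayer = Callable[[dict], Optional[str]]

def observability_log(action: dict) -> Optional[str]:
    # Observability layers watch everything but veto nothing.
    print(f"audit: {action['tool']}({action['args']!r})")
    return None

def budget_check(action: dict) -> Optional[str]:
    if action.get("cost_estimate", 0) > 100:
        return "exceeds per-action spend limit"
    return None

def app_guardrail(action: dict) -> Optional[str]:
    if action["tool"] == "shell" and "rm -rf" in action["args"]:
        return "destructive shell command"
    return None

def run_through_stack(action: dict, layers: List[SafetyLayer]) -> bool:
    """Defence in depth: every layer sees the action, any layer can stop it."""
    for layer in layers:
        reason = layer(action)
        if reason is not None:
            print(f"blocked by {layer.__name__}: {reason}")
            return False
    return True  # all layers passed; the action may execute

if __name__ == "__main__":
    layers = [observability_log, budget_check, app_guardrail]
    run_through_stack({"tool": "shell", "args": "rm -rf /", "cost_estimate": 0}, layers)
    run_through_stack({"tool": "browser", "args": "open docs page", "cost_estimate": 1}, layers)
```

Keeping each layer self-contained is what allows different companies to own different links in the chain, which is exactly the opportunity this section describes.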
Exclusively focusing AI safety interventions on frontier labs would be like securing the entire financial system by regulating only banks. Necessary, but nowhere near the most efficient or robust approach.
Capital, Talent, and Credibility
Successful for-profits are in an inherently better position to acquire resources than non-profits. Their path to funding, talent acquisition, and long-term influence is far stronger than that of their charitable counterparts.
There is an immense amount of venture capital washing around the AI space, estimated at ~$190 billion in 2025. Flapping Airplanes raised $180 million in a single round, comparable to what some of the largest AI safety grantmakers deploy annually, in a fraction of the time. VC allows you to raise at speed, try many approaches, and pivot more freely than would be possible in academia or when reliant on slower charitable funders.
In AI safety, non-profits are less likely than in other sectors to be trapped in an economic struggle for survival. However, even in the AI safety ecosystem, philanthropy is much more limited than venture capital and more tightly concentrated among fewer funders. Non-profits are vulnerable not just to the total capital available, but to the shifting attitudes of the specific grantmakers they rely on. VC-backed companies, by contrast, are much more resilient to the ideological priorities of funders: if one loses interest, many others remain available as long as you have a strong business case.
Yes, there is a large amount of philanthropic capital in AI safety compared with typical non-profit sectors. Safety products can also be difficult to sell. But whether safety-focused products can sell well, as they do in other industries, is a hypothesis you can go out and test. If it turns out that they do, there could be an immense amount of capital available to make our world safer.
For-profits attract talented people not just through hefty pay packages, but also through their institutional prestige and the social capital they confer. You can offer equity to early employees, which is extremely useful for attracting top technical talent and entirely unavailable to non-profits. Your employees can point with pride to growing valuations, exciting products with sometimes millions or even billions of users, and influential integrations of their technology. For many talented and competent people, this is far more gratifying than publishing research reports or ever so slightly nudging the Overton window.
All of this (increased access to talent, capital, and credibility) makes for-profits far easier to scale. And safety needs to scale. The amount of time we have until transformative AI arrives differs wildly between forecasts, though it seems frighteningly plausible that we have less than a decade to prepare. If we are to scale up the workforce, research capacity, and economic integration of safety-focused products, we cannot afford anything other than the fastest approach to building capacity.
Success compounds. Founders, early employees, and investors in a successful for-profit acquire capital, credibility, and influence that they can reinvest in safety, whether by starting new ventures, funding others, or shaping policy. This virtuous cycle is largely unavailable to non-profit founders, unless they later endow a foundation with, as it happens, money from for-profits.
In addition to tangible resources, a mature ecosystem of advisors and support networks exists to help startups succeed. VC funds, often staffed by ex-founders, provide strategic guidance and industry connections that are crucial for closing sales. There are many talented people who understand what startups offer and actively seek them out. An equivalent ecosystem just doesn’t exist for non-profits.
Shaping The Industry From Within
Being inside an industry is fundamentally different from being adjacent to it.
Embedding an organisation inside AI ecosystems enables both better information gathering and opportunities for intervention. If you can build safe products appropriate to the problems in an industry, you allow companies to easily purchase safety. If companies can purchase safety, then governments can mandate safety. But to get there, it is not enough to make this technology exist; the technology must be something you can buy.
Cloudflare started as a CDN. By becoming technically integrated, they slowly transformed into part of the critical infrastructure of the internet. Now, they make security decisions which shape the entire internet and impact billions of users every day. A safety-focused company embedded in AI infrastructure could do the same.
Will Markets Corrupt Safety?
Market incentives are not purely aligned with safety. The drive to improve capabilities, maximise revenue, and keep research proprietary will harm a profit-seeking organisation’s ability to make AI safer.
However, every institution has its pathologies. The incentives steering research-driven non-profits and academics are not necessarily better.
Pure Research Also Has Misaligned Incentives.
The incentives of safety and capitalism rarely align. The pressure to drive revenue and ship fast pushes towards recklessly cutting corners and building what customers demand in the short term rather than investing in long-term safety.
However, research organisations face similar harmful incentives driving them away from research which is productive in the long term: the need to chase high-profile conference publications, please grantmakers, and build empires. The incentives of research organisations and individual researchers are notoriously misaligned with funders' goals in academia and industry alike. Pursuing a pure goal with limited feedback signals is extremely difficult for any organisation, regardless of structure.
Ideally, we would have both. For-profits which can use revenue as feedback and learn from market realities, alongside non-profits which can take longer-term bets on work needed for safety. The question is how to build a working ecosystem, not which structure is more purely focused on safety.
Proprietary Knowledge Is Not Always Hoarded.
For-profits have an incentive to keep information hidden to retain a competitive advantage. This could block broader adoption of safety techniques and slow researchers' progress.
Assuming that for-profits add resources and people to the AI safety ecosystem, rather than simply moving employees from non-profits, this is still advantageous. We are not choosing between having this research out in the open or hidden inside organisations. We are choosing between having this research hidden or having it not exist at all. In many sectors, the price of innovation is that incumbents conceal and extract rents from their IP for years.
Despite this, for-profits do have agency over what they choose to publish. Volvo famously gave away the patent to their three-point seatbelt at the cost of their own market share, saving an estimated 1 million lives. Tesla gave away all of their electric vehicle patents to help drive adoption of the technology, with Toyota following suit a few years later. Some of this additional knowledge created by expanding the resources in AI safety may still wind up in public hands.
Markets Force Discovery Of Real Problems.
The constant drive to raise money and make a profit is frequently counter to the best long-term interests of the customer. Investment which should be put into making a product safer today instead goes into sales teams, salaries, and metrics designed to reel in investors. It is true that many startups which begin with a strong safety thesis will drift into pure capabilities work or adjacent markets which show higher short-term growth prospects.
However, many initiatives operating without revenue pressure, such as researchers on grants or philanthropically funded non-profits, can work for years on the wrong problem. For-profits can see when they are working on the wrong thing, and revenue pressure drives them to work on something else.
This is not to say that researchers are doing valueless work simply because they are not receiving revenue in the short term. Plenty of work should be done to secure a prosperous future for humanity which businesses will not currently pay for. Rather, mission drift is often a feature rather than a bug when your initial mission was ill-conceived. The discipline markets provide, forcing you to find problems people will pay to solve, is valuable.
Failure Is A Strong Signal.
The institutional failure modes of non-profits and grant-funded research are mostly benign: the research done is not impactful, and time is wasted. On the other hand, for-profits can truly fail in the sense that they fail to drive revenue and go bankrupt, or they can fail in more spectacular ways where they acquire vast resources which are misallocated. The difference is not that for-profits are inherently more likely to steer away from their initial goals.
Uncertainty about impact is common across approaches. Whereas research that goes unadopted fails silently, and advocacy which fails to grab attention disappears without effect, for-profits are granted the opportunity to visibly and transparently fail. The AI safety ecosystem already funds work which fails silently, and is effectively taking larger risks with spending than we realise. Startups aren’t any more likely to fail to achieve their goals; they are in the pleasant position of knowing when they have failed.
Visible failure generates information the ecosystem can learn from. Silent failure vanishes unnoticed.
Your Counterfactual Is Larger Than You Think.
Markets are not efficient. The economy is filled with billion-dollar holes, which are uncovered not only by shifts in the technological and financial landscape but by the tireless work of individuals determined to find them. Just because there is money to be made by providing safety does not mean that it will happen by default without you.
Stripe was founded in 2010. Online payments had existed since the 1990s, and credit-card processing APIs had been available for years. Yet it took until 2010 for someone to build a genuinely developer-friendly API, simply because nobody before the Collison brothers had worked on the problem as hard or as effectively.
Despite online messaging being widely available since the 1980s, Slack wasn’t founded until 2013. The focus, grit, and attention of competent people being applied to a problem can solve issues where the technology has existed for decades.
Markets are terrible at pricing in products which don’t exist yet. Innovation can come in the form of technical breakthroughs, superior product design, or a unique go-to-market strategy. In the case of products and services relevant to improving AI safety, there is an immense amount of opportunity which has appeared in a short amount of time. You cannot assume that all necessary gaps will be filled simply because there is money to be made there.
If your timelines are short, then the imperative to build necessary products sooner rather than later grows even greater. Even if a company is inevitably going to be built in a space, ensuring that it is built 6 months sooner could be the difference between safety being on the market and unsafe AI deployment being the norm.
For many, the alternative to founding a safety company is joining a frontier lab. However, most AI deployment happens outside labs in enterprises, government systems, and consumer-facing applications. If you want to impact how AI meets the world, you may have to go outside of the lab to do it. Your marginal impact inside a large organisation is often, counterintuitively, smaller than the marginal impact you could have building something new.
Historical Precedents
History is littered with examples of companies using their expertise and market position to ship safety without first waiting around for permission.
Sometimes this means investing significant resources and domain expertise to develop something new.
Three-point seatbelt: Volvo developed the three-point seatbelt and gave away the patent. Their combination of in-house technical expertise and industry credibility enabled a safety innovation that transformed the global automotive industry.
Toyota’s hybrid vehicle patents: Toyota gave away many hybrid vehicle patents in an attempt to accelerate the energy transition.
Meta’s release of Llama 3: At a time when only a small number of organisations had the resources to train LLMs from scratch, Meta openly released Llama 3, making it available to safety researchers when little else was in public hands.
Or perhaps the technology already exists, and what matters is having the market position to distribute it or the credibility to change an industry’s standards.
Levi Strauss supply chain audit: At the peak of their market influence, Levi Strauss audited their supply chain, insisting on minimum worker standards as a condition of continued business with suppliers. They enforced workers’ rights in jurisdictions where mistreatment of employees was either legal or poorly monitored, doing what governments couldn’t or weren’t prepared to do.
Cloudflare’s Project Galileo: Cloudflare provides security for small, sensitive websites at no cost. This helps journalists and activists operating in repressive countries avoid being knocked off the internet, and is entirely enabled by Cloudflare’s technology.
WhatsApp end-to-end encryption: The technology existed, and the cryptography research was mature by this point. WhatsApp just built it into their product, delivering privacy protection to billions of users worldwide.
Security for fingerprint and face recognition: Apple stores face and fingerprint data in a separate secure chip, making it extremely difficult to steal or legally compel. This did not require regulation; the decision actually led to clashes with the US government. Because of their market position, Apple was able to push this security feature unilaterally and protect hundreds of millions of users.
All of these required a large company’s resources, expertise, credibility, and market integration to create and distribute valuable technology to the world.
Building a for-profit which customers depend on, be it for observability, routing, or safety tooling, lets you ship safety improvements directly into the ecosystem. When the research exists and the technology is straightforward, a market leader choosing to build it may be the only path to real-world implementation.
It’s Up To You.
For-profits are in a fundamentally strong position to access capital, talent, and information. By selling to other businesses and becoming integrated in AI development, they can not only identify the most pressing issues but directly intervene in them. They build the technological and social environment that makes unsafe products unacceptable and security a commodity to be purchased and relied upon.
Non-profits have done, and will continue to do, critical work in AI safety. But the ecosystem is lopsided. We have researchers and advocates, but not enough builders turning their insights into products that companies buy and depend on. The feedback loops, distribution channels, and ability to rapidly scale that for-profits provide are a necessity if safety is to keep pace with capabilities.
The research exists. The techniques are maturing. Historical precedents show us that companies embedded in an industry can ship safety in ways that outsiders cannot. What’s missing are the people willing to found, join, and build companies that close the gap between safety as a research topic and safety as a market expectation. We cannot assume that markets will bridge this divide on their own in the time we have left. If you have the skills and the conviction, this is a gap you can fill!
If you’re thinking about founding something, joining an early-stage AI safety company, or want to pressure-test an idea - reach out at team@bluedot.org. We’re always happy to talk.
BlueDot’s AGI Strategy Course is also a great starting point - at least 4 startups have come out of it so far, and many participants are working on exciting ideas. Apply here.
Thanks to Ben Norman, Daniel Reti, Maham Saleem, and Aniket Chakravorty for their comments.
Lysander Mawby is a graduate of BlueDot’s first Incubator Week, which he went on to help run in v2 and v3. He is now building an AI safety company and taking part in FR8. Josh Landes is Head of Community and Events at BlueDot and, with Aniket Chakravorty, the initiator of Incubator Week.