One thing that could explain the lack of safety blog-posts for Anthropic is that a lot of their blog-posts appear only on https://alignment.anthropic.com/ and not on https://www.anthropic.com/research. So it seems like your scraper (which I think only goes through the latter) undercounts Anthropic's safety blog-posts.
It would be useful if you could share the dataframe/csv you generated which had the blog-posts and Gemini's classification!
Claude 4.5 generated a list of blog-posts that appear only at the former link (the alignment blog). Examples include:
- Activation Oracles
- Towards training-time mitigations for alignment faking in RL
- Open Source Replication of the Auditing Game Model Organism
- Anthropic Fellows Program announcements
- Evaluating honesty and lie detection techniques
- Strengthening Red Teams
- Stress-testing model specs
- Inoculation Prompting
- Findings from a Pilot Anthropic–OpenAI Alignment Evaluation Exercise
- Subliminal Learning
- Do reasoning models use their scratchpad like we do?
- Publicly Releasing CoT Faithfulness Evaluations
- Putting up Bumpers
- How to Replicate and Extend our Alignment Faking Demo
- Three Sketches of ASL-4 Safety Case Components
- And many others
[This is a cross-post from here. Find the code used to do the analysis here.]
Epistemic Status: Accurate measurement of a variable with dubious connection to the latent variable of interest.
What share of AI companies' research portfolio should be dedicated to AI safety? This is one of the most important questions of our time, and not easy to answer. To reveal my motivation here, I think this share should be very high. Instead of arguing the case for and against, let's first answer the much simpler question of what that share is in the first place.
In my social circle, it is generally believed that there is a hierarchy of which companies dedicate more or fewer resources to making AI go well. It might be summarized as Anthropic > Deepmind = OpenAI = Thinking Machines > Mistral = Meta = x.AI > Zhipu = DeepSeek = Alibaba. Here, we'll find out whether the volume of publications matches up with those intuitions, specifically about OpenAI, Anthropic and Google Deepmind.
We programmatically go through every publication on the OpenAI Blog, Anthropic Blog and Deepmind's publications index. Other companies that would be interesting to look at don't have as nice a collection of aggregated publications, and are less important, so we are content with these three. We get 59 blog posts from OpenAI, from 2016 to 2025, 86 from Anthropic, from 2021 to 2025, and 233 papers from Deepmind, though their current index only starts in 2023.
For each research output, we scrape the title and date. We then put the titles through Gemini-Flash-3, prompting it to assign a probability distribution over the topic being (safety, capabilities, or other). We classify into safety- or non-safety-related articles by rounding the safety probability to a binary indicator. We then estimate the probability of a research output being about safety at each time point. We assume continuity in time and model with a piecewise-linear B-spline regression with binomial response, separately for each company. We compute confidence intervals at level 0.1.
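To make the modelling step above concrete, here is a minimal sketch in plain Python. It is not the code used for the analysis: the toy data, the knot placement, the tie-breaking rule at probability 0.5, the gradient-ascent fit, and all function names (`binarize`, `hat_basis`, `fit_safety_share`, `predict`) are assumptions made for illustration. It rounds each safety probability to a 0/1 label, builds a degree-1 (piecewise-linear) B-spline basis over time, and fits a logistic regression with binomial response for one company. Confidence intervals are left out.

```python
import math


def binarize(p_safety, threshold=0.5):
    """Round a safety probability to a 0/1 label (ties go to 1, an assumption)."""
    return 1 if p_safety >= threshold else 0


def hat_basis(t, knots):
    """Degree-1 (piecewise-linear) B-spline basis evaluated at time t.

    Each basis function equals 1 at its own knot and falls linearly to 0 at
    the neighbouring knots, so the returned values always sum to 1.
    """
    t = min(max(t, knots[0]), knots[-1])  # clamp to the knot range
    row = [0.0] * len(knots)
    for j in range(len(knots) - 1):
        lo, hi = knots[j], knots[j + 1]
        if lo <= t <= hi:
            w = (t - lo) / (hi - lo)
            row[j] = 1.0 - w
            row[j + 1] = w
            break
    return row


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_safety_share(times, labels, knots, steps=5000, lr=1.0):
    """Fit logit P(safety | t) = sum_j beta_j * B_j(t) by gradient ascent
    on the mean binomial log-likelihood (a simple stand-in for a GLM solver)."""
    X = [hat_basis(t, knots) for t in times]
    beta = [0.0] * len(knots)
    n = len(times)
    for _ in range(steps):
        grad = [0.0] * len(knots)
        for row, y in zip(X, labels):
            p = sigmoid(sum(b * x for b, x in zip(beta, row)))
            for j, x in enumerate(row):
                grad[j] += (y - p) * x / n
        beta = [b + lr * g for b, g in zip(beta, grad)]
    return beta


def predict(t, beta, knots):
    """Estimated probability that an output published at time t is about safety."""
    return sigmoid(sum(b * x for b, x in zip(beta, hat_basis(t, knots))))


# Toy data (made up): (year, classifier safety probability) for one company.
toy = [(2023, 0.9), (2023, 0.8), (2023, 0.7), (2023, 0.2),
       (2024, 0.9), (2024, 0.6), (2024, 0.3), (2024, 0.1),
       (2025, 0.8), (2025, 0.4), (2025, 0.2), (2025, 0.1)]
times = [t for t, _ in toy]
labels = [binarize(p) for _, p in toy]
knots = [2023, 2024, 2025]  # assumed knot placement: one knot per year

beta = fit_safety_share(times, labels, knots)
for year in knots:
    print(year, round(predict(year, beta, knots), 3))
```

Because the basis functions sum to 1, no separate intercept is needed, and each coefficient is the log-odds of a safety output at its knot. In practice one would likely use a GLM routine from a statistics library, which also provides the standard errors needed for confidence intervals.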
The difference between Deepmind and OpenAI/Anthropic should be discounted because putting something on a paper index is not the same as putting a blog post on the company website's front page. In particular, the share for Deepmind seems more reflective of the true share of researcher time dedicated to safety than the shares for the other two. Also, it seems like OpenAI has a higher bar for putting out blog posts than Anthropic. Note further that the confidence intervals, even at level 0.1, overlap or almost overlap. Still, I sense a contradiction between the data and public opinion regarding each company.
OpenAI seems comparatively much better than it is credited for. Perhaps more importantly, it is improving. Critics might call some of their alignment work behind, or even derivative of, Anthropic's (e.g. RSP vs. Preparedness, Aardvark vs. PETRI), but in terms of quantity things are starting to look a little better. This would be expected, rational behavior as their models become more powerful.
Deepmind also seems to be improving slightly, though seeing it this directly does clarify how much of their work is about applications and experimental capabilities. This also matches up with the reports of safety researchers there whom I personally know, who seem to report more difficulty getting resources or permission to publish. It seems reasonable to expect a more credible commitment to safety vs. the other companies.
The biggest surprise in this data is the very robust downwards trend for Anthropic. It could be that the share of safety research hasn't changed, and it's simply that the part of the organization responsible for capabilities (e.g. Claude Code) has become more vocal and interesting for the public. Still, I feel comfortable concluding that Anthropic's reputation as the safety company is mostly a result of the 2023 era. If people were to freshly evaluate the two companies just by their output today, they might end up ranking both companies equally. Let's hope that the negative trend for Anthropic does not continue into the future, to the extent it is measuring something interesting.
Again, the biggest fault of this analysis (as often in statistics) is treating each output as an identical observational unit, even though those units mean something quite different between the three companies. A better version would go through preprints instead of blog posts, and perhaps weight by author count or some other measure of estimated research effort. This would also enable comparing companies that don't keep a neat registry of their research. Finally, more samples would be good to increase the power. If you want to see that, comment or write me an email at lennart@finke.dev. A much better version of what we did here is the Future of Life Institute's AI Safety Index.