Isn't it very likely that AI safety research is one of the first things to be cut if AI companies start to have less access to VC money? I don't think these companies have a huge incentive to fund AI safety work, particularly not in a way that the people allocating funding would understand. Isn't this a huge problem? Maybe this has been addressed and I missed it.
By analogy: isn't research into why tobacco causes cancer going to be defunded if the tobacco companies themselves are defunded?
Why are you expecting tobacco companies to do research on cancer in the first place? Shutting them down is the only sane move here.
Imagine that tobacco companies had the ability to make cigarettes harmless. Then the companies would be interested in doing so, because it would dissuade some people who would otherwise quit smoking. However, a company that is running out of money would be more interested in short-term survival than in making cigarettes harmless in the long term.
Returning to the AI companies: they need more compute and more money. As Altman remarked when questioned about Sora:
we do mostly need the capital to build AI that can do science, and for sure we are focused on AGI with almost all of our research effort. it is also nice to show people cool new tech/products along the way, make them smile, and hopefully make some money given all that compute need.
While AI safety research is likely Anthropic's schtick, other companies aren't that incentivized to do it. We have already seen OpenAI reopen the gates to hell by bringing GPT-4o back to flatter its users. And that's ignoring Character.AI, Meta, and xAI, which summon their own demons: they prompt and finetune LLMs to roleplay, respectively, as many characters, as characters like a flirty stepmom or a Russian girl, and as Ani the yandere.
Imagine that tobacco companies had the ability to make cigarettes harmless.
Even in this case I would consider you self-serving, not altruistic, if your plan is to build a large tobacco company and fund this research yourself, as opposed to demanding independent research, such as from academia.
AI alignment research isn't just for more powerful future AIs, it's commercially useful right now. In order to have a product customers will want to buy, it's useful for companies building AI systems to make sure those systems understand and follow the user's instructions, without doing things that would be damaging to the user's interests or the company's reputation. An AI company doesn't want its products to do things like leaking the user's personally identifying information or writing code with hard-coded unit test results. In fact, alignment is probably the biggest bottleneck right now on the commercial applicability of agentic tool use systems.
I think this is true to an extent, but not fully.
I think it's quite unlikely that funding certain kinds of essential AI safety research leads to more profitable AI.
Namely mechinterp and preventing things like scheming. Not all AI safety research is aimed at getting the model to follow the user's prompt, yet that research may be very important for things like existential risk.
The opportunity cost is funding research into how you can make your model more engaging, performant, or cheaper. I would be surprised if those things aren't far more effective per dollar.