ankitmaloo's Shortform

ankitmaloo

This is a special post for quick takes by ankitmaloo. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

I have been in the Bay Area for a month, so take this with as much salt as you deem appropriate.

The fundamental question I have been grappling with since I got here is whether there is enough space for another AI lab. To articulate my proposition more clearly: historically, any major scientific breakthrough answers one question and then creates 10 new questions. One lab, even with funding, would not focus on all those questions at once, so new labs spring up with a narrower focus, each tackling only one of those questions. It's efficient, it moves the world forward, and I have a feeling it's faster.

Looking at the research level, is it possible the same trend would hold in AI too? Transformers led to Anthropic, Google, Cohere, Meta FAIR, DeepSeek, SSI, Thinking Machines, and DeepMind itself all working on some notion of AGI. In the last few months alone, new frontiers have opened up, such as Reinforcement Learning and Behavioral Foundation Models, alongside alignment, safety, interpretability, etc.

My sense is that a separate lab focused on (for example) alignment would be able to get to results faster than, say, an Anthropic or an OpenAI. Reasons:

This is a typical theoretical divergence. The methods needed to make a model safer and more aligned are vastly different from the methods needed to make it better. Research at the frontier becomes narrower as we go forward. Methodologies start to differ and are not as transferable as these labs would like.
A company with a given amount of compute has to prioritize between serving older models, serving the latest models (which are more compute hungry), attracting more customers, and supporting the latest research. All this to say, an alignment researcher has to compete against 10 different priorities. Inevitably, revenue-centric use cases win out (case in point: what happened at OpenAI).
A separate lab brings together people who want to work on the same problem, which gets to results faster in many scenarios. Being independent, such a lab can also work with more companies and come across more scenarios, leading to better modeling.

The same goes for something like reinforcement learning[1] (currently done with verifiable rewards, but with interesting open problems such as sample efficiency, explainability, self-learning, learning in domains where rewards are sparse, and so on), continual learning (without catastrophic forgetting), methods to make smaller models more powerful, finding better solutions than RAG for adding knowledge to models, and so on. The claim is that there is no longer space for a lab with a wider focus, but narrowly focused labs likely have an opportunity.

Another limiting factor here is compute. However, the problem is not as acute as it was in 2023 or even mid-2024, and newer labs can likely get compute at cheaper rates and with higher availability.

A lot of what I wrote seems to be implicitly discouraged in my interactions in SV. Research is by definition slow, and people want to move fast. I understand that. Hence this is a question, not a justification yet.

[1]: I know all three big labs have this as a huge focus area, but I feel they are focused on the generalization aspect. There is a lot of work to be done on the research side to make the current format useful for enterprises, which does not seem to be a huge focus (based on my limited conversations).

Mostly blogging at https://ankitmaloo.com (not a frequent blogger, I just write stuff occasionally).