I'm wondering whether there is consensus on the net value/detriment of these two AI activities:

  1. Integrations: plugging new AI models into various places, e.g. taking a cool new LLM and building profitable products with it.
    1. Con: integrations that improve products increase the relative value of AI products, which means marginally higher investment in capabilities.
  2. Consuming overhangs: taking a cool new LLM and seeing how much performance you can squeeze out of it by, for example, layering decision trees on top of it, shrinking its training set, or making its training more efficient.
    1. Pro: if we leave a lot of overhangs unconsumed until later-stage AI algorithms arrive, we could see much bigger, unexpected jumps in capability.
    2. Con: consuming overhangs makes AIs more useful, which increases the desire to invest in capabilities on net. It may also directly speed up AI development: we're reaching the point where AIs can contribute (at least somewhat) to capability gains, and an AI made more powerful by consuming overhangs accelerates every field it has capabilities in. Even if it speeds up alignment and capabilities by the same factor, shortening the timeline to AGI is likely net-negative, because factors beyond narrow alignment research (political, social, etc.) are not ready yet either.

I am not sure whether there is a neat theoretical answer to these questions; it might be an empirical question of where the equilibrium falls. Still, I would love to read some thoughts on this, and especially pointers to prior art, because a lot of engineers and researchers are being thrown at both integrations and overhangs. If either is net-harmful, it's important to be able to steer people away from it.

Answer by Ann
I think you may also want to consider the dependency consequences when it comes to APIs and integrations.

Once you've integrated a piece of software into another piece of software, you generally want it to stay the same and stay available, so you're somewhat invested in directing resources toward maintenance rather than new capabilities.

A second note: overhang consumption on less powerful models doesn't necessarily transfer well to more powerful ones. The techniques become redundant with the more powerful models' existing capabilities, and the skill remains mostly useful for getting more out of less powerful models.

You can't control the hype, but from a software-development perspective, trying to develop integrations and consume overhangs will likely push toward wanting more stable capabilities over time and more gradual transitions.

From my perspective, trying to actually extract value from what we already have, with invested time and money, will slow the pace of base capabilities if anything. And understanding what we have already wrought, and how it can be utilized, is important.