O O - LessWrong

Hot take: The AI safety movement is way too sectarian and this is greatly increasing p(doom)

That is great to hear, but I find it probable they’ll be ignored, lobbied against, or gamed when they go against business interests.

Hot take: The AI safety movement is way too sectarian and this is greatly increasing p(doom)

O O19h11

Unfortunately, I don't have enough inside access for that...

Yeah, with you there. I am just speculating based on what I've heard online and through the grapevine, so take my model of their internal workings with a grain of salt. With that said, I feel pretty confident in it.

if one is after VC funding, one needs to show those VCs that there is some secret sauce which remains proprietary

IMO a software/algorithmic moat is nearly impossible to keep. Researchers tend to be smart enough to figure things out independently, even if a company manages to stop every researcher from leaving and diffusing knowledge. Some parallels:

The India trade done by Jane Street. They ~~are~~ were making billions of dollars contingent on no one else knowing about this trade, but eventually their alpha also got diffused.
TikTok's content algorithm, which the Chinese government doesn't want to export, took Meta/Google only a couple of months to replicate.

Hot take: The AI safety movement is way too sectarian and this is greatly increasing p(doom)

O O1d10

And I doubt that Microsoft or Google have a program dedicated to "trying everything that look promising", even though it is true that they have manpower and hardware to do just that. But would they choose to do that?

Actually, I'm under the impression that a lot of what they do is just sharing papers in a company Slack and reproducing stuff at scale. Of course, they might intuitively rule out certain approaches that they think are dead ends but that turn out to be promising, but I wouldn't underestimate their agility at adopting new approaches if something unexpected is found.^[1] My mental model is entirely informed by Dwarkesh's interview with DeepMind researchers. They talk about ruthless efficiency in trying out new ideas and seeing what works. They also talk about how having more compute would make them X times better researchers.

Yes, of course, we are seeing a rich stream of promising new things, ranging from evolutionary schemas (many of which tend towards open-endedness and therefore might be particularly unsafe, while very promising) to various derivatives of Mamba to potentially more interpretable architectures (like Kolmogorov-Arnold networks or like recent Memory Mosaics, which is an academic collaboration with Meta, but which has not been a consumer of significant compute yet) to GFlowNet motifs from Bengio group, and so on.

I think these are all at best marginal improvements and will be dwarfed by more compute^[2] or at least will only beat SOTA after being given more compute. I think the space for algo improvement for a given amount of compute is saturated quickly. Also if anything, the average smaller place will over-index on techniques that crank out a little extra performance on smaller models but fail at scale.

Of course, if a place with more hardware decides to adopt a non-standard scheme invented by a relatively hardware-poor place, the place with more hardware would win.

My mental model of the hardware-poor is that they want to publicize their results as fast as they can so they get more clout, get VC funding, or just get acquired by big tech. Academic recognition in the form of citations drives researchers. Getting rich drives the founders.

The key here is that when people discuss "foom", they usually tend to focus on a (rather strong) argument that AGI is likely to be sufficient for "foom". But AGI is not necessary for "foom", one can have "foom" fully in progress before full AGI is achieved ("the road to superintelligence goes not via human equivalence, but around it").

Yes, I agree there is a small possibility, but I find this is almost a "Pascal's mugging". I think there is a stickiness to the AlphaGo model of things that's informing some choices which are objectively bad in a world where the AlphaGo model doesn't hold. The fear response to the low-odds world is not appropriate for the high-odds world.

^[1] I think the time it takes to deploy a model after training is making people think these labs are slower than they actually are.

^[2] As an example, most improvements from Llama-3 came from just training the models on more data (with more compute). Sora looks worse than SOTA approaches until you throw more compute at it.

Hot take: The AI safety movement is way too sectarian and this is greatly increasing p(doom)

O O1d30

The diversity of opinion in the AI existential safety community is at least as big (and is probably even larger, which is natural given that the field is much younger, with its progress being much less certain), but, in addition to that, the diversity is less obfuscated, because it does not have anything resembling the Transformer-based LLM highly successful center around which people can consolidate.

I still don't see how you can validate any alignment ideas without having a lot of compute to test them out on precursor models, or how you can validate them without training misaligned models that won't be released to the public. That's why I have almost zero faith in any of these smaller players. Maybe it's good for small players to publish research so big labs can try to reproduce it. In that case, I still don't see why you would leave your lab to do that. Your idea is much more likely to be reproduced if you influence training runs or have greater access to more diverse models at a lab. I take this as part of why, for example, Neel Nanda joined an AI lab.

But is it actually that unquestionable? Even with Microsoft backing OpenAI, Google should have always been ahead of OpenAI, if it were just a matter of raw compute.

Since hardware jumps are exponential, even if they don't do "yolo runs" like OpenAI does, where they dedicate a large portion of their compute to risky ideas, just wait a few years until their GPUs/TPUs get better and a GPT-4-sized model is much cheaper to train. I expect Google to race ahead of OpenAI sooner or later, at least until Stargate is finished. Arguably their demos have been more technically impressive, even if OpenAI's demos are shinier looking.

I think that non-standard architectural and algorithmic breakthroughs can easily make smaller players competitive, especially as inertia of adherence to "what has been proven before" will inhibit the largest players.

Do these exist? My model is that most ideas just come from experimentally validating papers and hypotheses through trial and error (i.e. compute), and OpenAI wasn't a small player in terms of compute. They also used standard architectures and algorithms. They just took more risks than Google. But a risky "yolo run" now is a "standard training run" in a few years.

Even in one or two years, we'll find that the models can do a lot more involved tasks than they can do now. For example, you could imagine having the models carry out a whole coding project instead of it giving you one suggestion on how to write a function. You could imagine the model taking high-level instructions on what to code and going out on its own, writing any files, and testing it, and looking at the output. It might even iterate on that a bit. So just much more complex tasks.

OK, so we are likely to have that (I don't think he is over-optimistic here), and the models are already very capable of discussing AI research papers and exhibit good comprehension of those papers (that's one of my main use cases for LLMs: to help me understand an AI research paper better and faster). And they will get better at that as well.

This really does not sound like AGI to me (or at least highly depends on what a coding project means here) and prediction+stock markets don't buy that this will be AGI either. This just sounds like a marginally better GPT-4. I'd expect AGI to be as transformative as people expect AGI to be when it can automate hardware research.

Hot take: The AI safety movement is way too sectarian and this is greatly increasing p(doom)

O O1d30

Is there evidence that METR had more than nominal impact? I also think the lack of clout will limit his influence in the government. To some government employee, he’s just someone from a random startup they never heard of having outsized influence. Within that agency he's just a cog in some slow moving behemoth. Within OpenAI he is at least an influential voice in the safety org.

"If we go extinct due to misaligned AI, at least nature will continue, right? ... right?"

O O2d30

Additionally, the AI might think it's in an alignment simulation and just leave the humans as is or even nominally address their needs. This might be mentioned in the linked post, but I want to highlight it. Since we already do very low fidelity alignment simulations by training deceptive models, there is reason to think this.

Ilya Sutskever and Jan Leike resign from OpenAI [updated]

O O3d133

https://x.com/janleike/status/1791498174659715494?s=46&t=lZJAHzXMXI1MgQuyBgEhgA

Leike explains his decisions.

Ilya Sutskever and Jan Leike resign from OpenAI [updated]

O O5d-20

He is an ally of Ilya

O O's Shortform

O O6d3-2

Is this paper essentially implying the scaling hypothesis will converge to a perfect world model? https://arxiv.org/pdf/2405.07987

It says models trained on text modalities and image modalities both converge to the same representation with each training step. It also hypothesizes this is a brain-like representation of the world. Ilya liked this paper, so I’m giving it more weight. Am I reading too much into it, or is it basically fully validating the scaling hypothesis?

GPT-4o is out

O O7d42

It was whelming.
