But I don't expect this kind of understanding to transfer well to understanding Transformers in general, so I'm not sure it's high priority.
The point is not necessarily to improve our understanding of Transformers in general. Rather, if we're pessimistic about interpretability on dense transformers (as markets are, see below), we might be better off speeding up capabilities on architectures we think are much more interpretable.
The idea that EVERY government is dumb and won't figure out a not-too-bad way to allocate its resources toward AGI seems highly unlikely to me. There seem to be many mechanisms by which this could fail to be the case (e.g. national defense is heavily involved and somewhat more competent; the strategy is designed in collaboration with competent people from the private sector; etc.).
To be more precise, I'd be surprised if none of these 7 countries had an ambitious plan that meaningfully changed the strategic landscape post-2030:
I guess I'm a bit less optimistic about the ability of governments to allocate funds efficiently, but I'm not very confident in that.
A fairly dumb-but-efficient strategy that I'd expect some governments to adopt is "give more money to SOTA orgs" or "give SOTA orgs some core roles in your Manhattan Project." That seems likely to me, and it would have substantial effects.
Unfortunately, good compute governance takes time. E.g., if we want to implement hardware-based safety mechanisms, we first have to develop them, then convince governments to mandate them, and then they have to be put on the latest chips, which take several years to dominate the installed compute base.
This is a very interesting point.
I think that some "good compute governance," such as monitoring big training runs, doesn't require on-chip mechanisms, but I agree that any measure involving substantial hardware modifications would probably take a lot of ...
What I'm confident in is that they're more likely to be ahead by then than they are now or within a couple of years. As I said, otherwise my confidence is ~35% that China catches up (or becomes better) by 2035, which is not huge?
My reasoning is that they've been better than the US at optimizing ~everything, mostly because of their centralization and norms (not caring too much about human rights helps with optimization), which is why I think it's likely they'll catch up.
Mostly because they have a lot of resources and thus can carry a lot of weight in the race once they enter it.
Thanks for your comment!
I see your point about fear spreading causing governments to regulate. I basically agree that if that's what happens, it's good to be in a position to shape the regulation in a positive way, or at least to try. I still think I'm more optimistic about corporate governance, which seems more tractable to me than policy governance.
The points you make are good, especially in the second paragraph. My model is that if scale is all you need, then smaller startups are indeed also worrying. I also think there could be visible events in the future that would make some of these startups very serious contenders (happy to DM about that).
Having a clear map of who works on corporate governance and who works more toward policy would be very helpful. Is there anything like a "map/post of who does what in AI governance"?
Have you read note 2? If note 2 were made more visible, would you still think that my claims imply too much certainty?
To be honest, I hesitated to decrease the likelihood on that one based on your consideration, but I still think that 30% for strong effects is quite a lot, because, as you mentioned, it requires the intersection of many conditions.
In particular, you don't mention which interventions you expect from them. If you take the intervention I used as a reference class ("Constrain labs to airgap and box their SOTA models while they train them"), do you think there are measures that are as "extreme" as this, or more so, and that are likely?
What might ...
Thanks for your comment!
First, keep in mind that when people in industry and policymaking talk about "AI", they usually mean mostly non-deep-learning techniques or vision deep learning, simply because they don't know the academic ML field but have heard that "AI" is becoming important in industry. So this sentence is little evidence that Russia (or any other country) is trying to build AGI, and I'm at ~60% that Putin wasn't thinking about AGI when he said that.
...If anyone who could play any role at all in develop
[Cross-posting my answer]
Thanks for your comment!
That's an important point that you're bringing up.
My sense is that at the movement level, the consideration you bring up is super important. Indeed, even though I have fairly short timelines, I would like funders to hedge for long timelines (e.g. fund China AI safety work). So I think big actors should keep their full timelines distribution in mind when optimizing their resource allocation.
That said, despite that, I have two disagreements:
To get a better sense of people's standards for "cutting at the hard core of alignment," I'd be curious to hear examples of work that has done so.
It would be worth paying someone to do this in a centralized way:
If someone is interested in doing this, reach out to me (campos.simeon @gmail.com)
Do you think we could use grokking or currently existing generalization phenomena (e.g. induction heads) to test your theory? Or do you expect the generalizations that would lead to the sharp left turn to be greater/more significant than those that occurred earlier in training?
Thanks for trying! I don't think that's much evidence against GPT-3 being a good oracle, though, because it seems pretty normal to me that it can't forecast without fine-tuning. It would need to be extremely sample-efficient to do that. Does anyone want to try fine-tuning?
Cost: you get basically 3 months free with GPT-3 Davinci (175B) (under a given limit, but one sufficient for personal use), and then you pay as you go. Even if you use it a lot, you're likely to pay less than $5 or $10 per month.
And if you have tasks that need a lot of tokens but aren't too hard (e.g. hard reading comprehension), Curie (GPT-3 6B) is often enough and much cheaper to use!
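As a rough back-of-the-envelope check (my own numbers: I'm assuming the circa-2022 list prices of about $0.02 per 1K tokens for Davinci and $0.002 for Curie, which may have changed since):

```python
# Back-of-the-envelope cost check. Prices are an assumption
# (circa-2022 list prices) and may have changed since.
PRICE_PER_1K_TOKENS = {"davinci": 0.02, "curie": 0.002}  # USD

def monthly_cost(model: str, tokens_per_day: int, days: int = 30) -> float:
    """Dollar cost of using `tokens_per_day` tokens every day for `days` days."""
    return PRICE_PER_1K_TOKENS[model] * tokens_per_day / 1000 * days

# Heavy personal use: 10K tokens per day, every day of the month.
print(monthly_cost("davinci", 10_000))  # 6.0 -> ~$6/month
print(monthly_cost("curie", 10_000))    # 0.6 -> ~$0.60/month
```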
In few-shot settings (i.e. settings where you show a few examples of a task so the model reproduces it), Curie is often very good, so it's worth trying it...
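For concreteness, here's a minimal few-shot sketch using the `openai` Python package's Completion endpoint as it worked circa 2022 (the engine name and the prompt are just illustrative; current versions of the API differ):

```python
# Minimal few-shot sketch with the legacy (circa-2022) `openai` package.
# The Completion endpoint and engine name below are from that era;
# current versions of the API differ.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# Few-shot prompt: two solved examples, then the query we care about.
prompt = """Review: The movie was a delight from start to finish.
Sentiment: positive

Review: I walked out after twenty minutes.
Sentiment: negative

Review: The plot dragged, but the acting saved it.
Sentiment:"""

response = openai.Completion.create(
    engine="text-curie-001",  # Curie: much cheaper than Davinci
    prompt=prompt,
    max_tokens=1,     # we only need one label word
    temperature=0,    # deterministic output for classification-style tasks
)
print(response["choices"][0]["text"].strip())
```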
Thanks for the feedback! I will think about it and maybe try to do something along those lines!
Are there existing models for which we're pretty sure we know all their latent knowledge? For instance, small language models or something like that.
Thanks for the answer! The post you mentioned is indeed quite similar!
Technically, the strategies I suggested in my last two paragraphs (leverage the fact that we're able to verify solutions to problems we can't solve, and give partial information to an algorithm while using the full information to verify) should make it possible to go far beyond human intelligence/human knowledge using a lot of different narrowly accurate algorithms.
And thus, if the predictor has seen many extremely (narrowly) smart algorithms, it would be much more likely to know what it is like to be...
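To illustrate the verify-vs-solve asymmetry behind the first strategy (a toy example of mine, not from the original discussion): checking a proposed integer factorization is trivial even when finding one is hard, so we can accept answers from an untrusted, possibly far smarter, solver.

```python
# Toy illustration of the verify-vs-solve asymmetry (my example):
# checking a factorization is cheap even when finding one is hard.

def verify_factorization(n: int, factors: list[int]) -> bool:
    """Cheaply check an untrusted solver's proposed factorization of n."""
    if not factors:
        return False
    product = 1
    for f in factors:
        if f <= 1:          # reject trivial/degenerate factors
            return False
        product *= f
    return product == n

# An untrusted (possibly far smarter) solver proposes an answer;
# we can accept or reject it without being able to factor n ourselves.
n = 2_021_027
print(verify_factorization(n, [1009, 2003]))   # True:  1009 * 2003 == n
print(verify_factorization(n, [3, 673_675]))   # False: 3 * 673675 != n
```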
You said that naive questions were tolerated, so here's a scenario where I can't figure out why it wouldn't work.
It seems to me that an AI failing to predict the truth (because it predicts as humans would) comes down to the AI having built an internal model of how humans understand things, and predicting based on that understanding. So if we assume an AI is able to build such an internal model, why wouldn't we train an AI to predict what a (benevolent) human would say given a certain amount of information and a certain capacity to process it? Doing...
I think that "There are many talented people who want to work on AI alignment, but are doing something else instead." is likely to be true. I met at least 2 talented people who tried to get into AI Safety but who weren't able to because open positions / internships were too scarce. One of them at least tried hard (i.e applied for many positions and couldn't find one (scarcity), despite the fact that he was one of the top french students in ML). If there was money / positions, I think that there are chances that he would work on AI alignment independently.
Connor Leahy mentions something similar in one of his podcasts as well.
That's the impression I have.
I think that yes, it is reasonable to say that GPT-3 is obsolete.
Also, you mentioned loads of AGI startups being created in 2023, but that already happened a lot in 2022. How many more AGI startups do you expect in 2023?