Dario rejects doomerism re: misalignment, fair enough.
But what of doomerism re: slowdown?
Furthermore, the last few years should make clear that the idea of stopping or even substantially slowing the technology is fundamentally untenable. The formula for building powerful AI systems is incredibly simple, so much so that it can almost be said to emerge spontaneously from the right combination of data and raw computation.
[...]
If all companies in democratic countries stopped or slowed development, by mutual agreement or regulatory decree, then authoritarian countries would simply keep going. Given the incredible economic and military value of the technology, together with the lack of any meaningful enforcement mechanism, I don’t see how we could possibly convince them to stop.
Predicting the difficulty of coordinating on / enforcing a "substantial" slowdown of AI development seems about as hard as predicting the difficulty of avoiding misalignment? Perhaps it is true that such a slowdown would be historically unprecedented, but as Dario notes, the whole possibility of a country of geniuses in a datacenter is historically unprecedented too.
I would love to hear arguments from the slowdown-pessimistic worldview generally.
Specifically addressing perhaps:
1. How do we know we have exhausted / mostly saturated communicating about the risks within the ~US, such that further efforts on that front wouldn't yield meaningful returns?
- (Sure, the current administration does seem to have very little sympathy for anything like a slowdown, but that wasn't the case for the previous admin, was it? Isn't there a lot of variance on this front?)
2. How do we know geopolitical adversaries wouldn't agree to a slowdown, if it were seriously bargained for by a coalition of the willing?
3. How would the situation regarding the above two change if there were significantly more legible evidence of, and understanding about, the risks? What about in the case of a "warning shot"?
What is the level of evidence likely required for policymakers to seriously consider negotiating a slowdown?
4. How difficult would the oversight or enforcement of a slowdown policy be?
(Even if adversaries develop their own independent chip-manufacturing supply chains within a few years, won't training "powerful AIs" remain a highly resource-intensive, observable and disruptable process for likely ~decades?)
5. How much might early, not-yet-too-powerful AIs help us with the coordination and enforcement of a slowdown?
That's why it's called... *scalable* alignment? :D
Somewhat tongue-in-cheek, but I think I am sort of confused about what the core news here is.
Secondly, OpenAI had complete access to the problems and solutions to most of the problems. This means they could have actually trained their models to solve it. However, they verbally agreed not to do so, and frankly I don't think they would have done that anyway, simply because this is too valuable a dataset to memorize.
Now, nobody really knows what goes on behind o3, but if they follow the kind of "thinking", inference-scaling of search-space models published by other frontier labs that possibly uses advanced chain-of-thought and introspection combined with a MCMC-rollout on the output distributions with a PRM-style verifier, FrontierMath is a golden opportunity to validate on.
If you think they didn't train on FrontierMath answers, why do you think having the opportunity to validate on it is such a significant advantage for OpenAI?
Couldn't they just carve a validation set out of their own training data anyway?
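(Just to make the "hold out your own validation set" point concrete, here's a minimal, purely hypothetical sketch -- nothing here is claimed about what any lab actually does:)

```python
import random

# Hypothetical: `problems` is a list of (question, answer) pairs a lab already has internally.
def make_holdout_split(problems, holdout_frac=0.05, seed=0):
    """Shuffle an internal problem set and carve off a held-out validation slice."""
    rng = random.Random(seed)
    shuffled = list(problems)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_frac))
    return shuffled[:cut], shuffled[cut:]  # (train, held-out validation)
```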
In short, I don't think the capabilities externalities of a "good validation dataset" are that big, especially not counterfactually -- sure, maybe it would have taken OpenAI a bit more time to contract some mathematicians, but realistically, how much more time?
Whereas if your ToC as Epoch is "make good forecasts on AI progress", it makes sense that you want labs to report results on the dataset you've put together.
Sure, maybe you could commit to not releasing the dataset and only testing models in-house, but maybe you think you don't have the capacity in-house to elicit maximum capability from the models. (Solving the ARC challenge cost O($400k) for OpenAI; that is peanuts for them, but something like 2-3 researcher salaries at Epoch, right?)
If I was Epoch, I would be worried about "cheating" on the results (dataset leakage).
Re: unclear dataset split: yeah, that was pretty annoying, but that's on OpenAI comms too.
I tend to agree that orgs claiming to be safety orgs shouldn't sign NDAs preventing them from disclosing their lab partners / even details of partnerships, but this might be a tough call to make in reality.
I definitely don't see a problem with taking lab funding as a safety org. (As long as you don't claim otherwise.)
(indeed the politics of our era is moving towards greater acceptance of inequality)
How certain are you of this, and how much do you think it instead comes down to something like "to what extent can disempowered groups unionise against the elite"?
To be clear, by default I think AI will make unionising against the more powerful harder, but it might depend on the governance structure. Maybe if we are really careful, we can get something closer to "Direct Democracy", where individual preferences actually matter more!
[sorry, have only skimmed the post, but I feel compelled to comment.]
I feel like, unless we make a lot of progress on some sort of "Science of Generalisation of Preferences", then for more abstract preferences (non-biological needs mostly fall into this category), even if certain individuals have, on paper, much more power than others, at the end of the day they will likely rely on vastly superintelligent AI advisors to realise those preferences -- and at that point, I think it is the AI advisor that is _really_ in control.
I'm not super certain of this, like, the Catholic Church definitely could decide to build a bunch of churches on some planets (though what counts as a church, in the limit?), but if they also want more complicated things like "people" "worshipping" "God" in those churches, it seems to be more and more up to the interpretation of the AI Assistants building those worship-maximising communes.
I have read through most of this post and some of the related discussion today. I just wanted to write that it was really interesting, and as far as I can tell, useful, to think through Paul's reasoning and forecasts about strategy-related questions.
If he thinks it is a good idea, I would be very glad to read a longer, more comprehensive document describing his views on strategic considerations.
Sooo this was such an intriguing idea that I did some research -- but reality appears to be more boring:
In a recent informal discussion, I believe said OPP CEO remarked that he had to give up the OpenAI board seat because his fiancée joining Anthropic created a conflict of interest. Naively, this explanation is much more likely, and I think it is much better supported by the timeline.
According to the LinkedIn of the mentioned fiancée, she had already joined as a VP in 2018 and was promoted to a probably more serious position in 2020, and her sibling was promoted to VP in 2019.
The Anthropic split occurred in June 2021.
A new board member (who is arguably very aligned to OPP) was inducted in September 2021, probably in place of OPP CEO.
It is unclear exactly when OPP CEO left the board, but I would guess sometime in 2021. This seems better explained by a "conflict of interest with his fiancée joining/co-founding Anthropic", and OpenAI putting another OPP-aligned board member in his place wouldn't make for very productive scheming.
I think the point was less about a problem with refugees (which should be solved in time with European coordination), and maybe more that the whole invasion is "good news" for conservative parties, as most crises are.
A lot of people brought up sanctions, and they could indeed influence the European economy/politics.
I would be curious which sanctions in particular are likely to be implemented, and what their implications are -- perhaps a major economic setback / soaring energy prices could radicalize European politics?
My guess would be that overall the whole event increases support for conservative/nationalist/populist parties - for example, even though Hungary's populist government was trying to appear to be balancing "between the West and Russia" (thus now being in an uncomfortable situation), I think they can probably actually spin it around to their advantage. (Perhaps even more so, if they can fearmonger about refugees.)
It is a bit ambiguous from your reply whether you mean distributed AI deployment or distributed training. Agreed that distributed deployment seems very hard to police once training has taken place, which also implies that some large amount of compute is available somewhere.
Regarding training, I guess the hope for enforcement would be the ability to constrain (or at least monitor) total available compute and hardware manufacturing.
Even if you do training in a distributed fashion, you would need the same number of chips. (Probably more by some multiplier, to pay for the increased latency? And if you can't distribute it to an arbitrary extent, you still need large datacenters that are hard to hide.)
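(A back-of-the-envelope sketch of why, with all numbers assumed purely for illustration rather than taken from anywhere:)

```python
# chips needed = total training FLOP / (per-chip FLOP/s * utilization * wall-clock seconds)
# All numbers below are assumptions for illustration only.
train_flop = 1e25               # assumed total compute for a "powerful AI" training run
chip_flops = 1e15               # assumed peak throughput of one accelerator, FLOP/s
seconds = 6 * 30 * 24 * 3600    # assumed ~6-month wall-clock training budget

def chips_needed(utilization: float) -> float:
    return train_flop / (chip_flops * utilization * seconds)

print(round(chips_needed(0.40)))  # ~1600 chips in one well-connected datacenter
print(round(chips_needed(0.10)))  # ~6400 chips if latency/bandwidth cut utilization -- same order, just more
```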
Disguising hardware production seems much harder than disguising training runs or deployment.
Perhaps a counter is "algorithmic improvement", which Epoch estimates to be providing a ~3x/year effective compute gain.
This is important, but:
- Compute scaling is estimated (again, by Epoch) at 4-5x/year. So if we assume both trends would otherwise continue, your timeline for dangerous AI is, say, 5 years, and we freeze compute scaling today (such that only today's largest training run is available in the future), then IIUC you would gain ~7 years (which is something!) -- see the sketch after this list.
But importantly, if I did the math correctly, the time you gain scales linearly with your timeline: the longer your timeline, the more you gain -- roughly ~1.5x extra time on top of it.
(So, if your timeline for dangerous AI was 2036, it would be pushed out to ~2050.)
- I'm sceptical that "algorithmic improvement" can be extrapolated indefinitely -- it would be surprising if, ~8 years from now, you could train GPT-3 on a single V100 GPU in a few months? (You need to get a certain number of bits into the AI; there is no way around it.)
(At least this should be more and more true as labs reap more and more of the low-hanging fruit of hardware optimisation?)
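(For concreteness, here is the arithmetic I had in mind for the first bullet -- a minimal sketch, where the growth rates are my rounding of the Epoch-style figures above and the 5-year timeline is just the example's assumption:)

```python
import math

algo_growth = 3.0      # assumed: ~3x/year effective compute from algorithmic progress
compute_growth = 5.0   # assumed: upper end of the 4-5x/year compute-scaling range
timeline_years = 5     # assumed: years to dangerous AI if both trends continue

# Effective compute the "dangerous" system needs, relative to today:
needed = (algo_growth * compute_growth) ** timeline_years

# If compute is frozen today, only algorithmic progress closes that gap:
frozen_timeline = math.log(needed) / math.log(algo_growth)

print(round(frozen_timeline, 1))                   # ~12.3 years instead of 5
print(round(frozen_timeline - timeline_years, 1))  # ~7.3 extra years gained
# Note: frozen_timeline / timeline_years = log(algo*compute) / log(algo) ~ 2.5,
# independent of the timeline -- hence the "linearly ~1.5x extra time" point above.
```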
Also, contingent on point 2. of my original comment, all of the above could be much, much easier if we are not assuming a 100% adversarial scenario, i.e. if the adversary has some willingness to cooperate in the effective implementation of a treaty.