Roko

Comments

Well, there it is.

AI risk has gone from underhyped to overhyped.

Agreed.

The Weak AGI question on Metaculus could resolve tomorrow and very little would change about your life; it's certainly not worth "reflecting on being human", etc.

How could you use this to align a system that you could use to shut down all the GPUs in the world?

I mean, if there were a single global nuclear power rather than about three, it wouldn't be hard to do this. Most compute is centralized at the moment anyway, and new compute is made in extremely centralized facilities that can be shut down.

One does not need superintelligence to close off the path to superintelligence, merely a human global hegemon.

If Russia were to nuke Ukraine with a tactical nuke, it would put the US into a position of being forced to respond.

If we go all the way up the escalation ladder to a full nuclear exchange, it's essentially impossible for Russia to win.

So they will probably need either not to escalate, or to plan to de-escalate at an intermediate point: e.g., if there's an exchange of tac nukes, or a tac nuke is answered with a nasty conventional strike, Russia may intend to stop the escalation at that point.

Russia has much more reason to bark about nukes than to bite. The bite might happen but I don't see a strong reason for it.

No, I think you misunderstood me: I do agree that things are "getting weird"; I'm just saying that this is what we should expect if we're going to make the 2040 date.

Even if you did that, you might need a superhuman intelligence to generate tokens of sufficient quality to further scale the output.

Kurzweil predicted a singularity around 2040. That's only 18 years away, so in order for us to hit that date things have to start getting weird now.

I think this post underestimates the amount of "fossilized" intelligence on the internet. The "big model" transformer craze is like humans discovering coal and having an industrial revolution. There are limits to the coal, though, and I suspect the late 2020s and early 2030s might see one final AI winter as we bump into those limits and someone has to make AI that doesn't just copy what humans already do.

But that puts us on track for 2040, and hardware will continue to move forward in the meantime, meaning that if there is a final push around 2040, the progress in those last few years may eclipse everything that came before.

As for alignment/safety, I'm still not sure whether the thing ends up self-aligning or something similarly pleasant, or whether alignment just becomes a necessary part of making a useful system as we move forward and lies/confabulation become more of a problem. I think 40% doom is reasonable at this stage because (1) we don't know how likely these pleasant scenarios are, and (2) we don't know how the sociopolitical side will go: will there be funding for safety research or not? Will people care? With such huge uncertainties I struggle to deviate much from 50/50, though for anthropic reasons I predicted a 99% chance of success on Metaculus.

as we get into more complex tasks, getting AI to do what we want becomes more difficult

I suspect that much of the probability for aligned ASI comes from this. We're already seeing this with GPT; it often confabulates or essentially simulates some kind of wrong but popular answer.

1.4Q tokens (ignoring where the tokens come from for the moment), am I highly confident it will remain weak and safe?

I'm pretty confident that if all those tokens relate to cooking, you will get a very good recipe predictor.

Hell, I'll give you 10^30 tokens about cooking and enough compute and your transformer will just be very good at predicting recipes.

Next-token predictors are IMO limited to predicting what's in the dataset.

In order to get a powerful, dangerous AI from a token-predictor, you need a dataset where people are divulging the secrets of being powerful and dangerous. And in order to scale it, you need more of that.

So we cannot "ignore where the tokens come from" IMO. It actually matters a lot; in fact it's kind of all that matters.

Answer by Roko, Aug 02, 2021

what can I infer?

If you are missing a finger from an accident and you want children with the normal 5 digits, you can use heritability to work out that finding a wife with 6 fingers isn't going to get you back to normal.

Another one: if you are gay and want gay children, the low heritability of sexual orientation indicates that finding a gay gamete donor won't help much.

On the other hand, if you are short and want taller children, you might put a lot of effort into finding a wife who is taller than average (the high heritability of height tells you that this will work).
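Here's a minimal sketch of the midparent-regression arithmetic behind these examples; the `expected_child_trait` helper and the numbers are purely illustrative assumptions, not real heritability estimates:

```python
# Sketch of the standard midparent regression:
# expected offspring trait = population mean + h^2 * (midparent deviation from mean).
# Numbers below are made up for illustration.

def expected_child_trait(pop_mean, midparent_value, heritability):
    """Expected offspring trait value given the parents' average trait value
    and the narrow-sense heritability h^2 of the trait."""
    return pop_mean + heritability * (midparent_value - pop_mean)

# High-heritability trait (e.g. height): marrying tall actually shifts the kids.
print(expected_child_trait(pop_mean=170.0, midparent_value=180.0, heritability=0.8))
# -> 178.0: most of the parents' 10 cm advantage is expected to carry over.

# Low-heritability trait: the same 10-unit parental deviation mostly regresses
# back to the population mean, so selecting a mate on the phenotype barely helps.
print(expected_child_trait(pop_mean=170.0, midparent_value=180.0, heritability=0.1))
# -> 171.0
```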
