I bet this is mostly a training data limitation.
Someone at Google allegedly said explicitly that no possible evidence would cause them to investigate whether the AI is sentient.
I don't think human level AIs are safe, but I also think it's pretty clear they're not so dangerous that it's impossible to use them without destroying the world. We can probably prevent them from being able to modify themselves, if we are sufficiently careful.
"A human level AI will recursively self improve to superintelligence if we let it" isn't really that solid an argument here, I think.
I don't think it is completely inconceivable that Google could make an AI which is surprisingly close to a human in a lot of ways, but it's pretty unlikely.
But I don't think an AI claiming to be sentient is much evidence: it can easily do that even if it is not.
Even if it takes years, the "make another AGI to fight them" step would... require solving the alignment problem? So it would just give us some more time, and probably not nearly enough time.
We could shut off the internet/all our computers during those years. That would work fine.
So you think that, since morals are subjective, there is no reason to make an effort to control what happens after the singularity? I really don't see how that follows.
I don't understand precisely what question you're asking. I think it's unlikely we will happen to solve alignment by any method in the time frame between an AGI going substantially superhuman and the AGI causing doom.
Eliezer's argument from the recent post:
The reason why nobody in this community has successfully named a 'pivotal weak act' where you do something weak enough with an AGI to be passively safe, but powerful enough to prevent any other AGI from destroying the world a year later - and yet also we can't just go do that right now and need to wait on AI - is that nothing like that exists. There's no reason why it should exist. There is not some elaborate clever reason why it exists but nobody can see it. It takes a lot of power to do something to the current world that prevents any other AGI from coming into existence; nothing which can do that is passively safe in virtue of its weakness. If you can't solve the problem right now (which you can't, because you're opposed to other actors who don't want to be solved and those actors are on roughly the same level as you) then you are resorting to some cognitive system that can do things you could not figure out how to do yourself, that you were not close to figuring out because you are not close to being able to, for example, burn all GPUs. Burning all GPUs would actually stop Facebook AI Research from destroying the world six months later; weaksauce Overton-abiding stuff about 'improving public epistemology by setting GPT-4 loose on Twitter to provide scientifically literate arguments about everything' will be cool but will not actually prevent Facebook AI Research from destroying the world six months later, or some eager open-source collaborative from destroying the world a year later if you manage to stop FAIR specifically. There are no pivotal weak acts.
So do you think that instead we should just be trying to not make an AGI at all?
I think it is very unlikely that they would need so much time that solving AI alignment in the meantime becomes viable.
Edit: Looking at the rest of the comments, it seems to me like you're under the (false, I think) impression that people are confident a superintelligence wins instantly? Its plan will likely take time to execute, just not any more time than necessary. Whether that means days or weeks is pretty hard to say, but not years.