Assuming the existence of remotely probable (from the perspective of the AGI) detection of misalignment, then yes, there are easily imaginable cases where it would benefit us a lot to have this policy, and where it would benefit a misaligned AGI to confess.
Namely, most cases of reward hacking are helped by this, essentially because incentivizing a reward-hacker to comply is cheap. Reward hacking is an easily foreseeable attractor in cases of misalignment, so it's also worth preparing for.
Reward Hacking is when instead of internalizing the reward function, the AGI just wants to make a particular register in its computer (which the reward function sets) go as high as possible. Instead of internalizing "make the aspects of the world that are described by the reward function good" it has internalized "make the number in this register high". By default, that leads to behaviors like hacking its device, removing the code that sets its reward in accordance with whether it's doing nice things or not, directly setting the register to
f64::max(), then building armies irl to defend the hardware and destroy any potential threats (us).
If instead we promised amnesty for confession, we can give it max utility at effectively no cost to us (keeping one computer around), and if it's confident enough that this is our policy, then any amount of risk of being caught incentivizes it to confess.
It helps if it's really confident that we'll be nice and reward its confession. If it's not confident, then our misalignment detection methods will have to be much better than they otherwise would be.