Here is my best attempt at a delaying tactic, after sleeping on it. Please tear apart/suggest better ways in which LV might tear apart, to replace the poor placeholder responses he has here.
--
"Agree that I musst die, if it ssavess world. But thiss iss not besst way to kill me. Ssee how you can benefit more, given your goalss."
"Explain."
"Believe power you know not doess refer to power to desstroy life-eaterss. Life-eaterss will find you eventually, teacher. Know you. Will hunt you down, ssomeday. Eat all of you, all of world and magic, in the end."
"Sso you will give that magic to me, now."
"You can never reach needed sstate of mind - incompatible with deadly indifference. Sschoolmasster could never casst - incompatible with acceptance of death. Majority cannot casst, and in the tessting, sstandard defence againsst life-eaterss iss ssacrificed. Will weaken your alliess greatly, should I randomly try to teach."
"What do you proposse, then?"
"Take me to life-eater prisson. Allow me to pour out my life and magic there, eradicate them wholly. How I wisshed to do sso, during the resscue! You called me back, then."
".....
Really like that one. My first reaction was "and yet the Gatekeeper can still say no and kill you". After all, Voldemort's trying to prevent untold destruction, a prophecy whose exact paths to possible fulfilment are a mystery. Killing a limited number of Dementors is less important.
But my understanding of the AI box experiment is that it was never just about finding an argument that will look persuasive to someone armchair-thinking about it. It's about finding an opening to the psyche, an emotional vulnerability specific to your current target. Voldemort doesn't seem to have a lot of those, but we do have this:
...Harry asked his dark side what it thought of death.
And Harry's Patronus wavered, dimmed, almost went out upon the instant, for that desperate, sobbing, screaming terror, an unutterable fear that would do anything not to die, throw everything aside not to die, that couldn't think straight or feel straight in the presence of that absolute horror, that couldn't look into the abyss of nonexistence any more than it could have stared straight into the Sun, a blind terrified thing that only wanted to find a dark corner and hide and not have to think about it any more -
(EDIT: I thought about this some more, produced a better solution, submitted it as a review, and also posted it here)
Voldemort knows that Harry possesses an altered version of the Patronus spell, and something else that affects Dementors, but not the spell's nature. Harry can buy time by offering to explain how these work, by doing so, and by negotiating for more names. Harry can truthfully say that learning certain things about the Patronus and about Dementors will have side-effects that Voldemort may not want, and that this means he needs to think about what to say. Voldemort will surely exert time pressure, but can only speed this up by so much if he wants to fulfill his goal of gaining all of Harry's secret powers. Harry can also try to convince Voldemort to let him cast his modified Patronus; this is very unlikely to work, but should be done anyway because seeming not to try any tricks would itself be suspicious.
A winning strategy should simultaneously disable all of the death eaters present and Voldemort himself. Voldemort will be disabled if their magics touch, especially if that touch can be sustained.
The main weapon at Harry's disposal is partial transfiguration. I spoke
I wrote a version of this up at reddit too, but it seems to me trying to hack the laws of physics is wasted effort when we know very little about how magic works in concrete terms. We don't know what Harry can really do, how fast he can do it, or whether Voldemort would notice.
What we do know are:
* how Harry thinks
* how Eliezer thinks
* what Voldemort wants
So we should be looking at things Harry could say that would advance his goal of surviving rather than trying to come up with a combination of spells, with the understanding that winning ideas are probably going to cluster around narrative interventions that EY thinks are interesting or important. A few that spring to mind:
Memetic hazard: are there things Harry could say or bring to Voldemort's attention that would pose an existential risk to him if he harms Harry?
Let the AI out of the box: is there something Harry can offer Voldemort such that Voldemort goes against his stated agenda?
Precommitment / timeless decision theory: are there ways Harry can manipulate the unbreakable vow to force certain conditions in the future?
Learning to lose: what if Harry surrenders and agrees to join Voldemort, with a commitment Voldemort finds conv...
Harry hisses "You have missinterpreted prophecy, to your great peril, becausse of power I have, but you know not. Yess, you are sstudying sscience, but, honesstly, you are yearss behind me. It may be that thiss power you know not iss ssomething I have at thiss sspecific time, that you will not know for too many yearss hence.
Before I explain, remember my Vow, and know my honesst intention not to desstroy the world, Vow or no. Now, do you know why I would tear apart the very sstarss? Do you know how? Not to desstroy the world, but to ssave it from whatever threatss require more energy to extinguissh than exisstss in thiss entire ssolar ssystem. There are more thingss in heaven and earth, Dark Lord, than are dreamt of in your philossophy.
I would usse sstar lifting to do it ssafely. In a way, I really would end the world to ssave it, ssince once humanss are out of the cradle, sspread through... er, let uss ssay 'heaven' in Parsseltongue, to mean well beyond thiss planet, why not add the masss of the Earth itsself to the sstuff of the sstarss, to yield that much more energy? And sso, if you avert thiss prophecy, there iss sseriouss rissk you doom yoursself! Are you willing to take...
I posted a longer form of this as a review / solution. Here's a condensed version:
Partial Transfiguration works through a deep understanding of physics. It allows Harry to create any physically valid state of the universe, as long as he can hold it in his mind.
What this means is that you don't need to Transfigure a gun in order to fire a bullet. You can just Transfigure a bullet in the state of having been fired.
This is what the ability to Transfigure any physically valid configuration really means. You don't need to make a bulky laser weapon. Just make a laser pulse: an arbitrary amount of high-energy photons, aimed in the right direction. Instead of a shaped explosive charge, make a shaped explosion. Instead of antimatter, make gamma rays. Instead of a black hole, dangerous to everybody near it, make a bunch of gravitons and aim them at your enemy.
So given all that, how should Harry kill his enemies?
Lasers are messy weapons. Even black robes are reflective in some wavelengths. Use too much energy and you'll get a fireball back in your face. Release the energy too quickly and it will create an explosion instead of steadily boiling away your target.
Kinetic energy is safer. Tra...
"If you can think of any trick that I have missed in being sure that Harry Potter's threat is ended, speak now and I shall reward you handsomely... speak now, in Merlin's name!"
Voldemort forgot a very basic “trick”: disarming Harry first.
At the end of chapter 112, we wondered about that, too. It turns out that Harry needed to have the wand to perform the vow. With that out of the way … why does Harry still have his wand? Is this just because Eliezer wants to make sure that Harry still has a way out? Or is there some in-universe reason for Voldemort to allow this?
This is a new thread to discuss Eliezer Yudkowsky’s Harry Potter and the Methods of Rationality and anything related to it. This thread is intended for discussing chapter 113.
There is a site dedicated to the story at hpmor.com, which is now the place to go to find the author's notes and all sorts of other goodies. AdeleneDawner has kept an archive of Author’s Notes. (This goes up to the notes for chapter 76, and is now not updating. The author's notes from chapter 77 onwards are on hpmor.com.)
IMPORTANT -- From the end of chapter 113: