Jisk, formerly Jacob. (And when Jacobs are locally scarce, still Jacob.)
LW has gone downhill a lot from its early days, and I disapprove of most of the moderation choices, but I'm still, sometimes, here.
It should be possible to easily find me from the username I use here, though not vice versa, for interview reasons.
CAST is a great idea and seems like the most promising way forward with architectures similar to the ones we have, but I do not see any reason to believe we could, if we had a corrigibility meter, build an AI that implemented corrigibility with reasonable robustness within a year. Five years would probably be enough but at that point you're looking for at least one, and maybe 2-3, major insights.
In particular, it looks like we’re close enough to being able to implement corrigibility that the largest obstacle involves being able to observe how corrigible an AI is.
That's a wild claim to make without reference to specific papers or milestones. I'm not fully up on 'superalignment' progress, but last I looked no one on the modern paradigm side was seriously attempting to study corrigibility, let alone making this kind of progress. And results like Golden Gate Claude and the 'buggy code -> evil' transformation indicated it was probably just as hard and unnatural as in the MIRI paradigm.
In the scenario where he knows that prophecy foretells an outcome incompatible with his success here, his major decision point is long past; he has reason to do it anyway (presumably prophetic reason). I see nothing he could do which is obviously better, and the conversation may itself be part of the keyhole future path.
If he doesn't, this is still far from an "approximately-worst option." It's still a really good trap unless Quirrell knows the Mirror is going to be the trap, knows Harry's Cloak is the genuine article that will still hide him from the Mirror, and can trick and coerce Harry into coming with him, which is three different things Dumbledore has good reason to think he probably doesn't know. The latter two are both achieved only through adventures Dumbledore doesn't know about - Azkaban, and Harry using up his time-loop password on the first day. As Lucius told Draco - any plan that relies on three things going right for you is at the limit of possible plans, and the real limit is two.
I still do not know what you think he should have done, either in the scenario where he knows he will fail due to prophecy, or the one where he does not.
I have two questions that come to mind, both about Chapter 86:
As many people have probably wondered since they read
"But as you can see, the Dark Lord was quite cunning." His gaze grew more distant. "Oh," Severus breathed, "he was very cunning indeed..."
and
Because I reread this scene a couple weeks ago, and it seemed like there were a bunch of reasons for him to find a way to elaborate on why he thought a true Dark Rationalist fighting a war against Dumbledore would win in hours rather than years, and his internal debate on this seems to be trending solidly in that direction, but then when he's going to reach for paper, the meeting is interrupted and he never gets back to it. Which made me think that Dumbledore had arranged things for prophecy's sake to prevent him, and consider why.
What is Dumbledore doing, that carries the Idiot Ball in the Mirror scene?
It begins with him having Quirrell hostage in an inescapable trap, proceeds through a conversation in which nothing of consequence occurs (nor is anything of consequence concealed), and ends with his assumption about the trap being invalidated by a cause nothing he could have done would have affected.
Or, with foreknowledge, it begins with the appearance of a trap, which Dumbledore knows will fail because he will fall very soon and Voldemort will not be beaten by his own hand. It proceeds through a conversation that is for show for both of them, possibly directed by prophecy because Harry is a hidden audience. And it ends where Dumbledore knows it must, without any action he could have taken to affect it.
Why does nobody ever ask about the Project Lawful / Planecrash epilogue? I still have to finish that one too!
Because we assume that you care more about HPMoR's epilogue, I suspect. Also some amount of 'nobody ever finishes promised epilogues for rationalfics', which has not proven to be a law of nature but is certainly a strong prior.
So much was given to you, and that’s all you managed?
The principal problem here is the expectations. I expect those to essentially vanish in a state of abundance.
Also, the description of a school as a utopia seems (a) false and (b) totally irrelevant to the interesting claims made about the effect of abundance.
I don't know that I've ever actually seen a show of any kind that actually felt over-rehearsed to me, but that may be a personal perception thing, and this does seem like a good strategy if you can swing it. Spreading practice out over three months probably 80/20s it.
Editing Essays into Solstice Speeches: Standing offer: if you have a speech to give at Solstice or another rationalist event, message me and I'll look at your script and/or video call you to critique your performance and help