A bit of criticism. There is a problem: very little is said about external sources of knowledge.
In the chapter "Playing Double Crux" the reader's attention is focused on the opponents' beliefs (find the belief "B" on which the truth of "A" depends). Before that, it was said that you don't need to hide your own belief structure. And in general, throughout the text, a lot of attention is paid to the beliefs that "exist in the heads" of the participants and only need to be extracted from there (..and if you're lucky, one of them will be the desired judgment "B", which both participants will call the "root of the disagreement" and will be able to verify in practice).
Only at the end of the chapter are a few words written: "..potentially learn from the other (or from the world/research/experiments)".
Now I will describe the negative consequences this focus has led to, in my experience with several rationalists who practice Double Crux:
We discussed a topic close in meaning to evaluating the actions of a surgeon in a hospital who operates on a patient. Our disagreement: is the surgeon cutting correctly? (Note: neither participant knows medical science.)
It would seem rational to study books on surgery and medicine, take a university course, or at least watch a video on YouTube. But since the methodology did not pay enough attention to the need to deepen knowledge of the real world (here, surgery), the participants unsuccessfully tried to "reinvent the wheel" and find testable beliefs "B" somewhere on the surface, among their existing knowledge. It is very much like a cargo cult, where a tribe tries to build an airplane out of sticks, using only its own knowledge and with no intention of reading a textbook on aerodynamics (or surgery).
CONCLUSION: I am sure it would be helpful if the authors and readers took this point into account and added focus on studying the real world: look for the argument "B" in books and from specialists, rather than reinventing the wheel in the course of the dispute.
There is an unresolved problem related to the vague meaning of the term "alignment".
Until we clarify what exactly we are aiming for, the problem will remain. That is, the problem lies more in the words we use than in anything material. It is like the problem with the term "freedom": there is no "freedom" as such in the material world, but there are concrete options: freedom of movement, free fall, etc.
For example, when we talk about "alignment", we mean some kind of human goals. But a person often does not know his own goals, and humanity certainly does not know its goals (if we are talking about the meaning of life). And if we are not talking about the meaning of life and "saving the Soul", then let's simplify the task: when we mention "alignment", we will mean saving the human body. AI can help save a person's life if it does not slip him a poison recipe (this is a trivial check, and there seems to be no "alignment" problem here; modern LLMs already check for this).
But if we understand "alignment" in such a vulgar sense, there will be those who see the problem that the AI does not help them "save the Soul", or something similar (humans can have an infinite number of abstract goals that they are not even able to formulate).
Before checking "alignment" we need, at the very least, to be able to accurately formulate goals. As I see it, we (most people) are not yet capable of that.