How is this better than just telling the human overseers, "No, really, be conservative about implementing things that might go wrong"?
Building a filter can be the human overseers’ response to this suggestion.
What makes a two-part architecture seem appealing?
The risk analysis sits outside the HCH system, so it can mediate the quality of what that system produces without being part of it.
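To make the separation concrete, here is a minimal sketch of a two-part setup in Python. Everything in it is a hypothetical stand-in rather than a description of any real HCH implementation: `toy_proposer` plays the role of the HCH system, `toy_risk` plays the role of the separate risk analysis, and the numeric risk scores and the `max_risk` threshold are assumptions made purely for illustration.

```python
from typing import Callable, List

# Hypothetical type aliases: a proposer maps a question to candidate actions,
# and a risk estimator maps a candidate action to a score in [0, 1].
Proposer = Callable[[str], List[str]]
RiskEstimator = Callable[[str], float]


def filtered_answers(
    question: str,
    proposer: Proposer,
    risk_of: RiskEstimator,
    max_risk: float,
) -> List[str]:
    """Return only the proposals whose estimated risk is at most max_risk.

    The risk estimator is passed in separately from the proposer, so the
    filter never relies on the proposer's own judgment of its output.
    """
    return [action for action in proposer(question) if risk_of(action) <= max_risk]


def toy_proposer(question: str) -> List[str]:
    # Stand-in for the HCH system (assumption: it returns candidate actions).
    return ["run a small reversible pilot", "deploy everywhere immediately"]


def toy_risk(action: str) -> float:
    # Stand-in for the separate risk analysis (assumption: a crude keyword rule).
    return 0.9 if "immediately" in action else 0.1


approved = filtered_answers(
    "How should we roll this out?", toy_proposer, toy_risk, max_risk=0.5
)
print(approved)
```

The point of the design is only that `risk_of` is supplied independently of `proposer`; how either component would actually be built is left open.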
What does "epistemic dominance" look like in concrete terms here? What are the observables relative to which you want to dominate HCH? Wouldn't this be very expensive? How does this translate into buying you extra safety?
The observables are the actions of the system, and HCH dominates with respect to some action-guiding information that we have deduced from those observable actions.
Yes, it might be expensive.
We hope that a mediator of quality with more information about its subject can describe that subject's risk more accurately. Better risk analysis translates into more safety.