Thanks Yoav — this is an important post, and not just for the AI alignment community. What strikes me most is how CDV is becoming a connective methodology across domains that rarely talk to each other: chip verification, AV verification, robot SOTIF, and now AI alignment. Underneath all of them sits the same question safety people keep coming back to: how do you know what you don't know? An explicit, evolving coverage map is the most practical answer I've seen for making that question tractable.
Coming from the SOTIF side (ISO 21448, scenario-based safety e...