Does anyone have a clean, mathematical definition of ontology mismatch that situates the problem within statistical learning theory?

There's a post on the Alignment Forum where, as far as I can tell, ontology identification is defined in terms of a classifier that has a conservative decision boundary, given some questions and an error-free dataset. Is there something cleaner?
