In the 2012 paper "How To Grade a Test Without Knowing the Answers — A Bayesian Graphical Model for Adaptive Crowdsourcing and Aptitude Testing" (PDF), Bachrach et al. describe a mathematical model for starting with:
- a set of questions without answers
- a set of answerers (participants) of unknown quality
and ending up with assessments of:
- correct answers for the questions
- difficulty estimates for the questions
- evaluations of the participants, including overall reliability and areas of expertise
Can the model actually do this? Is it the best model for doing this? What kind of problems does it handle well and poorly?
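
To make the shape of the problem concrete, here is a minimal sketch of the inputs and outputs listed above. It is not the paper's Bayesian graphical model: it swaps in a much simpler iterative weighted-vote baseline, and it does not attempt the "areas of expertise" output. The function name `grade_without_answers`, the `responses` layout, and the smoothing constants are all assumptions made for illustration.

```python
from collections import defaultdict


def grade_without_answers(responses, n_iter=20):
    """Toy baseline (NOT the paper's model): iterative weighted voting.

    responses: dict mapping (participant, question) -> chosen answer.
    Returns (answers, difficulty, reliability).
    """
    participants = sorted({p for p, _ in responses})
    questions = sorted({q for _, q in responses})

    # Start by trusting every participant equally.
    reliability = {p: 1.0 for p in participants}
    answers = {}

    for _ in range(n_iter):
        # Step 1: pick each question's answer by reliability-weighted vote.
        for q in questions:
            votes = defaultdict(float)
            for (p, q2), a in responses.items():
                if q2 == q:
                    votes[a] += reliability[p]
            # sorted() makes tie-breaking deterministic.
            answers[q] = max(sorted(votes), key=votes.get)

        # Step 2: re-score each participant by agreement with the current
        # answers, smoothed (assumed constants) so scores stay inside (0, 1).
        for p in participants:
            mine = [(q, a) for (p2, q), a in responses.items() if p2 == p]
            agree = sum(1 for q, a in mine if answers[q] == a)
            reliability[p] = (agree + 1) / (len(mine) + 2)

    # Difficulty proxy: fraction of responses that miss the inferred answer.
    difficulty = {}
    for q in questions:
        given = [a for (_, q2), a in responses.items() if q2 == q]
        difficulty[q] = 1 - sum(a == answers[q] for a in given) / len(given)

    return answers, difficulty, reliability


# Made-up example data: three participants, three questions, no answer key.
responses = {
    ("ann", "q1"): "A", ("bob", "q1"): "A", ("cat", "q1"): "B",
    ("ann", "q2"): "C", ("bob", "q2"): "C", ("cat", "q2"): "C",
    ("ann", "q3"): "D", ("bob", "q3"): "E", ("cat", "q3"): "E",
}
answers, difficulty, reliability = grade_without_answers(responses)
```

A baseline like this can only reward agreement with the weighted majority, so it has no way to recover when most participants share the same wrong answer. That limitation is one reason to ask whether a full probabilistic model does meaningfully better.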