In the 2012 paper "How To Grade a Test Without Knowing the Answers — A Bayesian Graphical Model for Adaptive Crowdsourcing and Aptitude Testing" (PDF), Bachrach *et al.* describe a mathematical model for starting with:

- a set of questions without answers
- a set of answerers (participants) of unknown quality

and ending up with assessments of:

- correct answers for the questions
- difficulty ratings for the questions
- evaluations of the participants, including overall reliability and areas of expertise
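
To make these inputs and outputs concrete, here is a minimal sketch of the problem setting in Python. It is not the Bayesian graphical model from the paper: it swaps in a much simpler heuristic (iterated reliability-weighted majority voting) purely to show what "answers, difficulty, and participant reliability from unlabeled responses" can look like. The function name `grade_without_key`, the smoothing constants, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def grade_without_key(responses, n_iters=10):
    """Estimate answers, question difficulty, and participant reliability.

    responses: dict mapping participant -> {question: chosen answer}.
    NOTE: a simple weighted-voting heuristic, not the paper's model.
    """
    participants = list(responses)
    questions = sorted({q for given in responses.values() for q in given})
    reliability = {p: 1.0 for p in participants}  # start with equal trust
    answers = {}

    for _ in range(n_iters):
        # Step 1: pick each question's answer by reliability-weighted vote.
        for q in questions:
            votes = defaultdict(float)
            for p in participants:
                if q in responses[p]:
                    votes[responses[p][q]] += reliability[p]
            # sorted() makes tie-breaking deterministic.
            answers[q] = max(sorted(votes), key=votes.get)

        # Step 2: re-score each participant by agreement with the consensus.
        # The +1 / +2 smoothing (an assumption) keeps scores away from 0 and 1.
        for p in participants:
            given = responses[p]
            agree = sum(answers[q] == a for q, a in given.items())
            reliability[p] = (agree + 1) / (len(given) + 2)

    # Difficulty: share of respondents who disagree with the consensus answer.
    difficulty = {}
    for q in questions:
        given = [responses[p][q] for p in participants if q in responses[p]]
        difficulty[q] = 1 - sum(a == answers[q] for a in given) / len(given)

    return answers, difficulty, reliability


# Toy data (hypothetical): five participants, three multiple-choice questions.
responses = {
    "alice": {"q1": "A", "q2": "B", "q3": "C"},
    "bob":   {"q1": "A", "q2": "B", "q3": "C"},
    "carol": {"q1": "A", "q2": "B", "q3": "D"},
    "dave":  {"q1": "B", "q2": "C", "q3": "D"},
    "erin":  {"q1": "A", "q2": "C", "q3": "C"},
}
answers, difficulty, reliability = grade_without_key(responses)
```

A heuristic like this treats the majority as ground truth, so it fails whenever most participants share the same wrong answer. It also produces only point estimates, with no measure of uncertainty, and says nothing about areas of expertise.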

Can the model actually do this? Is it the best model for the job? What kinds of problems does it handle well, and what kinds does it handle poorly?