Bayes-Up: An App for Sharing Bayesian-MCQ

Thanks for your recommendation! I have corrected the problem with the asymmetric distribution (I now compute the whole distribution) and added a second graph showing exactly what you suggested, and it looks good.

Unfortunately, for the first approach that I implemented, the MAP is not always within the 90% confidence interval (it falls outside when the MAP is 1 or 0). I agree that it is confusing and seems undesirable.

(You might need to hard-refresh the page with CTRL+SHIFT+R to see the update.)

Bayes-Up: An App for Sharing Bayesian-MCQ

You are right about the proportion of dots within the error bars. This sounds like something I would want to change.

100% is not within the error bar because these are not exactly error bars, but Bayesian estimates of where your true probability lies, using a uniform prior between 0% and 100%. If I pick a coin whose probability p of Heads is drawn uniformly between 0% and 100%, then after observing 4 Heads out of 4 throws, you should still believe that the probability of Heads is on average about 83% ( = (n_heads + 1) / (n_throws + 2) ), and an equal-tailed 75% credible interval would not contain the probability 100%.
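To make the coin example concrete, here is a minimal Python sketch. It assumes a uniform prior, so that after n Heads in n throws the posterior is Beta(n + 1, 1) with CDF x^(n + 1). The function names are only illustrative and are not taken from the app's code.

```python
from fractions import Fraction

def posterior_mean(n_heads, n_throws):
    """Posterior mean of p under a uniform prior (rule of succession)."""
    return Fraction(n_heads + 1, n_throws + 2)

def all_heads_interval(n_throws, mass=0.75):
    """Equal-tailed credible interval when every throw came up Heads.

    Assumption: uniform prior, so the posterior is Beta(n_throws + 1, 1)
    and its CDF is x ** (n_throws + 1); each quantile has a closed form.
    """
    tail = (1 - mass) / 2
    exponent = 1 / (n_throws + 1)
    return tail ** exponent, (1 - tail) ** exponent

mean = posterior_mean(4, 4)        # 5/6, about 83%
low, high = all_heads_interval(4)  # roughly 0.66 and 0.97
print(float(mean), low, high)
```

The upper bound stays strictly below 1 for any finite number of throws, which is why a 100% answer never sits inside an equal-tailed interval.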

So you need to provide more evidence that your 100% answers are indeed right 100% of the time. I agree this is confusing, and I want to change it for the better, but I am unsure how.

For all answers given with probability p, I count the number of times they were right and the number of times they were wrong. If anyone has a recommendation on how to compute the top and bottom percentages of the error bars from these counts, I would really appreciate it.
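One possible way to get the two bounds from the counts, sketched below in Python, is an equal-tailed credible interval of the Beta(right + 1, wrong + 1) posterior that a uniform prior gives. This is only one option and not necessarily what the app does. The names `beta_cdf`, `quantile` and `error_bar` are hypothetical, and the CDF uses the standard identity between the Beta CDF with integer parameters and a binomial tail, so only the standard library is needed.

```python
from math import comb

def beta_cdf(x, right, wrong):
    """CDF of Beta(right + 1, wrong + 1) at x, for integer counts.

    Identity used: P(Beta(a, b) <= x) = P(Binomial(a + b - 1, x) >= a).
    """
    n = right + wrong + 1
    return sum(comb(n, j) * x**j * (1 - x)**(n - j)
               for j in range(right + 1, n + 1))

def quantile(q, right, wrong, tol=1e-10):
    """Invert the CDF by bisection (it is increasing on [0, 1])."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf(mid, right, wrong) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def error_bar(right, wrong, mass=0.9):
    """Bottom and top of an equal-tailed credible interval."""
    tail = (1 - mass) / 2
    return quantile(tail, right, wrong), quantile(1 - tail, right, wrong)

print(error_bar(4, 0, mass=0.75))
```

With an equal-tailed interval, a proportion of exactly 0% or 100% always falls outside the bar. If the bar should contain it, one alternative is a highest-density interval, which for all-right counts runs from the lower quantile up to 1.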

Bayesian examination

I was waiting to make the app a bit better first. I made a post out of it today:

https://www.lesswrong.com/posts/7KRWCRBmvLhTMZB5Y/bayes-up-an-app-for-sharing-bayesian-mcq

Bayes-Up: An App for Sharing Bayesian-MCQ

Here you can see a graph of calibration of a user (available in the app):

https://twitter.com/le_science4all/status/1225498307348377600

And here you can see graphs of calibration for some of the quizzes of the app:

https://twitter.com/le_science4all/status/1225527782647705600

They clearly show overconfidence in the answers of the participants.

Bayesian examination

Since I read this post I have implemented this small app:

- Github: https://github.com/Stokastix/bayes-up
- Deployed at: https://bayes-up.web.app/
- Using MCQ from here: https://opentdb.com/

I make apps only as a hobby, so it is not bug-free, scalable, or great. Feel free to send advice, comments, or requests.

Several similar apps exist, and all of them had to solve the difficulty of building a set of interesting questions. I could make a small list if you are interested.

That is because I changed it to only show estimates for probabilities that have received at least 4 answers, and you have not yet answered enough questions. I am not confident that this change is good, and I might revert it.