You may be disappointed: unless you make 40+ predictions per week, it will be hard to compare weekly drift. Binary Bernoulli outcomes are much noisier than normally distributed observations, so the uncertainty estimate of the calibration is correspondingly wide (high uncertainty in the data -> high uncertainty in the regression parameters). My post 3 will be a hierarchical model which may suit your needs better, but it will maybe be a month before I get around to making that model.
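To give a rough sense of scale (with hypothetical numbers: 40 predictions in a week, 24 of them correct, and a flat Beta(1, 1) prior on the hit rate), the weekly estimate is several percentage points wide:

```python
from scipy.stats import beta

# Posterior on the weekly hit rate: flat Beta(1, 1) prior, 24 hits and 16 misses (made-up data)
post = beta(1 + 24, 1 + 16)
print(f"{post.mean():.2f} +/- {post.std():.2f}")  # ~0.60 +/- 0.07
```

A week-to-week drift smaller than roughly seven percentage points would be hard to distinguish from noise at that sample size.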
If there are many people like you, then we may try to make a hackish model that down-weights older predictions, as they are less predictive of your current calibration than newer predictions. But I will have to think long and hard to turn that into a full Bayesian model, so I am making no promises.
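If it helps to make the idea concrete, here is a rough sketch of what such down-weighting could look like, with a made-up exponential decay and hypothetical data; it is just a weighted log-likelihood, not the full Bayesian model I would want to build:

```python
import numpy as np

# Hypothetical stated probabilities and observed outcomes (1 = it happened), oldest first
p = np.array([0.6, 0.8, 0.7, 0.9])
y = np.array([0, 1, 1, 1])

age = np.arange(len(y))[::-1]   # periods since each prediction was made
half_life = 10.0                # made-up: a prediction's weight halves every 10 periods
w = 0.5 ** (age / half_life)

# Exponentially down-weighted Bernoulli log-likelihood
weighted_loglik = np.sum(w * (y * np.log(p) + (1 - y) * np.log(1 - p)))
print(weighted_loglik)
```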
It almost means 3. It means the Vaccine Efficacy (VE) is 95%.
VE is calculated this way:
$VE = 1 - \frac{v}{c}$
where v is the number of sick people in the vaccine group and c is the number of sick people in the control group (assuming equally sized groups).
So if 100 got sick in the control group and 5 in the vaccine group, then:
$VE = 1 - \frac{5}{100} = 0.95$
So it's a 95% reduction in your probability of getting COVID :)
Note that the number reported is sometimes the mode and sometimes the mean of the distribution, but beta/binomial distributions are skewed, so the mean is often lower than the mode. I have written a blog post where I redo the Pfizer analysis.
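As a rough illustration of that skew, here is a quick calculation assuming a flat Beta(1, 1) prior on the share of cases in the vaccine group and the 5-vs-100 split above (this is not the informative prior Pfizer actually used):

```python
import numpy as np
from scipy.stats import beta

# Posterior on theta, the share of cases in the vaccine group, under a flat prior
a, b = 1 + 5, 1 + 100
theta = beta(a, b)

# With equally sized groups, v / c = theta / (1 - theta), so VE = 1 - theta / (1 - theta)
ve = lambda t: 1 - t / (1 - t)

samples = theta.rvs(200_000, random_state=1)
print("posterior mean of VE   :", ve(samples).mean())          # ~0.94
print("VE at the mode of theta:", ve((a - 1) / (a + b - 2)))   # 0.95
```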
I have tried to add a paragraph about this, because I think it's a good point, and it's unlikely that you were the only one who got confused about this. Next weekend I will finish part 2, where I make a model that can track calibration independently of prediction. In that model, 61/100 correct at 60% will give a better posterior for the calibration parameter than 100/100 correct at 60%, though the likelihood of the 100/100 record will of course still be the highest.
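To see why the raw likelihood still favours the 100/100 record, here is the Bernoulli log-likelihood for both (hypothetical) records, where every prediction was stated at 60%:

```python
import numpy as np

# Bernoulli log-likelihood of two records of 100 predictions, all stated at 60%
loglik_100_of_100 = 100 * np.log(0.6)
loglik_61_of_100 = 61 * np.log(0.6) + 39 * np.log(0.4)
print(loglik_100_of_100)  # ~ -51.1: higher likelihood, but suspiciously overconfident
print(loglik_61_of_100)   # ~ -66.9: lower likelihood, but well calibrated at 60%
```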
I have gotten 10 votes, the sum of which is 4. All of you who disliked the post, can you please comment so I know why?
You mean the N'th root of 2, right? That is what I called the null predictor and divided Scott's predictions by in the code:
random_predictor = 0.5 ** len(y)  # probability of the observed outcomes under a predictor that says 50% every time
which is equivalent to $0.5^N$ where N is the total number of predictions
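For anyone following along, here is a minimal sketch (not the post's exact code, and with made-up predictions) of comparing a record against that null predictor, done in log space to avoid underflow when N is large:

```python
import numpy as np

# Hypothetical stated probabilities and observed outcomes (1 = it happened)
p = np.array([0.7, 0.9, 0.6, 0.8])
y = np.array([1, 1, 0, 1])

log_lik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))  # Bernoulli log-likelihood
log_null = len(y) * np.log(0.5)                            # log of 0.5 ** len(y)

print("likelihood ratio vs the null predictor:", np.exp(log_lik - log_null))
```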
You are absolutely right: any framework that punishes you for being right would be bad. My point is that improving your calibration helps a surprising amount and is much more achievable than "just git good", which is what improving prediction requires.
I will try to put your point into the draft when I am off work, thanks.
Thanks, and also thanks for pointing out that I had written $p(\theta \mid y)$ in a few places instead of $p(y \mid \theta)$; since everything is the Bernoulli distribution, I have changed everything to p.
I have not been consistent with my probability notation: I sometimes use upper case P and sometimes lower case p. In future posts I will try to use the same notation as Andrew Gelman, which is Pr for things that are probabilities (numbers), such as $\Pr(y=1)=0.7$, and p for distributions, such as $p \sim N(0,2)$. However, since this is my first post, I am afraid that editing it will waste the moderators' time, as they will have to read it again to check for trolling. What is the proper course of action?