What does it mean to raise the temperature by a fixed amount before we've defined temperature?
Without having read the book or knowing the history of temperature, an obvious property I'd want my temperature scale to have is that the temperature of a mixture of identical substances at different temperatures is the weighted average of its constituent parts.
Then you can easily recreate the Celsius scale. Let the temperature of freezing water be 0, and the temperature of boiling water be 100.
Combine 99 parts freezing water with one part boiling water, and mix well (and quickly, in an insulated environment). That's 1 degree Celsius.
Repeat for the other 98 combinations.
If we're lucky, the mercury in our thermometer moves up by a constant amount each time, and we can skip the mixing when calibrating thermometers in the future. If we're unlucky, it doesn't, and we need to painstakingly calibrate every thermometer this way. If we're really unlucky, a scale that works for one substance doesn't work for another, and then we need to rethink this whole temperature thing again.
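To make the mixing rule concrete, here's a toy sketch (my own illustration, assuming identical substances, perfect mixing, and no heat loss):

```python
def mix_temperature(parts_cold, t_cold, parts_hot, t_hot):
    """Temperature of a mixture, defined as the weighted
    average of its parts: (m1*T1 + m2*T2) / (m1 + m2)."""
    return (parts_cold * t_cold + parts_hot * t_hot) / (parts_cold + parts_hot)

# Calibrate every Celsius mark: k parts boiling water (100)
# mixed with (100 - k) parts freezing water (0) gives k degrees.
for k in range(101):
    assert mix_temperature(100 - k, 0.0, k, 100.0) == k
```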
Is this standard for the MBTI?
The test we used did; I have very little further knowledge of the MBTI beyond what was discussed on this course.
That assumes you're in a better position to make such investments than the charity?
It's a good metaphor, but I think one important aspect it misses is overfitting: when you have a lot of parameters, the NN can literally memorise a small training set until it gets 100% on the training set and 0% on the test set, whereas a smaller model is forced to learn the underlying structure and so generalises better.
Hence larger models need a much larger training set even to match smaller models, which is another disadvantage of larger models (on top of the higher per-token training and inference costs).
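To see the memorisation effect concretely, here's a toy illustration (my own sketch, using a high-degree polynomial as a stand-in for an over-parameterised network; the exact numbers depend on the noise seed):

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny training set: 10 noisy samples of an underlying linear trend.
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + rng.normal(0, 0.1, size=10)
x_test = np.linspace(0, 1, 100)
y_test = 2 * x_test

for degree in (1, 9):  # small model vs. one with enough parameters to memorise
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")

# The degree-9 fit passes through every training point (train MSE ~ 0)
# but oscillates between them, so it generalises worse than degree 1.
```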
OK, in that case you're basically just referring to the SSA vs. SIA debate. That's an old chestnut, and either way leads to seemingly paradoxical results.
There is no other deck of cards here. There's no copy of me to compare myself to and say, "how curious, that looks exactly like me."
That's like saying that every game of cards must be rigged, because otherwise the chance of having this particular card order is minuscule...
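For concreteness, here's the arithmetic behind "minuscule":

```python
import math

orderings = math.factorial(52)  # number of possible deck orderings, ~8.07e67
print(f"P(this exact order) = 1 / {orderings:.3e} = {1 / orderings:.2e}")
```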
You give as an example a situation which is inherently not repeatable, where we're forced to make do with reasoning under significant uncertainty and very limited information to decide what's going on out of an incredibly wide hypothesis space. You correctly point out that this is hard.
You then say that in a situation where we can perform repeated experiments to exclude one hypothesis, p-values work OK.
But in that exact situation, Bayesian reasoning works fine. Sure, you might not agree on which alternative hypothesis is true, but so long as both of you agree there are any alternative hypotheses that make it more likely to see the given results, after a few rounds you'll have extremely low credence in the original hypothesis.
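A quick sketch of that convergence (my own illustration; the prior and the likelihood ratio are made-up numbers):

```python
# H0 is the original hypothesis; suppose the agreed-on alternative makes
# each round's observed result twice as likely as H0 does.
posterior_h0 = 0.99          # start out heavily favouring H0
likelihood_ratio = 2.0       # P(data | alternative) / P(data | H0) per round

for round_num in range(1, 21):
    odds_h0 = posterior_h0 / (1 - posterior_h0)
    odds_h0 /= likelihood_ratio           # Bayes' rule in odds form
    posterior_h0 = odds_h0 / (1 + odds_h0)
    print(f"round {round_num:2d}: P(H0) = {posterior_h0:.5f}")

# Even a modest 2x likelihood ratio drives P(H0) below 1e-4 within 20 rounds.
```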