Recently I saw that Hypermind is offering a prediction market on which threshold BTC will hit first: $40k or $60k? You can find the market on this list.
I find this funny because for this kind of question it's going to be a very good approximation to assume BTC is a martingale, and then the optional stopping theorem gives the answer to the question immediately: if BTC is currently priced at $40k < X < $60k then the probability of hitting $40k first is ($60k - X)/($20k). Since BTC itself is going to be priced much more efficiently than this small volum... (read more)
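As a sanity check on the optional stopping claim, here's a quick Monte Carlo sketch (the symmetric random walk and step size are my own toy stand-in for a martingale price process):

```python
import random

def prob_hit_lower_first(x0, lo=40.0, hi=60.0, step=0.5, n_paths=5000, seed=0):
    """Monte Carlo estimate of the probability that a driftless symmetric
    random walk (a simple discrete-time martingale) started at x0 hits
    `lo` before `hi`.  Optional stopping predicts (hi - x0) / (hi - lo)."""
    rng = random.Random(seed)
    hits_lo = 0
    for _ in range(n_paths):
        x = x0
        while lo < x < hi:
            x += step if rng.random() < 0.5 else -step
        if x <= lo:
            hits_lo += 1
    return hits_lo / n_paths

print(prob_hit_lower_first(50.0))  # close to (60 - 50) / 20 = 0.5
print(prob_hit_lower_first(45.0))  # close to (60 - 45) / 20 = 0.75
```

The estimate matches ($60k - X)/$20k regardless of the step size, which is the point: only the martingale property matters.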
You need "if the number on this device looks to me like the one predicted by theory, then the theory is right" just like you need "if I run a billion experiments and the frequency looks to me like the probability predicted by the theory, then the theory is right".
You can say that you're trying to solve a "downward modeling problem" when you try to link any kind of theory you have to the real world. The point of the question is that in some cases the solution to this problem is more clear to us than in others, and in the probabilistic case we seem to be using some... (read more)
Thank you for the link. I'm curious what the table would look like if we examined the top 10 or 20 cities instead of just those tied for the top position.
I think this is quite a tall order for ancient times, but a source I've found useful is this video by Ollie Bye on YouTube. It's possible to move his estimates around by factors of 2 or so at various points, but I think they are correct when it comes to the order of magnitude of historical city populations.
Who does "they" refer to in this sentence? It could mean two very different things.
Edited the... (read more)
I'd be happy to be corrected if I'm wrong. Do you have more precise numbers?
There's obviously quite a bit of uncertainty when it comes to ancient city populations, but Wikipedia has a nice aggregation of three different sources which list the largest city in the world at various times in history. Estimates of city populations can vary by a factor of 2 or more across different sources, but the overall picture is that sometimes the largest city in the world was Chinese and sometimes it was not.
... (read more)My reference point for technological regression after the fa
This post seems to be riddled with inaccuracies and misleading statements. I'll just name a few here, since documenting all of them would take more time than I'm willing to spare.
For most of history, China was the center of civilization. It had the biggest cities, the most complex government, the highest quality manufacturing, the most industrial capacity, the most advanced technology, the best historical records and the largest armies. It dominated East Asia at the center of an elaborate tribute system for a thousand years.
This is simply false. China ... (read more)
It's true that both of these outcomes have a small chance of not-happening. But with enough samples, the outcome can be treated for all intents and purposes as a certainty.
I agree with this in practice, but the question is philosophical in nature and this move doesn't really help you get past the "firewall" between probabilistic and non-probabilistic claims at all. If you don't already have a prior reason to care about probabilities, results like the law of large numbers or the central limit theorem can't convince you to care about it because they are a... (read more)
So I agree with most of what you say here, and as a Metaculus user I have some sympathy for trying to make proper scoring rules the epistemological basis of "probability-speak". There are some problems with it, like different proper scoring rules give different incentives to people when it comes to distributing finite resources across many questions to acquire info about them, but broadly I think the norm of scoring models (or even individual forecasters) by their Brier score or log score and trying to maximize your own score is a good norm.
There are proba... (read more)
Negations of finitely observable predicates are typically not finitely observable. [0,0.5) is finitely observable as a subset of [0,1], because if the true value is in [0,0.5) then there necessarily exists a finite precision with which we can know that. But its negation, [0.5,1], is not finitely observable, because if the true value is exactly 0.5, no finite-precision measurement can establish with certainty that the value is in [0.5,1], even though it is.
Ah, I didn't realize that's what you mean by "finitely observable" - something like "if the propos... (read more)
What I'm sneaking in is that both the σ-algebra structure and the topological structure on a scientifically meaningful space ought to be generated by the (finitely) observable predicates. In my experience, this prescription doesn't contradict with standard examples, and situations to which it's "difficult to generalize" feel confused and/or pathological until this is sorted out.
It's not clear to me how finitely observable predicates would generate a topology. For a sigma algebra it's straightforward to do the generation because they are closed under com... (read more)
I think you can justify probability assessments in some situations using Dutch book style arguments combined with the situation itself having some kind of symmetry which the measure must be invariant under, but this kind of argument doesn't generalize to any kind of messy real world situation in which you have to make a forecast on something, and it still doesn't give some "physical interpretation" to the probabilities beyond "if you make bets then your odds have to form a probability measure, and they better respect the symmetries of the physical theory y... (read more)
Believing in the probabilistic theory of quantum mechanics means we expect to see the same distribution of photon hits in real life.
No it doesn't! That's the whole point of my question. "Believing the probabilistic theory of quantum mechanics" means you expect to see the same distribution of photon hits with a very high probability (say ), but if you have not justified what the connection of probabilities to real world outcomes is to begin with, that doesn't help us. Probabilistic claims just form a closed graph of reference in which they only refer ... (read more)
You spend a few paragraphs puzzling about how a probabilistic theory could be falsified. As you say, observing an event in a null set or a meagre set does not do the trick. But observing an event which is disjoint from the support of the theory's measure does falsify it. Support is a very deep concept; see this category-theoretic treatise that builds up to it.
You can add that as an additional axiom to some theory, sure. It's not clear to me why that is the correct notion to have, especially since you're adding some extra information about the topology o... (read more)
See my response to a similar comment below.
Here I'm using "Bayesian" as an adjective which refers to a particular interpretation of the probability calculus, namely one where agents have credences about an event and they are supposed to set those credences equal to the "physical probabilities" coming from the theory and then make decisions according to that. It's not the mere acceptance of Bayes' rule that makes someone a Bayesian - Bayes' rule is a theorem so no matter how you interpret the probability calculus you're going to believe in it.
With this sense of "Bayesian", the epistemic content adde... (read more)
The question is about the apparently epiphenomenal status of the probability measure and how to reconcile that with the probability measure actually adding information content to the theory. This answer is obviously "true", but it doesn't actually address my question.
This is not true. You can have a model of thermodynamics that is statistical in nature and so has this property, but thermodynamics itself doesn't tell you what entropy is, and the second law is formulated deterministically.
As I see it, probability is essentially just a measure of our ignorance, or the ignorance of any model that's used to make predictions. An event with a probability of 0.5 implies that in half of all situations where I have information indistinguishable from the information I have now, this event will occur; in the other half of all such indistinguishable situations, it won't happen.
Here I think you're mixing two different approaches. One is the Bayesian approach: it comes down to saying probabilistic theories are normative. The question is how to reconc... (read more)
I don't know what you mean here. One of my goals is to get a better answer to this question than what I'm currently able to give, so by definition getting such an answer would "help me achieve my goals". If you mean something less trivial than that, well, it also doesn't help me to achieve my goals to know if the Riemann hypothesis is true or false, but RH is nevertheless one of the most interesting questions I know of and definitely worth wondering about.
I can't know how an answer I don't know about would impact my beliefs or behavior, but my guess is tha... (read more)
To elaborate on the information acquisition cost point: small pieces of information won't be worth tying up a big amount of capital for.
If you have a company worth $1 billion and you have very good insider info that a project of theirs that the market implicitly values at $10 million is going to flop, then if the only way you can express that opinion is to short the stock of the whole company, it's likely not even worth it. Even with 10% margin you'd at best be making a 10% return on capital over the time horizon in which the market figures out the project is bad ... (read more)
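The arithmetic here (with made-up round numbers) is simple enough to spell out:

```python
def short_return_on_capital(company_value, mispricing, margin_requirement):
    """Best-case profit from shorting the whole company to exploit one
    mispriced project, as a fraction of the capital tied up as margin.

    Shorting the company requires margin_requirement * company_value in
    capital, and the most this particular piece of information can earn
    the short seller is `mispricing`."""
    capital_tied_up = margin_requirement * company_value
    return mispricing / capital_tied_up

# $1B company, $10M project, 10% margin: at best a 10% return on capital.
print(short_return_on_capital(1e9, 10e6, 0.10))  # 0.1
```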
Ah, I see. I missed that part of the post for some reason.
In this setup the update you're doing is fine, but I think measuring the evidence for the hypothesis in terms of "bits" can still mislead people here. You've tuned your example so that the likelihood ratio is equal to two and there are only two possible outcomes, while in general there's no reason for those two values to be equal.
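To make the distinction explicit: the strength of evidence in bits is the base-2 log of the likelihood ratio, which equals 1 exactly in the special case where the ratio is 2. A minimal sketch (the 0.9 vs 0.3 numbers are purely illustrative):

```python
from math import log2

def bits_of_evidence(likelihood_ratio):
    """Evidence strength in bits: log base 2 of the likelihood ratio
    P(observation | H1) / P(observation | H2)."""
    return log2(likelihood_ratio)

# A likelihood ratio of exactly 2 is the special case worth exactly 1 bit...
print(bits_of_evidence(2.0))        # 1.0
# ...but in general the ratio need not be 2, e.g. P = 0.9 vs P = 0.3:
print(bits_of_evidence(0.9 / 0.3))  # ~1.585 bits
```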
This is a rather pedantic remark that doesn't have much relevance to the primary content of the post (EDIT: it's also based on a misunderstanding of what the post is actually doing - I missed that an explicit prior is specified which invalidates the concern raised here), but
... (read more)If such a coin is flipped ten times by someone who doesn't make literally false statements, who then reports that the 4th, 6th, and 9th flips came up Heads, then the update to our beliefs about the coin depends on what algorithm the not-lying[1] reporter used to decide to report those
Yeah, Neyman's proof of Laplace's version of the rule of succession is nice. The reason I think this kind of approach can't give the full strength of the conjugate prior approach is that I think there's a kind of "irreducible complexity" to computing for non-integer values of . The only easy proof I know goes through the connection to the gamma function. If you stick only to integer values there are easier ways of doing the computation, and the linearity of expectation argument given by Neyman is one way to do it.
One concrete example of the ru... (read more)
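For reference, a small sketch checking Laplace's closed-form rule (h+1)/(n+2) against the conjugate Beta prior computation, which is where the gamma function enters (helper names are mine):

```python
from math import gamma

def laplace_rule(heads, n):
    """Laplace's rule of succession: with a uniform prior on the bias,
    after `heads` heads in n flips, P(next flip is heads) = (h+1)/(n+2)."""
    return (heads + 1) / (n + 2)

def beta_posterior_mean(heads, n):
    """Same quantity via the conjugate Beta(1,1) prior: the posterior is
    Beta(h+1, n-h+1), and its mean is computed here from the identity
    B(a,b) = Gamma(a)Gamma(b)/Gamma(a+b)."""
    a, b = heads + 1, n - heads + 1
    beta = lambda x, y: gamma(x) * gamma(y) / gamma(x + y)
    return beta(a + 1, b) / beta(a, b)

print(laplace_rule(3, 10))         # 4/12 = 0.333...
print(beta_posterior_mean(3, 10))  # same value
```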
Answering your questions in order:
What matters is that it's something you can invest in. Choosing the S&P 500 is not really that important in particular. There doesn't have to be a single company whose stock is perfectly correlated with the S&P 500 (though nowadays we have ETFs which more or less serve this purpose) - you can simply create your own value-weighted stock index and rebalance it on a daily or weekly basis to adjust for the changing weights over time, and nothing will change about the main arguments. This is actually what the authors
Over 20 years that's possible (and I think it's in fact true), but the paper I cite in the post gives some data which makes it unlikely that the whole past record is outperformance. It's hard to square 150 years of over 6% mean annual equity premium with 20% annual standard deviation with the idea that the true stock return is actually the same as the return on T-bills. The "true" premium might be lower than 6% but not by too much, and we're still left with more or less the same puzzle even if we assume that.
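The back-of-the-envelope version of that argument, assuming roughly independent annual returns (a simplification):

```python
from math import sqrt

def t_stat_of_premium(mean_premium, annual_std, years):
    """How many standard errors the observed mean equity premium sits
    above zero, treating annual returns as roughly independent."""
    standard_error = annual_std / sqrt(years)
    return mean_premium / standard_error

# 6% mean premium, 20% annual standard deviation, 150 years of data:
print(t_stat_of_premium(0.06, 0.20, 150))  # about 3.7 standard errors
```

A ~3.7-sigma result is hard to attribute to a true premium of zero, which is the point of the comment above.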
That's alright, it's partly on me for not being clear enough in my original comment.
I think information aggregation from different experts is in general a nontrivial and context-dependent problem. If you're trying to actually add up different forecasts to obtain some composite result it's probably better to average probabilities; but aside from my toy model in the original comment, "field data" from Metaculus also backs up the idea that on single binary questions the median forecast or the log-odds average consistently beats the probability average.
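For anyone who wants to see how the two pooling rules come apart, a minimal sketch (the 1% and 50% forecasts are an arbitrary illustration):

```python
from math import log, exp

def mean_probability(ps):
    """Pool forecasts by averaging the probabilities directly."""
    return sum(ps) / len(ps)

def log_odds_average(ps):
    """Pool forecasts by averaging in log-odds space, then map the mean
    logit back to a probability with the sigmoid."""
    logits = [log(p / (1 - p)) for p in ps]
    mean_logit = sum(logits) / len(logits)
    return 1 / (1 + exp(-mean_logit))

# Two forecasters at 1% and 50%: the two pooling rules disagree a lot.
ps = [0.01, 0.50]
print(mean_probability(ps))   # 0.255
print(log_odds_average(ps))   # ~0.091
```

The log-odds average gives far more weight to the extreme forecast, which is exactly why the two rules can rank differently under proper scoring.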
I agree with Simo... (read more)
I don't know what you're talking about here. You don't need any nonlinear functions to recover the probability. The probability implied by the market price $X_t$ is just $X_t$ itself, and the probability you should forecast having seen $X_t$ is therefore $\mathbb{E}[X_T \mid X_t] = X_t$, since $X$ is a martingale.
I think you don't really understand what my example is doing. The process I defined is not a Brownian motion and its increments are not Gaussian; it's a nonlinear transform of a drift-diffusion process by a sigmoid which takes valu... (read more)
Thanks for the comment - I'm glad people don't take what I said at face value, since it's often not correct...
What I actually maximized is (something like, though not quite) the expected value of the logarithm of the return, i.e. what you'd do if you used the Kelly criterion. This is the correct way to maximize long-run expected returns, but it's not the same thing as maximizing expected returns over any given time horizon.
My computation of is correct, but the problem comes in elsewhere. Obviously if your goal is to just maximize ex... (read more)
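To spell out the distinction with the standard drift-diffusion approximation (illustrative numbers, not the post's actual parameters): expected log growth at leverage L is L*mu - (L*sigma)^2/2, maximized at the Kelly leverage mu/sigma^2, while expected arithmetic return just keeps increasing in L.

```python
def expected_log_growth(leverage, mu, sigma):
    """Expected log growth rate of a continuously rebalanced portfolio
    with the given leverage, under a drift-diffusion approximation:
    L*mu - (L*sigma)^2 / 2."""
    return leverage * mu - 0.5 * (leverage * sigma) ** 2

mu, sigma = 0.06, 0.20
kelly = mu / sigma ** 2
print(kelly)                                      # 1.5
print(expected_log_growth(kelly, mu, sigma))      # 0.045 -- the maximum
print(expected_log_growth(2 * kelly, mu, sigma))  # 0.0  -- overbetting wipes out growth
```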
The experts in my model are designed to be perfectly calibrated. What do you mean by "they are overconfident"?
I did a Monte Carlo simulation for this on my own whose Python script you can find on Pastebin.
Consider the following model: there is a bounded martingale taking values in and with initial value . The exact process I considered was a Brownian motion-like model for the log odds combined with some bias coming from Ito's lemma to make the sigmoid transformed process into a martingale. This process goes on until some time T and then the event is resolved according to the probability implied by . You have n "experts"... (read more)
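Here's a minimal sketch of that kind of process (not the actual Pastebin script; parameters and names are mine). By Ito's lemma, adding the drift -(1/2)(1 - 2*sigmoid(Z))*v^2 to the Brownian log odds Z makes sigmoid(Z) a martingale:

```python
import random
from math import exp, sqrt

def sigmoid(z):
    return 1 / (1 + exp(-z))

def simulate_price(z0=0.0, vol=1.0, t_max=1.0, n_steps=500, rng=None):
    """Simulate log odds Z as a Brownian motion with the Ito correction
    drift -0.5 * (1 - 2*sigmoid(Z)) * vol**2, which makes sigmoid(Z)
    a martingale; return the terminal implied probability."""
    rng = rng or random.Random()
    dt = t_max / n_steps
    z = z0
    for _ in range(n_steps):
        drift = -0.5 * (1 - 2 * sigmoid(z)) * vol ** 2
        z += drift * dt + vol * sqrt(dt) * rng.gauss(0, 1)
    return sigmoid(z)

# Martingale check: mean terminal value should be close to sigmoid(z0) = 0.5.
rng = random.Random(0)
samples = [simulate_price(rng=rng) for _ in range(1000)]
print(sum(samples) / len(samples))  # close to 0.5
```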
NOTE: Don't believe everything I said in this comment! I elaborate on some of the problems with it in the responses, but I'm leaving this original comment up because I think it's instructive even though it's not correct.
There is a theoretical account for why portfolios leveraged beyond a certain point would have poor returns even if prices follow a random process with (almost surely) continuous sample paths: leverage decay. If you could continuously rebalance a leveraged portfolio this would not be an issue, but if you can't do that then leverage exhibits ... (read more)
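A toy illustration of the decay under discrete rebalancing (the flat-but-volatile return series is contrived to make the point):

```python
def terminal_wealth(daily_returns, leverage):
    """Wealth of a daily-rebalanced portfolio with the given leverage,
    starting from 1."""
    wealth = 1.0
    for r in daily_returns:
        wealth *= 1 + leverage * r
    return wealth

# A flat but volatile market: +10% followed by about -9.09% leaves the
# unleveraged investor exactly even, but the 2x portfolio decays.
returns = [0.10, -0.10 / 1.10] * 50
print(terminal_wealth(returns, 1.0))  # 1.0
print(terminal_wealth(returns, 2.0))  # well below 1.0
```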
I think there's some kind of miscommunication going on here, because I think what you're saying is trivially wrong while you seem convinced that it's correct despite knowing about my point of view.
No it doesn't. It weighs them by price (i.e. marginal utility = production opportunity cost) at the quantities consumed. That is not a good proxy for how important they actually were to consumers.
Yes it is - on the margin. You can't hope for it to be globally good because of the argument I gave, but locally of course you can, that's what marginal utility means! T... (read more)
Strong upvote for the comment. I think the situation is even worse than what you say: the fact is that had Petrov simply reported the inaccurate information in his possession up the chain of command as he was being pressured to do by his own subordinates, nobody would have heard of his name and nobody would have blamed him for doing his job. He could have even informed his superiors of his personal opinion that the information he was passing to them was inaccurate and left them to make the final decision about what to do. Not only would he have not been bl... (read more)
The reason I bring up the weighting of GDP growth is that there are some "revolutions" which are irrelevant and some "revolutions" which are relevant from whatever perspective you're judging "craziness". In particular, it's absurd to think that the year 2058 will be crazy because suddenly people will be able to drink wine manufactured in the year 2058 at a low cost.
Consider this claim from your post:
... (read more)When we see slow, mostly-steady real GDP growth curves, that mostly tells us about the slow and steady increase in production of things which haven’t been revo
In addition, I'm confused about how you can agree with both my comment and your post at the same time. You explicitly say, for example, that
Also, "GDP (as it's actually calculated) measures production growth in the least-revolutionized goods" still seems like basically the right intuitive model over long times and large changes, and the "takeaways" in the post still seem correct.
but this is not what GDP does. In the toy model I gave, real GDP growth perfectly captures increases in utility; and in other models where it fails to do so the problem is not that... (read more)
I think in this case omitting the discussion about equivalence under monotonic transformations leads people in the direction of macroeconomic alchemy - they try to squeeze information about welfare from relative prices and quantities even though it's actually impossible to do it.
The correct way to think about this is probably to use von Neumann's approach to expected utility: pick three times in history, say ; assume that where is the utility of living around time and ask people fo... (read more)
There is a standard reason why real GDP growth is defined the way it is: it works locally in time, and that's really the best you can ask for from this kind of measure. If you have an agent with utility function $U(c_1, \ldots, c_n)$ defined over goods with no explicit time dependence, you can express the derivative of utility with respect to time as

$$\frac{dU}{dt} = \sum_i \frac{\partial U}{\partial c_i} \frac{dc_i}{dt}.$$

If you divide both sides by the marginal utility of some good taken as the numeraire, say the first one, then you get

$$\frac{1}{\partial U / \partial c_1} \frac{dU}{dt} = \sum_i p_i \frac{dc_i}{dt},$$

where $p_i$ is the pri... (read more)
It's not the logarithm of the BTC price that is a martingale, it's the BTC price itself, under a risk-neutral measure if that makes you more comfortable (since BTC derivatives would be priced by the same risk-neutral measure pricing BTC itself).