All of rossry's Comments + Replies

I don't know. (As above, "When [users] tell you exactly what they think is wrong and how to fix it, they are almost always wrong.")

A scoring rule that's proper in linear space (as you say, "scored on how close their number is to the actual number") would accomplish this -- either for scoring point estimates, or distributions. I don't think it's possible to extract an expected value from a confidence interval that covers orders of magnitude, so I expect that would work less well.
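For concreteness, here is a minimal sketch (toy numbers of mine, not from the discussion) of why a quadratic score in linear space elicits the expected value rather than the median:

```python
# A quadratic score S(guess, actual) = -(guess - actual)**2 is proper for
# the mean in linear space: expected score is maximized by guessing E[X].

def expected_score(guess, outcomes):
    """Average quadratic score over (value, probability) pairs."""
    return sum(p * -(guess - x) ** 2 for x, p in outcomes)

# Bimodal belief spanning orders of magnitude: 60% chance of 1, 40% of 100.
belief = [(1, 0.6), (100, 0.4)]
mean = sum(p * x for x, p in belief)  # 40.6

# Guessing the mean beats the median (1) and nearby off-mean guesses:
assert expected_score(mean, belief) > expected_score(1, belief)
assert expected_score(mean, belief) > expected_score(41, belief)
```

The same check fails for a log score on a confidence interval over orders of magnitude, which is the sense in which the expected value can't be extracted from it.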

I think this argument doesn't follow:

There is hardly any difference between taking a life and not preventing a death. The end result is mostly the same. Thus, I should save the lives of as many humans as I can.

While "the end result is mostly the same" is natural to argue in terms of moral-consequentialist motivations, this AI only cares about [not killing humans] instrumentally. So what matters is what humans will think about [taking a life] versus [not preventing a death]. And there, there's a huge difference!

  1. Agree that causing deaths that are attri
... (read more)

Agreed on all points!

In particular, I don't have any disagreement with the way the epistemic aggregation is being done; I just think there's something suboptimal in the way the headline number (in this case, for a count-the-number-of-humans domain) is chosen and reported. And I worry that the median-ing leads to easily misinterpreted data.

For example, if a question asked "How many people are going to die from unaligned AI?", and the community's true belief was "40% to be everyone and 60% to be one person", and that was reported as "the Metaculus community ... (read more)
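To put hypothetical numbers on that worry (the 8-billion population figure and the exact outcomes are my assumptions, not from the comment):

```python
# A bimodal community belief whose median headline wildly misrepresents
# the expected death toll.

population = 8_000_000_000
belief = [(1, 0.6), (population, 0.4)]  # (outcome, probability) pairs

# The cumulative probability crosses 0.5 at the low outcome:
median = 1
mean = sum(p * x for x, p in belief)  # ~3.2 billion

# A "headline" median of 1 hides a multi-billion expected toll.
assert median == 1 and mean > 3_000_000_000
```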

Not taking extrapolation far enough!

4 hours ago, your expected value of a point was $0. In an hour, it increased to $0.2, implying a ~20% chance it pays $1 (plus some other possibilities). By midnight, extrapolated expected value will be $4.19, implying a ~100% chance to pay $1, plus ~45% chance that they'll make good on the suggestion of paying $10 instead, plus some other possibilities...

I'm confused why these would be described as "challenge" RCTs, and worry that the term will create broader confusion in the movement to support challenge trials for disease. In the usual clinical context, the word "challenge" in "human challenge trial" refers to the step of introducing the "challenge" of a bad thing (e.g., an infectious agent) to the subject, to see if the treatment protects them from it. I don't know what a "challenge" trial testing the effects of veganism looks like?

(I'm generally positive on the idea of trialing more things; my confusion+comment is just restricted to the naming being proposed here.)

Thanks, I agree with this and it's probably not good branding anyway. I was thinking the "challenge" was just doing the intervention (e.g. being vegan), but agree that the framing is confusing since it refers to something different in the clinical context. I will edit my shortforms to reflect this updated view.

Oh, yeah, I can't vouch for / walk through the operations side (not having done it myself). I have had the misfortune of looking at ways to get a Medallion certification outside the US, and it's not pretty (I failed).

I don't know.

It's worth noting that the terms are (intentionally?) designed to be generous and to pay over market-rate. I suspect this is a feature, not a bug, and the terms are intended to be a mild subsidy to the buyer. The US has many policies that subsidize US citizens and exclude non-Americans; this one doesn't stand out to me as being particularly unusual. (I don't particularly endorse this policy posture, but I note that it exists.)

Brainstorming a bit, it seems plausible that the program could include non-citizen taxpayers. If it was truly open to non-taxpayers, then it would amount to a subsidy of non-residents with citizen+resident tax dollars, which the US government is mostly opposed to, as policy.

I don't know.

It's worth noting that the terms are (intentionally?) designed to be generous and to pay over market-rate, and this would be harder to do if there were a $10mln/person/year cap and most of the benefit would flow to wealthy Americans who can finance their debt investments with secured borrowing at scale.

If I had to guess, I'd note that this is pretty similar to contribution/person/yr caps on other savings methods the government subsidizes, e.g., with tax advantages -- 401(k) accounts and IRA accounts. Insofar as it's primarily a social scheme to i... (read more)

I'm sorry -- I don't understand how your comment responds to mine. I pointed to the fact that Omicron outcompeting Delta without being descended from Delta indicated that a successor to Omicron could perhaps not be descended from Omicron. In particular, I agree with you that Omicron will become the dominant variant almost everywhere.

One minor detail: It is implausible that Omicron's competitive advantage is primarily derived from an increased R0 (that would give it a higher R0 than measles); rather, its observed fitness against the competition is more easily explained by some measure of immunity evasion (which won't be measured in increased R0).

Omicron will soon become the dominant variant almost everywhere, so subsequent variants will probably branch off it.

I don't think you're wrong, but it is worth noting that Omicron itself violated this guess; it is descended from the original strain, not from any other Greek-lettered variant.

But note the R0 for Omicron. It seems able to blow the competition away in any location where it establishes itself.

It is pronounced identically to the adjective "new".

I give the WHO kudos for picking omicron instead of nu. (Actually, I'm pretty shocked that they did something this common-sensical, and notice that I am surprised.) I spent Friday morning (= Thursday evening, US time) talking out loud with colleagues about the new nu variant and after like two attempts to clarify what the f--- I actually meant, multiple people independently joked that it was so bad we should just skip nu and go to omicron.

If you've only ever discussed it in text, you're underestimating how bad it is to use "nu" as an adjective in verbal conversation.

Well yes, I think that would be implied by the shock.
Raphaël S:
Just curious, what's the problem with "nu" in verbal conversation ?

Both were sent to the hospital but it is unclear whether this was part of a standard procedure or if they were ill enough to need to go.

Testing positive was sufficient to get them sent to the hospital, and they had mandatory PCR testing every ~3 days; this is no evidence about their symptoms.

(I recently went through HK arrival quarantine -- in the same hotel, no less -- and researched the operating procedure runbook out of personal interest.)

Internet is a great place. Thanks for that info!

I think I was unclear. I meant that if you did correctly estimate the number of cases, you'd need many times that many courses of medicine "in the system" to make sure that no one worried about running out in their part of the system, so that no one started hoarding where they were. I estimated that about ten times as many courses as you naively needed would about do it.

If our standard is non-scarcity for prophylactic prescription for close contacts, then 10x the expected number of close contacts in your "part of the system"...

(To be clear, this is just a statement about hoarding/availability dynamics, not about when "things should go back to normal".)

On the other hand, it could be handy to use your more stringent definition of non-scarcity whenever I need a reason not to do something. If you feel like scribbling some calculations to back that up, I'll be forever in your debt.
So I guess my task is to adjust my definition of non-scarcity "for me"? I don't want to set the bar too high for myself! Alternatively this could be more of a Potter Stewart situation - I'll know it when I see it.

Right, I agree that for the update aggregation is better than (but still lossy). And the thing that affects is the weighting in the average -- so if then the s don't matter! (which is a possible answer to your question of "how much aggregation/disaggregation can you do?")

But yeah if is very different from then I don't think there's any way around it, because the effective could be one or the other depending on what the are.

The framing of this issue that makes the most sense to me is " is a function of ".

When I look at it this way, I disagree with the claim (in "Mennen's ABC example") that "[Bayesian updating] is not invariant when we aggregate outcomes" -- I think it's clearer to say the Bayesian updating is not well-defined when we aggregate outcomes.

Additionally, in "Interpreting Bayesian Networks", the framing seems to make it clearer that the problem is that you used  for  -- but they're not the same thing! ... (read more)

I like this framing. This seems to imply that summarizing beliefs and summarizing updates are two distinct operations. For summarizing beliefs we can still resort to summing: (p1, p2, p3) → (p1, p2 + p3). But for summarizing updates we need to use an average, which in the absence of prior information will be a simple average: (e1, e2, e3) → (e1, (e2 + e3)/2). Annoyingly, and as you point out, this is not a perfect summary: we are definitely losing information here, and subsequent updates will not be as exact as if we were working with the disaggregated odds. I still find it quite disturbing that the update after summarizing depends on prior information - but I can't see how to do better than this, pragmatically speaking.
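A small numeric sketch of this belief-vs-update summarization (toy numbers of mine): sum probabilities to summarize the belief, simple-average the update factors to summarize the update, and check when the shortcut is exact.

```python
# Three outcomes; the last two get merged. p's are prior probabilities,
# e's are per-outcome update factors (likelihoods of the evidence).

def normalize(v):
    s = sum(v)
    return [x / s for x in v]

p = [0.2, 0.4, 0.4]  # prior belief (equal mass on outcomes 2 and 3)
e = [2.0, 1.0, 3.0]  # update factors

# Exact: update in disaggregated space, then summarize the belief by summing.
exact = normalize([pi * ei for pi, ei in zip(p, e)])
exact_agg = [exact[0], exact[1] + exact[2]]

# Lossy shortcut: summarize first, then apply the simple-average update.
p_agg = [p[0], p[1] + p[2]]
e_agg = [e[0], (e[1] + e[2]) / 2]
lossy = normalize([p_agg[0] * e_agg[0], p_agg[1] * e_agg[1]])

# The simple average is exact precisely when p[1] == p[2], as here;
# perturb the prior (e.g. p = [0.2, 0.6, 0.2]) and the two diverge.
assert all(abs(a - b) < 1e-12 for a, b in zip(exact_agg, lossy))
```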

I would guess that having that many courses of Paxlovid "in the system" would be about an order of magnitude too low for true non-scarcity. (See: how many vaccine doses needed to be in the system before you could assume that there was going to be adequate supply anywhere you might try to look?)

We could use this heuristic to estimate "true" prevalence: []

What's the breakdown of fields by whether they have a pre-print server or not? (Which of the ones most important to human progress are in the good state?)

I'm most familiar with economics, where there's no server, but there's a universally-journal-respected right to publish the pre-print on your personal site, which ends up in the "it's free if you Google for it" equilibrium in practice.

Tangent: if you're on a page summarizing a paper, the google scholar plugin [] will look for a full version for you and bring you there in one click.

Yeah, that all checks out from my publishing experiences with them. (I've co-authored one paper for an Elsevier journal and have another out for review with them.)

As I say in my reply to Viliam's comment nephew to yours: I'm confused by the OP's choice to present the profit margin figure so prominently in "Why does Sci-Hub exist?", not discuss the true objections about net-negative spending, and then choose comment guidelines that say "Aim to explain, not persuade". The margin is striking and persuasive, but (assuming they agree with your model of the worl... (read more)

I see; that helps to make sense of the 37% figure, thanks!

Given that explanation, though, I'm confused by the OP's choice to present the 37% number in the segment "Why does Sci-Hub exist?". Given that the number as presented isn't their true objection, it feels like a fact being presented to persuade, rather than to inform. Given that the first comment guideline (in the set they chose to apply) is "Aim to explain, not persuade", I don't understand this choice?

(Again, I don't have any problem with the content of the post; I just wanted to register confusion... (read more)

Hopefully my comment above makes this more clear now, but the 37% is supposed to imply the extremely strong pricing power / oligopoly position / lack of competition and that the true cost of production is more like 1-10% than 63% of their revenue. Perhaps I should have made this more clear. Anyways, I think this is in fact a major aspect of my true objection; if there were a bunch of small journals of academics competing and universities couldn't afford them, it would be less obvious the first step to take. Hopefully also clear now: I'm not trying to use arguments as soldiers here, but I am trying to quickly summarize the state of things from my perspective, and am not being incredibly careful with all my wordings. E.g. that whole section is not even accurately "why Sci-Hub exists"—Sci-Hub exists because Alexandra believes in open science. But it's trying to gesture at background reasons why Sci-Hub still exists. Many other mistakes like this—if I had a year I would have taken the time to write a better post.

I don't want to defend Elsevier's business practices, but I'm confused by the way some of the numbers are used in the argument against. I understand if the OP is too busy with the legal work right now to respond, but I'd be curious to know whether I'm missing something.

  1. India has 30x the financial issues with the academic publishing oligopoly as the US
  2. The oligopoly has such high pricing power that Elsevier has a 37% operating profit margin

Take both of these numbers at face value -- let's say that Indian researchers face effective costs 30x higher than a... (read more)

What I meant here was not that the problem was a 37% surcharge, it was that the problems were all the ones associated with a 37% OPM oligopoly in science. First, Viliam had the right idea in the comment below—the costs rise to meet the revenue, and much of the "expenses" are going to be useless administrative bloat in a thousand different ways. The non-profit version could be run at about literally 1000x less cost: [] But again, the problem isn't so much the money wasted as the practices implied. To maintain 37% OPM as a large company with bloative force means you have some serious pricing power from lack of competition, which means inefficient monopoly pricing. And beyond the econ 101 case, their attempts at bundling mean even more monopoly inefficiency. I'm not especially clear on what should be done in the academic publishing world as a whole because I haven't been on the ground in the many attempts to change things. But I think most other options involve costs coming down by more like 90% than like 20%.

In many cases, it is being spent in net-negative ways.

For example, many journals have copyeditors that do the paper layout for you, but frequently introduce mistakes in mathematical notation and other subtle issues like that. Result: the author has to proofread the editor's work multiple times before giving their approval. A latex compiler does better work for free. A friend of mine literally spent months going back and forth with a for-profit journal over issues like this.

Why does Elsevier persist in this obviously inefficient way of doing things? because... (read more)

Neel Nanda:
+1, I was pretty surprised and confused by the 37% stat. If basically all of the labour here comes from taxpayer funded science, where on earth is 63% of the revenue going?!

Elsevier journals allow individual authors to make their own published paper "immediately and permanently free for everyone to read and download" for a fee. In the last Elsevier journal I submitted a paper to, the fee was $2,400.

I think this means that a grant conditioned on open-access publishing would just mean that authors will have to pay the fee if they publish in an Elsevier journal -- this makes it more like a tax (paid out of grant money) than a ban. Not sure if that would make it more or less effective, on net, though.

That issue is a good point; I think one variant that gets around it is one focused on pre-prints. As I understand it, some journals allow pre-prints and others don't. This basically fixes the problem for all fields with a pre-print server.

It is definitely impossible, in general, to determine whether a given program is equivalent to a specific Weird Program. This is a consequence of the halting problem itself!

I think the question about "statements I care about" is, at its core, a question about aesthetics and going to be kind of subjective. For example, does the above statement about not being able to prove the equivalence of programs qualify? (Or would it be non-interesting if one of the programs being compared were sufficiently weird?)

Another statement that might or might not qualify is of... (read more)

I would strongly expect CEO-ship of a S&P 500 company to be causally downstream of a top-5 business school, a top-20 college, and the kind of high-status professional+social network you get from "more and better school".

Came here to say this. It doesn't even depend on knowing the other player's value with certainty -- if you shift your submitted price by $1 in your favor, you might give up a trade worth <$0.5 (if the other player's price was between your true value and the new number), and you might improve your price by $0.5 (if a trade happens). Even if you don't know anything for sure, it seems much more likely that a trade happens than the other player's price being in exactly that dollar, so it's good for you to do the price shift.
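A quick Monte Carlo sketch of this argument, under a toy model I'm assuming (not spelled out in the comment): the seller's ask is uniform at random, and a trade happens at the midpoint whenever the prices cross.

```python
import random

# Buyer with true value `value` submits bid `bid`; seller's ask `a` is
# random; if a <= bid they trade at the midpoint (a + bid) / 2.

def buyer_surplus(bid, value, asks):
    total = 0.0
    for a in asks:
        if a <= bid:
            total += value - (a + bid) / 2
    return total / len(asks)

random.seed(0)
value = 100.0
asks = [random.uniform(50, 150) for _ in range(100_000)]

truthful = buyer_surplus(value, value, asks)    # bid your true value
shaded = buyer_surplus(value - 1, value, asks)  # shift $1 in your favor

# Shading wins: a $0.5 price improvement on every retained trade outweighs
# the rare lost trades, each worth under $0.5 of surplus.
assert shaded > truthful
```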

Reasonable beliefs! I feel like we're mostly at a point where our perspectives are mainly separated by mood, and I don't know how to make forward progress from here without more data-crunching than I'm up for at this time.

Thanks for discussing!

The actual algorithm I followed was remembering that habryka posts them and going to his page to find the one he posted most recently. Not sure what the most principled way to find it is, though...

Welcome; glad to have you here!

Just so you know, this is the August 2020 thread, and the August 2021 thread is at -- alternatively, you could wait three days for habryka to post the September 2021 thread, which might see more traffic in the early month than the old thread does at the end of the month.

Horatio Von Becker:
Thanks. I'm also having account troubles, which will hopefully be sorted by then. (How'd you find the August 2021 thread, by the way? Latest I could find was July for some reason.)

I think of the Fama-French thesis as having two mostly-separate claims: (1) correlated factors create under-investment + excess return, and (2) the "right" factors to care about are these three -- oops five -- fundamentally-derived ones.

Like you, I'm pretty skeptical on the way (2) is done by F-F, and I think the practice of hunting for factors could (should) be put on much more principled ground.

It's worth keeping in mind, though, that (1) is not just "these features predict excess returns", but "these features have correlation, and that correlation arrow... (read more)

I think I would agree with you that if you could really find the "right" factors to care about because they capture predictable correlated variance in a sensible way, then we should accept those parts as "explainable". I just find that these FF betas are too unstable and arbitrary for my liking, which is a sentiment you seem to understand. I focus so much on the risk-return paradox because it is such a simple and consistent anomaly. Maybe one day that won't be true anymore, but I'm just more willing to accept that this phenomenon just exists as a quirk of the marketplace than that FF explains "part of it, and the rest looks like inefficiency". FF could just as easily be too bad a way to explain correlated variance to use in any meaningful way.

What do you mean by "demonstrate vaccine effectiveness"? My instinct is that it's going to be ~impossible to prove a causal result in a principled way just from this data. (This is different from how hard it will be to extract Bayesian evidence from the data.)

For intuition, consider the hypothesis that countries can (at some point after February 2020) unlock Blue Science, which decreases cases and deaths by a lot. If the time to develop and deploy Blue Science is sufficiently correlated with the time to develop and deploy vaccines (and the common component... (read more)

Rafael Harth:
I mean something like, "a result that would constitute a sizeable Bayesian update to a perfectly rational but uninformed agent". Think of someone who has never heard much about those vaccine thingies going from 50/50 to 75/25, that range.

Originally the Fama-French model only had 3 fundamental risk factors. If things don't quite work out after the first 3, it seems awfully ad-hoc to just find 2 more and then add them to the back. There also seems to be a belief in academia that getting higher risk adjusted returns through analysis of company fundamentals is more possible than getting them through historical price data.

I'm a bit confused here -- the core Fama-French insight is that if a given segment of the market have a large common correlation, then it'll be under-invested in by investo... (read more)

It seems ad hoc to me because they continue to add "fundamental" factors to their model, instead of accepting that the risk-return paradox just existed. Why accept 5 fundamental factors when you could just accept one technical factor? Suppose that in 20 years we discover that although currently in 2021 we are able to explain the risk-return paradox with 5 factors and transaction costs, the risk-return paradox still exists despite 5 factors in this new out of sample data from the future. What do we do then? Find 2 more factors? Or should we just conclude that the market for the period of time up until then was just not efficient in a weak-form sense?

Related: suggests that:

  1. cold-chain requirements had more margin for error than any of us had thought
  2. the process that produced excess margin for error in this case likely produced excess margin for error in other relevant areas

(tentative) Should this make less plausible a line of reasoning that goes through "Except sometimes vaccines are left at high temperature for too long, the delicate proteins are damaged, and people receiving them are effectively not ... (read more)

That makes sense, and now you mention it, I heard the same about injectable monoclonal antibodies. At first they thought a few hours at room temperature would destroy them, turned out that's probably false.

You can also get a fair number of points for just predicting the community prediction — but you won't get that many because as a question's point value increases (which it does with the number of predictions), more and more of the score is relative rather than absolute.

I think this is actually backwards (the value goes up as the question's point value increases), because the relative score is the component responsible for the "positive regardless of resolution" payoffs. Explanation and worked example here:

Clever, but it hasn't been tried for a good reason. If, say, the next five years of markets are all untethered from reality (but consistent with each other), there's no way to get paid for bringing them into line with expected reality except by putting on the trades and holding them for five years. (The natural one-year trade will just resolve to the unfair market price of the next-year-market market and there's nothing to do about it except wait for longer.)

The chained markets end up being no more fair than if they all settled to the final expiry directly.

Yes, I can imagine cases where this setup wouldn't be enough. Though note that you could still buy the shares the last year. Also, if the market corrects by 10% each year (i.e., a value of a share of yes increases from 10 to 20% to 30% to 40%, etc. each year), it might still be worth it (note that the market would resolve each year to the value of a share, not to 0 or 100). Also note that the current way in which prediction markets are structured is, as you point out, dumb: you bet 5 depreciating dollars which then go into escrow, rather than $5 worth of, say, S&P 500 shares, which increase in value. But this could change.

One can avoid a wealth tax by living in another country.

I don't understand why this is necessarily true. What would stop the US from levying a wealth tax on US persons living abroad?

When will air travel from New York to Hong Kong no longer require arriving passengers to self-quarantine?

It's worse than that. If there weren't any shares available at your broker for you to short-sell in the market, you should consider it likely that instead of paying 0.4%/day, you are just told you have to buy shares to cover your short from assignment. This is an absolutely normal thing that happens sometimes when it's hard to find additional people to lend stock (which is happening now).

(Disclaimer: I am a financial professional, but I'm not a financial advisor, much less yours.)


Strictly speaking, they're not both laptop chargers, but laptop/phone/USB-C chargers. So two of them are useful on the road for charging laptop and phone simultaneously.

Not a physical object, but the Cloud9 IDE (now absorbed into Amazon's AWS suite) for programming work. If your work fits into a terminal plus text editor (which it probably does), then making the actual hardware be a cloud server instead of a laptop that can run out of charge is a big win, and being able to access your "real" machine from different interface machines is sometimes useful.

For the interface laptop itself, I've been very, very happy with a Google Pixelbook (which I got after many, many satisfied years with the original Chromebook Pixel), but that depends on whether you have tasks outside the browser, terminal, or text editor.

Separate travel toiletries from at-home toiletries. (The biggest win is not having to unpack exactly when you get home tired.) Similarly, separate travel phone/laptop chargers from at-home chargers, for same reason.

I haven't yet gone all the way to a separate set of travel clothes, but would like to, one of these years. The 80/20 version is making sure to lay out 1-2 full sets of clothes before going on the long trip.

(For reference, I spent maybe five weeks of 2019 traveling, though naturally 2020 has been much less than that.)

Ahem: a fifth laptop/USB-C charger. (One each for my couch, desk, and bedroom; two stay packed in my travel luggage.)

h/t to Zvi for making this suggestion in Dual Wielding, under the general heading of More Dakka.

I think the advice would be best phrased as, "nth laptop charger," where n is the number of locations you use your laptop regularly. For me, one at home, one at work and one in my bag is sufficient. PS: why do you pack two in your travel luggage? Just in case one gets lost/left behind in a hotel room?

The first two regulations have reference prices for wheat that differ by 50%. How far apart in time were they issued?

They're the same regulation. It has the form "when wheat costs X, farthing bread should weigh Y" for many values of X.

I'm being unnecessarily oblique in the above comment, for which I'm sorry.

What I mean is, in a taxable account, you have the option to donate winners and harvest capital losses on losers. In a post-donation investment vehicle like a DAF, you don't have that optionality. (Compared to a taxable account, your treatment on winners also comes out to no capital gains tax, but your treatment on losers is worse, with no harvesting losses.)

(not tax or investment advice)

It's worth mentioning that this is generally a bad idea in the US tax regime (despite being trivially easy), because the options for handling capital gains and losses differently mean you can sometimes do better with pre-donation investments than with post-donation investments.

(I'm a finance professional, but no one's tax or investment advisor, much less your tax or investment advisor.)

Can you go into more detail about this, or link to an article?

Having lived in New York (but only having visited LA), the difference in city design that is immediately salient to me is the presence/absence of the Subway.

According to MIT health economist Jeffrey Harris, the subways seeded the massive coronavirus epidemic in New York City:

New York City’s multitentacled subway system was a major disseminator – if not the principal transmission vehicle – of coronavirus infection during the initial takeoff of the massive epidemic that became evident throughout the city during March 2020. The near shutoff of subway riders

... (read more)

Higher variance is worth avoiding (under standard assumptions), but I for one was surprised by how little additional variance one takes on by allocating, say, 10% of one's portfolio to a single arbitrary bet. In this comment I ballparked it at maybe an extra 0.5% variance.

That said, allocating one's entire portfolio this way basically requires a rejection of the standard risk-budget assumptions.
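A back-of-envelope version of that variance claim, with volatility and correlation numbers that are my assumptions rather than anything from the comment:

```python
# Variance of a 90% index / 10% single-stock portfolio vs. 100% index,
# using two-asset portfolio variance:
#   Var = w1^2*s1^2 + w2^2*s2^2 + 2*w1*w2*rho*s1*s2

sigma_mkt, sigma_stock, corr = 0.15, 0.40, 0.5  # assumed annual vols

var_index = sigma_mkt ** 2
var_mixed = (0.9 ** 2) * sigma_mkt ** 2 \
          + (0.1 ** 2) * sigma_stock ** 2 \
          + 2 * 0.9 * 0.1 * corr * sigma_mkt * sigma_stock

extra = var_mixed - var_index
print(f"extra variance: {extra:.4f}")  # a few tenths of a percent
```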

(Disclaimer: I'm a financial professional, but I'm not anyone's investment advisor, much less yours.)

I'm not sil ver, but as a casual coronavirus watcher (in part because I live significantly closer to affected areas than most), my instinctive doubts are mostly 1 and 3. What numbers are you using for those to base the claim "ankifying this is probably among the most valuable things you could ever use it for."?

In what sense are you using the word "trilemma"? I'm either not familiar with the usage or missing a big message of the post.

(The common definition of "trilemma" I'm most familiar with presents three desiderata, of which it's possible to achieve at most two.)

However, the post assumes that 1) there is (or should be) one correct answer, 2) which is of the form: (1, 0, 0, 0) or a permutation thereof, and 3) the material is independent of the system (does not include probability, for example).

These are assumed for the sake of explanation, but none are necessary; in fact, the scoring rule and analysis go through verbatim if you have questions with multiple answers in the form of arbitrary vectors of numbers, even if they have randomness. The correct choice is still to guess, for each potential answer, your expectation of that answer's realized result.
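A toy check (my own example, not from the post) that guessing the componentwise expectation is optimal under a quadratic score, even with multiple correct answers and randomness:

```python
# Quadratic score for vector answers: -sum((guess_i - outcome_i)**2).
# Expected score is maximized by the componentwise expectation.

def expected_score(guess, scenarios):
    """scenarios: list of (probability, outcome_vector) pairs."""
    return sum(p * -sum((g - o) ** 2 for g, o in zip(guess, out))
               for p, out in scenarios)

# A question whose answer vector is random: two equiprobable scenarios.
scenarios = [(0.5, [1, 0, 0]), (0.5, [0, 1, 1])]
expectation = [0.5, 0.5, 0.5]

# The expectation beats committing to either scenario, and nearby guesses:
assert expected_score(expectation, scenarios) > expected_score([1, 0, 0], scenarios)
assert expected_score(expectation, scenarios) > expected_score([0.6, 0.4, 0.5], scenarios)
```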
