Thanks, really appreciate the references!
If there were a feasible way to make the algorithm open, I think that would be good (of course FB would probably strongly oppose this). As you say, people wouldn't directly design / early adopt new algorithms, but once early adopters found an alternative algorithm that they really liked, word of mouth would lead many more people to adopt it. So I think you could eventually get widespread change this way.
Thanks for the feedback!
I haven't really dug into Gelman's blog, but the format you mention is a perfect example of the skill of understanding research. It's a very important skill, but not the same as actually conducting the research that goes into a paper.
Research consists of many skills put together. Understanding prior work and developing the taste to judge it is one of the more important individual skills in research (more so than programming, at least in most fields). So I think the blog example is indeed a central one.
In research, especially in
Thanks, sounds good to me!
Actually, another issue is that unsupervised translation isn't "that hard" relative to supervised translation--I think that you can get pretty far with simple heuristics, such that I'd guess making the model 10x bigger matters more than making the objective more aligned with getting the answer right (and that this will be true for at least a couple more 10x-ing of model size, although at some point the objective will matter more).
This might not matter as much if you're actually outputting explanations and not just translating from one language to another. Although it is probably true that for tasks that are far away from the ceiling, "naive objective + 10x larger model" will outperform "correct objective".
Thanks Paul, I generally like this idea.
Aside from the potential concerns you bring up, here is the most likely way I could see this experiment failing to be informative: rather than having checks and question marks in your tables above, really the model's ability to solve each task is a question of degree--each table entry will be a real number between 0 and 1. For, say, tone, GPT-3 probably doesn't have a perfect model of tone, and would get <100% performance on a sentiment classification task, especially if done few-shot.
The issue, then, is that the ... (read more)
This doesn't seem so relevant to capybaralet's case, given that he was choosing whether to accept an academic offer that was already extended to him.
I think if you account for undertesting, then I'd guess 30% or more of the UK was infected during the previous peak, which should reduce R by more than 30% (the people most likely to be infected are also most likely to spread further), and that is already enough to explain the drop.
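As a rough sketch of why uneven infection matters (the group sizes and infection rates below are made-up for illustration, not from any dataset):

```python
# Toy two-group model with assumed numbers, illustrating why 30% of the
# population being infected can cut R by more than 30%: the people most
# likely to have been infected are also the ones who transmit the most.
groups = [
    # (population fraction, relative transmission weight, fraction already infected)
    (0.5, 3.0, 0.50),  # high-contact half: assumed 50% already infected
    (0.5, 1.0, 0.10),  # low-contact half: assumed 10% already infected
]

pop_infected = sum(frac * inf for frac, _, inf in groups)
total_weight = sum(frac * w for frac, w, _ in groups)
remaining_weight = sum(frac * w * (1 - inf) for frac, w, inf in groups)
r_reduction = 1 - remaining_weight / total_weight

print(f"infected: {pop_infected:.0%}, R reduction: {r_reduction:.0%}")
# → infected: 30%, R reduction: 40%
```

With these made-up numbers, 30% population immunity yields a 40% reduction in onward transmission.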
I wasn't sure what you meant by more dakka, but do you mean just increasing the dose? I don't see why that would necessarily work--e.g. if the peptide just isn't effective.
I'm confused because we seem to be getting pretty different numbers. I asked another bio friend (who is into DIY stuff) and they also seemed pretty skeptical, and Sarah Constantin seems to be as well: https://twitter.com/s_r_constantin/status/1357652836079837189.
Not disbelieving your account, just noting that we seem to be getting pretty different outputs from the expert-checking process... (read more)
Ah got it, thanks!
Have you run this by a trusted bio expert? When I did this test (picking a bio person who I know personally, who I think of as open-minded and fairly smart), they thought that this vaccine is pretty unlikely to be effective and that the risks in this article may be understated (e.g. food grade is lower-quality than lab grade, and it's not obvious that inhaling food is completely safe). I don't know enough biology to evaluate their argument, beyond my respect for them.
I'd be curious if the author, or others who are considering trying this, have applied this... (read more)
In my case, yes. My bio expert indicated that it was likely to be effective (more than 50%, but less than 90%) and that the risks were effectively zero in terms of serious complications.
Regarding the food grade versus lab grade question, as well as inaccuracies or mistakes in construction of the vaccine, this was a question I spent a reasonable amount of time on. The TL;DR is that the engineering tolerances are incredibly wide; the molecular weight of the chitosan isn't that important, the mixing rate isn't that important other than it be fast ... (read more)
I don't think I was debating the norms, but clarifying how they apply in this case. Most of my comment was a reaction to the "pretty important" and "timeless life lessons", which would apply to Raemon's comment whether or not he was a moderator.
Often, e.g. Stanford profs claiming that COVID is less deadly than the flu for a recent and related example.
Hmm, important as in "important to discuss", or "important to hear about"?
My best guess based on talking to a smart open-minded biologist is that this vaccine probably doesn't work, and that the author understates the risks involved. I'm interpreting the decision to frontpage as saying that you think I'm wrong with reasonably high confidence, but I'm not sure if I should interpret it that way.
You should make a top-level comment about this. Chance that the vaccine works and the associated risks are object-level questions well-worth discussing.
In general, frontpage decisions are not endorsements (though I don't know Raemon's thoughts in this particular case), and this comment section is not the place for a debate about frontpaging norms. This is definitely the place to talk about chance the vaccine works and associated risks, though.
That seems irrelevant to my claim that Zvi's favored policy is worse than the status quo.
This isn't based on personal anecdote; studies that try to estimate this come up with 3x. See e.g. the MicroCovid page: https://www.microcovid.org/paper/6-person-risk
You may well be right. I guess we don't really know what the sampling bias is (it would have to be pretty strongly skewed towards incoming UK cases though to get to a majority, since the UK itself was near 50%).
See here: https://cov-lineages.org/global_report.html
I don't think it's correct to say that it remains stable at 0.5-1% of samples in Denmark. There were 13 samples of the new variant last week, vs. only 3 two weeks ago, if I understood the data correctly. If it went from 0.5% to 1% in a week then you should be alarmed. (3 and 13 are both small enough that it's hard to compute a growth rate, but it certainly seems consistent with the UK data to me.) I think better evidence against non-infectiousness would be Italy and Israel, where the variant seems to be dominant but there isn't runaway growth. But:... (read more)
Zvi, I still think that your model of vaccination ordering is wrong, and that the best read of the data is that frontline essential workers should be very highly prioritized from a DALY / deaths averted perspective. I left this comment on the last thread that explains my reasoning in detail, looking at both of the published papers I've seen that model vaccine ordering: link. I'd be happy to elaborate on it but I haven't yet seen anyone provide any disagreement.
More minor, but regarding rehab facilities, from a bureaucratic perspective they are "congregate ... (read more)
Zvi, I agree with you that the CDC's reasoning was pretty sketchy, but I think their actual recommendation is correct while everyone else (e.g. the UK) is wrong. I think the order should be something like:
Nursing homes -> HCWs -> 80+ -> frontline essential workers -> ...
(Possibly switching the order of HCWs and 80+.)
The public analyses saying that we should start with the elderly are these two papers:
Notably, both p... (read more)
Mo Bamba (NBA) and Cody Garbrandt (UFC) are both pro athletes who are still out of commission months later. I found this looking for NBA information, and only about 50 NBA players have gotten Covid, so this suggests at least 2% chance of pretty bad long term symptoms.
I think that the right level of effort leaves you tired but warm inside, like you look forward to doing this again, rather than just feeling you HAVE to do this again.
This is probably true in a practical sense (otherwise you won't sustain it as a habit), but I'm not sure it describes a well-defined level of effort. For me an extreme effort could still lead to me looking forward to it, if I have a concrete sense of what that effort bought me (maybe I do some tedious and exhausting footwork drills, but I understand the sense in which this will carr... (read more)
If most workouts are painful, then I agree you are probably overtraining. But if no workouts at all are painful, you're probably missing opportunities to improve. And many workouts should at least be uncomfortable in parts. E.g. when lifting, during the last couple of deadlift sets I often feel incredibly gassed and don't feel like doing another one. But this can be true even when I'm far from my limits (like, a month later I'll be lifting 30 pounds more and feel about as tired, rather than failing the lift).
My guess is that on average 1-2 work... (read more)
You could look at papers published on medrxiv rather than news articles, which would resolve the clickbait issue, though you'd still have to assess the study quality.
Have you tried googling for them yourself and been unable to find them? (Sorry that I'm too lazy to re-look them up myself, but given that LW is mostly leisure for me I don't feel like doing it, and I'd be somewhat surprised if you googled for stuff and didn't find it.)
I also think you are probably overestimating vaccine risks (the main risk is that its effectiveness wanes, and that it interferes with future antibody responses from similar vaccines; not that you'll get horrible side effects) but that isn't necessary to explain why people want the vaccine now.
I think cutting the IFR by a factor of 25 on the basis of one study is a mistake; the chance of the study being fatally flawed is greater than 1 in 25. On the other hand, 0.5% is the overall CFR and would be lower for young people.
I think it's hard to cut the risk of long-term effects by more than a factor of 10 from published estimates. Note there is evidence of long-term effects contrary to your claim, e.g. studies that do 6-week follow-ups and find people still with some symptom. This isn't 6 months but is still surprisingly long and should shift our belief about 6 months a... (read more)
I noticed the prudishness, but "rudeness" to me parses as people actually telling you what's on their mind, rather than the passive-aggressive fake niceness that seems to dominate in the Bay Area. I'll personally take the rudeness :).
On the other hand, the second-best place selects for people who don't care strongly about optimizing for legible signals, which is probably a plus. (An instance of this: In undergrad the dorm that, in my opinion, had the best culture was the run-down dorm that was far from campus.)
Many of the factors affecting number of deaths are beyond a place's control, such as how early on the pandemic spread to that place, and how densely populated the city is. I don't have a strong opinion about MA but measuring by deaths per capita isn't a good way of judging the response.
That's not really what a p-value means though, right? The actual replication rate should depend on the prior and the power of the studies.
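A toy Bayes calculation of what I mean, with purely hypothetical numbers for the prior and power:

```python
# The replication rate of p < 0.05 findings depends on the base rate of true
# hypotheses (prior) and statistical power, not just the significance threshold.
prior = 0.1   # assumed fraction of tested hypotheses that are actually true
power = 0.8   # assumed P(p < 0.05 | real effect)
alpha = 0.05  # P(p < 0.05 | null effect)

# P(effect is real | original study was significant), by Bayes' rule
p_real = prior * power / (prior * power + (1 - prior) * alpha)

# A same-powered replication then succeeds with probability:
replication_rate = p_real * power + (1 - p_real) * alpha

print(f"P(real | significant) = {p_real:.2f}, replication rate = {replication_rate:.2f}")
# → P(real | significant) = 0.64, replication rate = 0.53
```

So even with decent power, a significant result can be expected to replicate only about half the time under these assumptions.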
What are some of the recommendations that seem most off base to you?
My prediction: infections will either go down or only slowly rise in most places, with the exception of one or two metropolitan areas. If I had to pick one it would be LA, not sure what the second one will be. The places where people are currently talking about spikes won't have much correlation with the places that look bad two weeks from now (i.e. people are mostly chasing noise).
I'm not highly confident in this, but it's been a pretty reliable prediction for the past month at least...
Here is a study that a colleague recommends: https://www.medrxiv.org/content/10.1101/2020.05.03.20089854v3. Tweet version: https://mobile.twitter.com/gidmk/status/1270171589170966529?s=21
Their point estimate is 0.64% but with likely heterogeneity across settings.
I don't think bubble size is the right thing to measure; instead you should measure the amount of contact you have with people, weighted by time, distance, indoor/outdoor, mask-wearing, and how likely the other person is to be infected (i.e. how careful they are).
An important part of my mental model is that infection risk is roughly linear in contact time.
As a background assumption, I'm focused on the societal costs of getting infected, rather than the personal costs, since in most places the latter seem negligible unless you have pre-existing health conditions. I think this is also the right lens through which to evaluate Alameda's policy, although I'll discuss the personal calculation at the end.
From a social perspective, I think it's quite clear that the average person is far from being effectively isolated, since R is around 0.9 and you can only get to around half of that via household infection alone.
I think the biggest issue with the bubble rule is that the math doesn't work out. The secondary attack rate between house members is ~30% and probably much lower between other contacts. At that low a rate, these games with the graph structure buy very little and may be harmful because they increase the fraction of contact occurring between similar people (which is bad because the social cost of a pair of people interacting is roughly the product of their infection risks).
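A toy illustration of that last point, with made-up risk numbers: if the cost of a pair interacting is roughly the product of their infection risks, then assortative matching (careful with careful, risky with risky) is worse in total than mixing.

```python
# Hypothetical per-person infection risks: two careful people, two risky ones.
risks = [0.01, 0.01, 0.20, 0.20]

# Cost of a pairing ~ product of the pair's infection risks (toy assumption).
assortative = risks[0] * risks[1] + risks[2] * risks[3]  # careful+careful, risky+risky
mixed = risks[0] * risks[2] + risks[1] * risks[3]        # each careful person paired with a risky one

print(assortative, mixed)  # the assortative pairing costs ~10x more here
```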
I'm not trying to intimidate; I'm trying to point out that I think you're making errors that could be corrected by more research, which I hoped would be helpful. I've provided one link (which took me some time to dig up). If you don't find this useful that's fine, you're not obligated to believe me and I'm not obligated to turn a LW comment into a lit review.
The CFR will shift substantially over time and location as testing changes. I'm not sure how you would reliably use this information. IFR should not change much and tells you how bad it is for you personally to get sick.
I wouldn't call the model Zvi links expert-promoted. Every expert I talked to thought it had problems, and the people behind it are economists, not epidemiologists or statisticians.
For IFR you can start with seroprevalence data here and then work back from death rates: https://twitter.com/ScottGottliebMD/status/1268191059009581056
R... (read more)
Ben, I think you're failing to account for under-testing. You're computing the case fatality rate when you want the infection fatality rate. Most experts, as well as the well-done meta-analyses, place the IFR in the 0.5%-1% range. I'm a little bit confused why you're relying on this back-of-the-envelope estimate rather than the pretty extensive body of work on this question.
I don't understand why this is evidence that "EA Funds (other than the global health and development one) currently funges heavily with GiveWell recommended charities", which was Howie's original question. It seems like evidence that donations to OpenPhil (which afaik cannot be made by individual donors) funge against donations to the long-term future EA fund.
I like the general thrust here, although I have a different version of this idea, which I would call "minimizing philosophical pre-commitments". For instance, there is a great deal of debate about whether Bayesian probability is a reasonable philosophical foundation for statistical reasoning. It seems that it would be better, all else equal, for approaches to AI alignment to not hinge on being on the right side of this debate.
I think there are some places where it is hard to avoid pre-commitments. For instance, while this isn't quite a philo... (read more)
FWIW I understood Zvi's comment, but feel like I might not have understood it if I hadn't played Magic: The Gathering in the past.
EDIT: Although I don't understand the link to Sir Arthur's green knight, unless it was a reference to the fact that M:tG doesn't actually have a green knight card.
Thanks for writing this Aaron! (And for engaging with some of the common arguments for/against AI safety work.)
I personally am very uncertain about whether to expect a singularity/fast take-off (I think it is plausible but far from certain). Some reasons that I am still very interested in AI safety are the following:
Very minor nitpick, but just to add, FLI is as far as I know not formally affiliated with MIT. (FHI is in fact a formal institute at Oxford.)
I enjoy reading your posts because they often consist of clear explanations of concepts I wish more people appreciated. But I think this is the first instance where I feel I got something that I actually hadn't thought about before at all, so I wanted to convey extra appreciation for writing it up.
I think the conflation is "decades out" and "far away".
Galfour was specifically asked to write his thought up in this thread: https://www.lesserwrong.com/posts/BEtzRE2M5m9YEAQpX/there-s-no-fire-alarm-for-artificial-general-intelligence/kAywLDdLrNsCvXztL
It seems either this was posted to the wrong place, or there is some disagreement within the community (e.g. between Ben in that thread and the people downvoting).
Points 1-5 at the beginning of the post are all primarily about community-building and personal development externalities of the project, and not about the donation itself.