Anders_H

I found the original website for Prof. Lipsitch's "Cambridge Working Group" from 2014 at http://www.cambridgeworkinggroup.org/. While the website does not focus exclusively on gain-of-function research, this was certainly a recurring theme in his public talks on the subject.

The list of signatories (which I believe has not been updated since 2016) includes several members of our community (apologies to anyone who I have missed):

- Toby Ord, Oxford University
- Sean O hEigeartaigh, University of Oxford
- Daniel Dewey, University of Oxford
- Anders Sandberg, Oxford University
- Anders Huitfeldt, Harvard T.H. Chan School of Public Health
- Viktoriya Krakovna, Harvard University PhD student
- Dr. Roman V. Yampolskiy, University of Louisville
- David Manheim, 1DaySooner

Interestingly, there was an opposing group arguing in favor of this kind of research, at http://www.scientistsforscience.org/. I do not recognize a single name on their list of signatories.

Here is a video of Prof. Lipsitch at EA Global Boston in 2017. I haven't watched it yet, but I would expect him to discuss gain-of-function research: https://forum.effectivealtruism.org/posts/oKwg3Zs5DPDFXvSKC/marc-lipsitch-preventing-catastrophic-risks-by-mitigating

Here is a data point not directly relevant to Less Wrong, but perhaps to the broader rationality community:

Around this time, Marc Lipsitch organized a website and an open letter warning publicly about the dangers of gain-of-function research. I was a doctoral student at HSPH at the time, and shared this information with a few rationalist-aligned organizations. I remember making an offer to introduce them to Prof. Lipsitch, so that maybe he could give a talk. I got the impression that the Future of Life Institute had some communication with him, and I see from their 2015 newsletter that there is some discussion of his work, but I am not sure if anything more concrete came out of this.

My impression was that while they considered this important, this was more of a catastrophic risk than an existential risk, and therefore outside their core mission.

This comment touches on the central tension between the current paradigm in medicine, i.e. "evidence-based medicine", and an alternative and intuitively appealing approach based on a biological understanding of the mechanism of disease.

In evidence-based medicine, decisions are based on statistical analysis of randomized trials; what matters is whether we can be confident that the medication probabilistically has improved outcomes when tested on humans as a unit. We don't really care too much about the mechanism behind the causal effect, just whether we can be sure it is real.

The exaggerated strawman alternative approach to EBM would be Star Trek medicine, where the ship's doctor can reliably scan an alien's biology, determine which molecule is needed to correct the pathology, synthesize that molecule and administer it as treatment.

If we have a complete understanding of what Nancy Cartwright calls "the nomological machine", Star Trek medicine should work in theory. However, you are going to need a very complete, accurate and detailed map of the human body to make it work. Given the complexity of the human body, I think we are very far from being able to do this in practice.

There have been many cases in recent history where doctors believed they understood biology well enough to predict the consequences, yet were proved wrong by randomized trials. See for example Vinay Prasad's book "Ending Medical Reversal".

My personal view is that we are very far from being able to ground clinical decisions in mechanistic knowledge instead of randomized trials. Trying to do so would probably be dangerous given the current state of biological understanding. However, we can probably improve on naive evidence-based medicine by carving out a role for mechanistic knowledge to complement data analysis. Mechanisms seem particularly important for reasoning correctly about extrapolation; the purpose of my research program is to clarify one way such mechanisms can be used. It doesn't always work perfectly, but I am not aware of any examples where an alternative approach works better.

Thank you so much for writing this! Yes, this is mostly an accurate summary of my views (although I would certainly phrase some things differently). I just want to point out two minor disagreements:

- I don't think the problem is that doctors are too rushed to do a proper job. I think the patient-specific data that you would need is in many cases theoretically unobservable, or at least that we would need a much more complete understanding of biological mechanisms in order to know what to test the patients for in order to make a truly individualized decision. At least for the foreseeable future, I think it will be impossible for doctors to determine which patients will benefit on an individual level. They will be constrained to using the patient's observables to put them in a reference group, and then using that reference group to predict risk based on observations from other patients in the same reference group.
- I am not entirely convinced that the Pearlian approach is the most natural way to handle this. In the manuscript, I use "modern causal models" as a more general term that also includes other types of counterfactual causal models. Of course, all these models are basically isomorphic, and Cinelli/Pearl did show in response to my last paper that it is possible to do the same thing using DAGs. I am just not at all convinced that the easiest way to capture the relevant intuition is to use Pearl's graphical representation of the causal models.

You are correct that someone who has one allergy may be more likely to have another allergy, and that this violates the assumptions of our model. Our model relies on a strong independence assumption, and there are many realistic cases where this independence assumption will not hold. I also agree that the video uses an example where the assumption may not hold. The video is oversimplified on purpose, in an attempt to get people interested enough to read the arXiv preprint.

If there is a small correlation between baseline risk and effect of treatment, this will have a negligible impact on the analysis. If there is a moderate correlation, you will probably be able to bound the true treatment effect using partial identification methods. If there is strong correlation, this may invalidate the analysis completely.

The point we are making is not that the model will always hold exactly. Any model is an approximation. Let's suppose we have three choices:

- Use a template for a causal model that "counts the living", think about all the possible biological reasons that this model could go wrong, represent them in the model if possible, and account for them as best you can in the analysis
- Use a template for a causal model that "counts the dead", think about all the possible biological reasons that this model could go wrong, represent them in the model if possible, and account for them as best you can in the analysis
- Use a model that is invariant to whether you count the living or the dead. This cannot be based on a multiplicative (relative risk) parameter.

The third approach will not be sensitive to the particular problems that I am discussing, but all the suggested methods of this type have their own problems. As I have written earlier, my view is that these problems are more troubling than the problems with the relative risk models.
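To make the distinction between the multiplicative models and an invariant model concrete, here is a minimal sketch. The study risks (0.2 untreated, 0.1 treated), the patient's baseline risk (0.6) and all function names are made up for illustration, and the odds ratio is used only as one familiar example of a parameter that does not depend on how the outcome is coded; it is not necessarily the invariant method referred to above.

```python
import math

def risk_ratio_prediction(p0, p1, baseline):
    """Extrapolate by multiplying the patient's baseline risk by the study risk ratio."""
    return baseline * (p1 / p0)

def odds_ratio_prediction(p0, p1, baseline):
    """Extrapolate by multiplying the patient's baseline odds by the study odds ratio."""
    odds = lambda p: p / (1 - p)
    predicted_odds = odds(baseline) * odds(p1) / odds(p0)
    return predicted_odds / (1 + predicted_odds)

def is_invariant_to_outcome_coding(predict, p0, p1, baseline):
    """Check whether counting the dead or counting the living gives the same prediction."""
    direct = predict(p0, p1, baseline)
    # Recode the outcome as its complement, predict, then translate back.
    recoded = 1 - predict(1 - p0, 1 - p1, 1 - baseline)
    return math.isclose(direct, recoded)

# Hypothetical numbers: study risk 0.2 untreated, 0.1 treated; patient baseline 0.6.
print(is_invariant_to_outcome_coding(risk_ratio_prediction, 0.2, 0.1, 0.6))  # False
print(is_invariant_to_outcome_coding(odds_ratio_prediction, 0.2, 0.1, 0.6))  # True
```

With these numbers the relative risk model predicts 0.30 when counting the dead and 0.55 when counting the living, while the odds ratio model predicts 0.40 either way.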

What we are arguing in this preprint is that if you decide to go with a relative risk model, you should choose between (1) and (2) based on the principles suggested by Sheps, and then reason about problems with this model and how they can be addressed in the analysis, based on the principles that you have correctly outlined in your comment.

I can assure you that if you decide to go with a multiplicative model but choose the wrong "base case", then all of the problems you have discussed in your comments will be orders of magnitude more difficult to deal with in any meaningful way. In other words, it is only after you make the choice recommended by Sheps that it even becomes possible to meaningfully analyze the reasons for deviation from effect homogeneity...

I very emphatically disagree with this.

You are right that once you have a prediction for risk if untreated, and a prediction for risk if treated, you just need a cost/benefit analysis. However, you won't get to that stage without a paradigm for extrapolation, whether implicit or explicit. I prefer making that paradigm explicit.

If you want to plug in raw experimental data, you are going to need data from people who are exactly like the patient in every way. Then, you will be relying on a paradigm for extrapolation which claims that the *conditional counterfactual risks* (rather than the magnitude of the effect) can be extrapolated from the study to the patient. It is a different paradigm, and one that can only be justified if the conditioning set includes every cause of the outcome.

In my view, this is completely unrealistic. I prefer a paradigm for extrapolation that aims to extrapolate the scale-specific magnitude of the effect. If this is the goal, our conditioning set only needs to include those covariates that predict the magnitude of the effect of treatment, which is a small subset of all covariates that cause the outcome.

On this specific point, my view is consistent with almost all thinking in medical statistics, with the exception of some very recent work in causal modeling (whose authors prefer the approach based on counterfactual risks). My disagreement with this work in causal modeling is at the core of my last discussion about this on Less Wrong. See for example "Effect Heterogeneity and External Validity in Medicine" and the European Journal of Epidemiology paper that it links to.

Suppose you summarize the effect of a drug using a relative risk (a multiplicative effect parameter relating the probability of the event if treated to the probability of the event if untreated), and consider this multiplicative parameter to represent the "magnitude of the effect".

The natural thing for a clinician to do will be to assume that the magnitude of the effect is the same in their own patients. They will therefore rely on this specific scale for extrapolation from the study to their patients. However, those patients may have a different risk profile.

When clinicians do this, they will make different predictions depending on whether the relative risk is based on the probability of the event, or the probability of the complement of the event.

Sheps' solution to this problem is the same as mine: If the intervention results in a decrease in the risk of the outcome, you should use the probability of the event to construct the relative risk, whereas if the intervention increases the risk of the event, you should use the probability of the complement of the event.
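A minimal sketch of this rule, under stated assumptions: the function name `extrapolate_sheps`, the study risks and the patient's baseline risk are all hypothetical, and the code only illustrates the choice of scale, not the full analysis in the preprint.

```python
def extrapolate_sheps(p0_study, p1_study, baseline):
    """Predict a patient's risk under treatment from study risks.

    p0_study: risk of the event in the untreated study arm
    p1_study: risk of the event in the treated study arm
    baseline: the patient's own risk if untreated
    """
    if p1_study <= p0_study:
        # Treatment reduces risk: use the ratio of event probabilities.
        return baseline * (p1_study / p0_study)
    # Treatment increases risk: use the ratio of complement probabilities.
    survival_ratio = (1 - p1_study) / (1 - p0_study)
    return 1 - (1 - baseline) * survival_ratio

# Hypothetical harmful exposure: study risk rises from 0.1 to 0.2.
baseline = 0.6
naive = baseline * (0.2 / 0.1)                  # event-based risk ratio of 2
chosen = extrapolate_sheps(0.1, 0.2, baseline)  # complement-based ratio

print(f"event-based ratio:      {naive:.3f}")   # 1.200, not a valid probability
print(f"complement-based ratio: {chosen:.3f}")  # 0.644
```

In this made-up example, carrying the event-based ratio over to a high-risk patient yields a "risk" above 1, whereas the complement-based ratio stays within the unit interval.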

No. This is **not** about interpretation of probabilities. It is about choosing what aspect of reality to rely on for extrapolation. You will get different extrapolations depending on whether you rely on a risk ratio, a risk difference or an odds ratio. This will lead to real differences in predictions for what happens under intervention.
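As a small illustration of how the choice of scale changes the prediction, the sketch below takes one set of made-up study results and extrapolates to a patient with a different baseline risk on each of the three scales. The numbers and the function name `extrapolate` are assumptions for illustration only.

```python
def extrapolate(p0_study, p1_study, baseline):
    """Predicted risk under treatment for a patient with the given baseline risk,
    assuming in turn that each effect measure carries over from the study."""
    odds = lambda p: p / (1 - p)
    risk_ratio = p1_study / p0_study
    risk_difference = p1_study - p0_study
    odds_ratio = odds(p1_study) / odds(p0_study)
    patient_odds = odds(baseline) * odds_ratio
    return {
        "risk ratio": baseline * risk_ratio,
        "risk difference": baseline + risk_difference,
        "odds ratio": patient_odds / (1 + patient_odds),
    }

# Hypothetical study: risk 0.2 if untreated, 0.1 if treated; patient baseline 0.6.
for scale, risk in extrapolate(0.2, 0.1, 0.6).items():
    print(f"{scale}: {risk:.2f}")
# risk ratio: 0.30
# risk difference: 0.50
# odds ratio: 0.40
```

The study data are identical in all three cases; only the quantity assumed to be stable between the study and the patient differs.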

Even if clinical decisions are entirely left to an algorithm, the algorithm will need to select a mathematical object to rely on for extrapolation. The person who writes the algorithm needs to tell the algorithm what to use, and the answer to that question is contested. This paper contributes to that discussion, and proposes a concrete solution. One that has been known for 65 years, but never used in practice.

I don't think the existence of lawlike phenomena is controversial, at least not on this forum. Otherwise, how do you account for the remarkable patterns in our observations? Of course, it is not possible to determine what those phenomena are, but I don't think my solution requires this. It just requires that our sensory algorithm responds the same way every time.