Reply to a fertility doctor concerning polygenic embryo screening

GeneSmith

New LessWrong user grll_nrg, a fertility doctor, left a comment on my post about how to have polygenically screened children that brought up many of the common objections to polygenic embryo screening. I've heard these concerns raised many times at conferences and in talks by professionals in the fertility industry. I thought other people might be interested in the discussion, so I decided to make a stand-alone post.

Here's grll_nrg's original comment:

Great post. Thank you. Fertility doctor here and a supporter of ART (assisted reproductive technologies) in general. A few thoughts (although you touched on a few of these below, worth emphasizing in my opinion):
PGT-P has not been validated yet, which may take decades to do, if ever.
The science in terms of GWAS isn't quite there yet IMHO - we don't know all the genes that are important for most traits and we may be inadvertently selecting against some desirable traits, for example.
Comparing clinic success rates using CDC data is imperfect because of different patient characteristics, patient selection, and reporting bias.
IVF pregnancies carry a significantly higher complication rate (hypertensive disorders, preterm birth, placental abnormalities, etc.) compared to spontaneous pregnancies - unclear if this is due to IVF or underlying infertility diagnosis.
The risk-benefit calculus of PGT-P is going to be different for a couple who already needs to do IVF anyway to have a baby (low additional risk/cost) compared to a couple doing IVF just so that they can do PGT-P (higher additional risk/cost).
IVF is notoriously inefficient at present. Depending on female partner age, each cycle may yield only very few embryos making the benefit and utility of PGT-P limited. It may not be practical, safe, or financially feasible to do multiple cycles of IVF to increase the cohort of transferable embryos.
IVF is expensive and often not covered by insurance which creates access disparities. PGT-P would exacerbate these disparities in access. This is not unique to IVF I realize.
Slippery-slope eugenics and discrimination are real ethical concerns that would need to be mitigated.
In-vitro gametogenesis (IVG) would be a game-changer. The utility of PGT-P would be greatly enhanced if suddenly you had thousands of eggs and hundreds of embryos to select from.

Thanks for the reply. I'm glad professionals from the ART field are reading this.

PGT-P has not been validated yet, which may take decades to do, if ever.

I think the ART field should probably reconsider what it considers acceptable evidence of "validation". In my mind, the question of "Has PGT-P been validated" should hinge on whether or not we can be confident that embryos selected via polygenic scores will display different trait values than those selected at random. For example, we want to know whether an embryo that has a low polygenic risk score for hypertension will indeed go on to develop hypertension at a lower rate.

All the people I've heard criticize PGT-P seem to think that the ONLY way to do this is with some kind of randomized control where some embryos with certain polygenic risk scores are implanted and others are not, and we then wait 20-70 years to see whether or not the polygenically screened group develops diseases at a different rate than the control group. I think this view is incorrect and is only taken because people are blindly applying traditional validation methodology to PGT-P without asking whether it is necessary.

Genes have a very special property that gives us a huge advantage over researchers trying to test whether or not a medication works; nature has already conducted a randomized control trial for us!

From the section in the post titled “correlation or causation”:

"OK", you might say. "That's well and good, but how do we know that these genetic differences are actually CAUSING someone to be taller or smarter rather than just spuriously correlated with height?"

The main reason this is possible is because nature has already conducted a randomized control trial on our behalf. Every time your body produces a sperm or egg cell, your DNA is more or less randomly mixed up and half of it is given to the reproductive cell. This means that, conditional on parental genomes, sibling genomes are randomized!
In turn, this means that if a gene can predict differences between siblings, you can be quite confident that it is in fact CAUSING the difference. This is actually quite a remarkable fact, and one that underpins the entire reason for believing embryo selection should work.
There is one asterisk here; though a sibling GWAS can tell you where the causal variant is, it usually can only narrow down the list of candidates to perhaps 10 distinct variants within a region of very roughly 100,000 base pairs. This is sufficient for embryo selection because that set of 10 variants will almost always be inherited together. But if sometime down the line we want to do embryo editing, it will require us to either pinpoint the causal variant precisely or to edit all 10 variants that have a decent chance of causing the observed change.
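To make the "randomized trial" point concrete, here is a minimal simulation sketch. Everything in it is a made-up toy setup rather than anything from the studies cited: two hypothetical subpopulations differ in both allele frequency and environment, and the variant has no causal effect at all. A population-level regression still finds an association, while a regression on sibling differences does not.

```python
import random

def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var

def simulate(n_families=20000, seed=0):
    """Return (population slope, sibling-difference slope) for a non-causal variant."""
    rng = random.Random(seed)
    pop_g, pop_y, diff_g, diff_y = [], [], [], []
    for _ in range(n_families):
        # Assumption: two subpopulations that differ in allele frequency AND environment.
        if rng.random() < 0.5:
            freq, env_mean = 0.2, 0.0
        else:
            freq, env_mean = 0.8, 1.0
        mother = [int(rng.random() < freq) for _ in range(2)]
        father = [int(rng.random() < freq) for _ in range(2)]
        sibs = []
        for _ in range(2):
            # Each child gets one randomly chosen allele from each parent.
            genotype = rng.choice(mother) + rng.choice(father)
            # The trait depends only on environment plus noise, not on the genotype.
            trait = env_mean + rng.gauss(0.0, 1.0)
            sibs.append((genotype, trait))
        pop_g.append(sibs[0][0])
        pop_y.append(sibs[0][1])
        diff_g.append(sibs[0][0] - sibs[1][0])
        diff_y.append(sibs[0][1] - sibs[1][1])
    return slope(pop_g, pop_y), slope(diff_g, diff_y)

population_slope, sibling_slope = simulate()
print(f"population-level slope: {population_slope:.3f}")
print(f"sibling-difference slope: {sibling_slope:.3f}")
```

The population-level slope comes out clearly positive even though the variant does nothing, while the sibling-difference slope stays near zero, because siblings share the same parents and (in this toy model) the same environment.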

Unsurprisingly, the folks over at Genomic Prediction have already conducted this analysis. And they aren’t the only ones; Lencz et al also conducted an analysis showing large reductions in Crohn’s Disease and schizophrenia from embryo selection (though it should be noted they used simulated offspring rather than real ones). So we know with very high confidence that polygenic embryo screening does actually decrease disease risk.

I’ve brought this point up over and over again in conversations about validation of PGT-P and yet I have never seen it addressed by those who claim PGT-P hasn’t been validated. If your predictor works in siblings IT HAS ALREADY UNDERGONE A HUGE RANDOMIZED CONTROL TRIAL.

Now there are some other issues to address; changes in the environment, for example, might break the link between a subset of the genes and the phenotype we measure. For example, it’s not too hard to imagine a gene that increases one’s height, but only if adequate calories are consumed. We might expect such a gene to show a significant effect on height in South Koreans, but not North Koreans born during the famine of the late 90s.

I have a hard time believing this would have more than a small effect, but if you know of any studies that suggest it would I’d be happy to read them and update the post accordingly. I am having a surprisingly hard time finding studies on the topic through google.

The science in terms of GWAS isn't quite there yet IMHO - we don't know all the genes that are important for most traits and we may be inadvertently selecting against some desirable traits, for example.

We don’t NEED to know all of the genes that are important for most traits. We merely need a predictor strong enough to make a significant reduction in disease risk possible. And we already have such predictors; most of the PRS used by Genomic Prediction have an AUC of 0.55-0.75, which allows for an expected relative risk reduction of 10-40% when selecting the best of 5 embryos. Furthermore, due to the correlation between polygenic risk scores, you can get MULTIPLE SIMULTANEOUS REDUCTIONS (at least in expectation).
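For readers who want to see how a risk reduction like this can fall out of a predictor, here is a rough sketch under a simplified liability-threshold model. The assumptions are mine and not taken from any company's methodology: disease occurs when a standard-normal liability crosses a threshold set by prevalence, the score explains a fraction `r2` of liability variance, half of the score variance is shared within a family, and every embryo is equally likely to result in a birth. The function name and the example inputs are hypothetical, and the sketch takes `r2` directly rather than converting from AUC.

```python
import math
import random
from statistics import NormalDist

def relative_risk_reduction(prevalence, r2, n_embryos, n_families=20000, seed=0):
    """Monte Carlo estimate of 1 - risk(lowest-score embryo) / risk(random embryo)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    # Assumption: score variance r2 splits evenly into a family-shared half
    # and a within-family (embryo-specific) half.
    component_sd = math.sqrt(r2 / 2.0)
    residual_sd = math.sqrt(1.0 - r2)

    def risk(score):
        # Probability that liability exceeds the threshold, given the score.
        return 1.0 - std_normal.cdf((threshold - score) / residual_sd)

    risk_random = 0.0
    risk_selected = 0.0
    for _ in range(n_families):
        family_mean = rng.gauss(0.0, component_sd)
        scores = [family_mean + rng.gauss(0.0, component_sd) for _ in range(n_embryos)]
        risk_random += risk(scores[0])      # an arbitrary embryo as the baseline
        risk_selected += risk(min(scores))  # the embryo with the lowest risk score
    return 1.0 - risk_selected / risk_random

# Hypothetical inputs: 5% prevalence, score explaining 10% of liability variance.
rrr_two = relative_risk_reduction(0.05, 0.10, n_embryos=2)
rrr_five = relative_risk_reduction(0.05, 0.10, n_embryos=5)
print(f"best of 2: {rrr_two:.1%}, best of 5: {rrr_five:.1%}")
```

The exact numbers depend heavily on the assumed prevalence and `r2`, so treat the output as an illustration of the mechanism and not as a replication of any published estimate.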

If you have actual concrete reasons to doubt the results of their research and that done by other independent researchers like Shai Carmi and Todd Lencz, I would be happy to hear them. As for the concern about inadvertently selecting against some desirable traits, I agree that this is at least a small concern.

But there is a clear solution; just add everything to your selection index. Select against disease risk, chronic pain risk, mental disorders, etc, and FOR intelligence, attractiveness, kindness, longevity, etc.

I don’t think selecting JUST against disease risk is likely to negatively trade-off against non-disease traits to any appreciable degree. In fact, due to the genetic correlation between low disease risk and high intelligence (and other positive traits), I would expect the opposite; there is probably some small positive effect on such traits from selecting against disease.

Comparing clinic success rates using CDC data is imperfect because of different patient characteristics, patient selection, and reporting bias.

True, a randomized control trial would be much better. However, in the absence of such a trial, the best patients can do is try to control for potential confounders, which is what my clinic comparison did (at least for inter-clinic differences in maternal age).

reporting bias

My understanding is that almost every clinic in the US reports virtually all of their cycles to the CDC. If this is wrong, please correct me.

And beyond all of that, what else are patients supposed to do? What’s the point of even reporting all this data to the CDC if people then turn around and tell patients that it doesn’t mean anything? Isn’t the entire point of collecting this data to allow patients to make an informed choice about which clinic they want to use?

The default alternative is patients looking at Google reviews or asking for recommendations in IVF support groups. My analysis may have its shortcomings, but at least it’s informed by more than the one or two data points that most patients use to pick a clinic.

IVF pregnancies carry a significantly higher complication rate (hypertensive disorders, preterm birth, placental abnormalities, etc.) compared to spontaneous pregnancies - unclear if this is due to IVF or underlying infertility diagnosis.

Yes, I’ve discussed this quite a bit in other comments. There are no randomized control trials to definitively answer this question but my impression from reading the literature is any negative effects from IVF are quite small compared to the benefits of embryo selection.

The risk-benefit calculus of PGT-P is going to be different for a couple who already needs to do IVF anyway to have a baby (low additional risk/cost) compared to a couple doing IVF just so that they can do PGT-P (higher additional risk/cost).

Agreed. PGT-P for couples already doing IVF is a no-brainer if they have more embryos than they want children. It’s much more of an open question for those that don’t already need IVF, but my view is that if a couple can afford it and doesn’t have a better use for the money, they should do it. It’s what I plan to do when I have kids.

IVF is notoriously inefficient at present. Depending on female partner age, each cycle may yield only very few embryos making the benefit and utility of PGT-P limited. It may not be practical, safe, or financially feasible to do multiple cycles of IVF to increase the cohort of transferable embryos.

Again, this is addressed in the post. I agree that it doesn’t make sense to do IVF for the benefits of PGT-P in many cases, particularly if the woman is over 35. And the cost is obviously the biggest barrier and the main reason most couples will not do this.

IVF is expensive and often not covered by insurance which creates access disparities. PGT-P would exacerbate these disparities in access.

Yes, you are correct. I agree that this is a very unfortunate issue. In my view, the solution here is to bring down the cost of IVF, and possibly lobby for government to cover the cost if PGT-P becomes cost-effective enough.

I’ll probably write more about this in the future, but the disparity in IVF prices between US clinics and those abroad suggests that there is a lot of room for improvement without any fundamental technological breakthroughs.

Beyond those simple efficiencies, if someone gets in-vitro gametogenesis working, it could eliminate the cost of medications, surgery, and pre-retrieval monitoring, which together drive the bulk of IVF costs.

Slippery-slope eugenics and discrimination are real ethical concerns that would need to be mitigated.

This is kind of a vague statement so correct me if I misunderstood what you’re trying to say.

In my view, the term “eugenics” should not be used to describe embryo screening. In most people’s minds “eugenics” conjures images of government-sponsored sterilization efforts, genocide, and racist pseudoscience. I understand the technical definition is just “good for genes”, but this is not what comes to mind for most people when they hear this word.

Even worse, most of the horrible things done in the name of “eugenics” in the past were in fact not eugenic at all! The entire Nazi theory of genes was based on a fundamental misunderstanding of how genes work. They believed that non-Aryan peoples were “contaminating” the “pure Aryan bloodline”, and that only by purging those who were impure could they make a perfect master race. Which is of course not just a morally repugnant theory, but also wrong.

If you want to have a productive conversation, I would suggest using the term “epilogenics” to describe non-coercive means of improving genes that are in line with what we expect those affected would want. There are of course still some concerns with epilogenics (increasing inequality for example), but they are decidedly NOT the same concerns that people have about eugenics.

Beyond that, I don't think the discrimination concerns make that much sense. We already discriminate on the basis of ability in many circumstances because not doing so would lead to terrible outcomes; for example, we don't let people become doctors unless they can understand medicine and pass the MCAT. To the extent polygenic embryo selection affects such aptitudes, I think such "discrimination" is appropriate (though I think this type of discrimination should be referred to using terms like "aptitude testing" to clarify meaning).

Now perhaps people will think that children born via embryo screening have some magical abilities that they do not possess, and therefore give them far more benefits than they would have fairly earned. The broader term for this is the "halo effect". I do think this is a concern, but it doesn't strike me as any more persuasive than saying we shouldn't vaccinate people against disease because then they will take inappropriate risks.

In-vitro gametogenesis (IVG) would be a game-changer. The utility of PGT-P would be greatly enhanced if suddenly you had thousands of eggs and hundreds of embryos to select from.

IVG would definitely make embryo selection better, but in my view its main potential upside is a larger gain for older couples and (potentially) lower costs.

If you go from 10 expected births to 1000, the expected gain only increases by 70%. That’s great, but we’re still limited to selecting the maximum value from a normal distribution. So the increase is smaller than many people believe.
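One way to arrive at a figure in this range is to assume the expected gain scales with the square root of the log of the number of embryos. That scaling is an asymptotic approximation, and using it here is my assumption about where the number comes from, not something stated above. The function name is made up for illustration.

```python
import math

def approx_gain_ratio(n_small, n_large):
    """Ratio of expected gains under the assumed sqrt(ln n) scaling."""
    return math.sqrt(math.log(n_large) / math.log(n_small))

ratio = approx_gain_ratio(10, 1000)  # sqrt(ln 1000 / ln 10) = sqrt(3)
print(f"{(ratio - 1) * 100:.0f}% increase")  # about 73%
```

Because this is only an approximation, exact expected maxima of normal samples can give noticeably different numbers at these sample sizes. The qualitative point still holds: a 100-fold increase in embryos buys far less than a 100-fold increase in gain.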

Thanks for the response.

I thought the following article does a good job of outlining some of the limitations of PRS for embryo selection:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9527452/

Like any new (medical) technology, I think that it's important to consider the ethical implications. This doesn't mean that we shouldn't do it or allow it, but just that we should be thoughtful about it.

As for clinic success rates, I didn't mean to imply that "they don't mean anything". It's just that prospective patients should be aware that they can't always be taken at face value. Clinic populations may differ significantly and the data can be manipulated to some extent. The good news though is that as long as a clinic does a significant volume of IVF cycles per year and reports decent success rates, it's probably fine.

I agree that increasing insurance coverage for infertility services would help improve access and reduce disparities.

RE IVG, I see your point. I guess germline/embryo gene-editing (if it were proven safe, efficient, and efficacious) would have greater utility than PGT-P (preimplantation genetic testing for polygenic risk).

The paper you linked outlining the limitations of polygenic embryo screening mostly rests its conclusions on the supposed impossibility of showing that embryo screening actually works.

Again, I refer back to tests of polygenic risk scores in siblings. If predictors work in that population, they should be considered clinically validated. This kind of validation is standard practice in other areas of data science. I'd appreciate it if you or someone else questioning PGT-P could outline exactly why they believe sibling validation of polygenic scores is insufficient evidence to justify its clinical use. It is literally a randomized control trial for genes.

My response to other criticisms in the paper:

"Furthermore, statistical manipulation of genetic data may limit the detection of rare pathogenic gene variants"

De novo mutations are exceedingly rare. The average person has about 70. The expected effect from missing these mutations is so low that it's barely worth considering, especially compared to the expected benefits of simply improving predictors and adding more traits to the selection index used in PGT-P.

not only is it difficult to assess the clinical validity of PRS-ES in terms of the outcomes in question, it is also possible that clinical validity would be limited by the different effects of future environment on gene expression, compared to the past.

Yes, I agree that this is a fair critique of embryo selection, particularly for the diseases of old age. But the obvious solution here is just to apply some time discount factor; weight traits like depressive tendency, obesity, and intelligence more heavily than prostate cancer and heart disease, since the former will have an impact much sooner.

Mathematical modelling of PRS-ES has been attempted and indicated extremely limited utility in terms of non-pathological trait selection (Karavani et al., 2019), such as height and intelligence quotient (IQ).

The Karavani study used predictors that were already outdated by two years when the study was published. They are even more outdated today. Today you could expect +6 IQ points and +3.7 cm using state of the art predictors and the same assumptions made in the Karavani study.

Furthermore the overall expected gain increases as you add traits to your index.

Most of the conditions which can be assessed using PRS have a significant gender association.

Yes, this is why Genomic Prediction adjusts for sex in their index. I assume Orchid does the same thing though I know less about their selection methodology.

As elegantly described by Turley et al. (2021), the purported benefit of PRS-ES is commonly calculated and presented as a difference not between two average embryos, but rather a difference between the highest and the lowest possible risk embryos, thus maximizing the theoretical benefit of the test.

I've looked at the Genomic Prediction report and this is NOT how the results are presented. The expected reductions are calculated using actual siblings, and the baseline is an average person, not the difference between the highest and lowest risk embryos.

I have asked and risk has NEVER been reported in the way described. It's amazing how even in otherwise reputable journals these easily falsifiable rumors are allowed to spread.

It is also important to note that all embryos produced by a couple are genetically related and share on average 50% of SNPs. One must conclude that owing to inherent limitations of the PRS-ES models and limited variation in the genetic makeup of embryos produced by a couple, the clinical utility of PRS-ES is almost certainly diminutively small (Karavani et al., 2019).

Wrong. The standard deviation among siblings is 1/sqrt(2) times that of the general population, which is plenty for selection to result in substantial disease risk reduction or trait improvements. See Lello et al. for more details.
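The 1/sqrt(2) figure can be checked with a small simulation. This is a toy sketch under assumptions I am adding for illustration: a purely additive score built from independent loci with equal effects, allele frequency 0.5, and random mating. Real polygenic scores are messier, but the halving of variance within families follows the same logic.

```python
import math
import random
from statistics import pvariance

def sibling_sd_ratio(n_families=2000, n_loci=100, seed=0):
    """Within-family SD of an additive score divided by its population SD."""
    rng = random.Random(seed)
    child_scores, sib_diffs = [], []
    for _ in range(n_families):
        score_a = score_b = 0
        for _ in range(n_loci):
            # Each parent carries two alleles (0 or 1) at this locus.
            mother = (rng.getrandbits(1), rng.getrandbits(1))
            father = (rng.getrandbits(1), rng.getrandbits(1))
            # Each child inherits one randomly chosen allele from each parent.
            score_a += mother[rng.getrandbits(1)] + father[rng.getrandbits(1)]
            score_b += mother[rng.getrandbits(1)] + father[rng.getrandbits(1)]
        child_scores.append(score_a)  # one child per family for the population variance
        sib_diffs.append(score_a - score_b)
    population_var = pvariance(child_scores)
    # Var(sib A - sib B) is twice the within-family variance, and the mean difference is 0.
    within_family_var = sum(d * d for d in sib_diffs) / len(sib_diffs) / 2.0
    return math.sqrt(within_family_var / population_var)

sd_ratio = sibling_sd_ratio()
print(f"sibling SD / population SD: {sd_ratio:.3f}  (1/sqrt(2) is about 0.707)")
```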

I could go on about this paper, but over and over again I see the authors making unsubstantiated claims contradicted by the evidence. There ARE a few legitimate issues with embryo selection, but the paper focuses most of its energy on non-issues.

I pretty much agree with the rest of your comment, though I think the situation for patients is better than you think. I've spoken with the genetic counselors at both Orchid and Genomic Prediction and found them to be very straightforward about the expected benefits and risks. The real issue with Genomic Prediction is they seem to have taken down their page showing the expected disease reductions from polygenic embryo screening. I don't know why.

if a gene can predict differences between siblings, you can be quite confident that it is in fact CAUSING the difference.

This isn't quite correct, but the conclusion is still the same. What it actually means is that the variant is causal or is tightly linked to (i.e., close to and co-inherited with) the causal variant.

For selection purposes, this information is sufficient. However if you wanted to perform editing to affect the phenotype, you would need to actually confirm causality.

The other information in this post is correct.

Fair point. What I was really trying to say is that in the context of embryo selection, you can be pretty confident that selecting embryos based on that predictor will actually result in more or less of the trait depending on what you're selecting for.

But you're right. I've edited the post to clarify that.

If you go from 10 expected births to 1000, the expected gain only increases by 70%. That’s great, but we’re still limited to selecting the maximum value from a normal distribution. So the increase is smaller than many people believe.

This is because the expected gain of embryo selection using a predictor is roughly directly proportional to the correlation between the predictor and the trait of interest, but only to the square root of the log of the number of embryos! Having more embryos to choose from is great, but the best embryo of 1000 or even 10 isn't likely to be chosen for implantation if your predictor isn't very good.

https://doi.org/10.1016/j.cell.2019.10.033
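The scaling described above can be sketched numerically. The model below is a simplification I am assuming for illustration: the gain is the predictor-trait correlation `r`, times the within-family standard deviation (trait SD divided by sqrt(2)), times the expected maximum of `n` standard normal draws. The function names are hypothetical, and the sketch ignores embryo attrition and live-birth rates.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_max(n, lo=-10.0, hi=10.0, steps=20000):
    """E[max of n standard normals], by trapezoidal integration of
    x * n * pdf(x) * cdf(x)**(n - 1)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x * n * normal_pdf(x) * normal_cdf(x) ** (n - 1)
    return total * h

def expected_gain(n_embryos, r, trait_sd=1.0):
    """Expected gain in trait units from picking the top-scoring embryo of n."""
    sibling_score_sd = r * trait_sd / math.sqrt(2.0)
    return sibling_score_sd * expected_max(n_embryos)

for n in (2, 5, 10, 100, 1000):
    print(n, round(expected_gain(n, r=0.3), 3))
```

Doubling `r` doubles the gain at every `n`, whereas each tenfold increase in `n` adds a progressively smaller increment.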

Having more embryos to choose from is great, but the best embryo of 1000 or even 10 isn't likely to be chosen for implantation if your predictor isn't very good.

Choosing 'the best' is irrelevant and a distraction in most contexts. It is not the case that you will be 'likely' to choose 'the best' if you have a 'good' predictor - because for any 'good' predictor, no matter how good it is, the probability of 'choosing the best' still becomes arbitrarily close to zero as n increases (specifically, it goes to 1/n). Nor does it particularly matter if you do select the #1, by chance, because you still have high odds of not yielding a downstream success like a live birth.

(Of course, the expected gain - which is what matters - just keeps going up and up with n...)
