cancer neoantigens

For cells to become cancerous, they must have mutations that cause uncontrolled replication and mutations that prevent that uncontrolled replication from causing apoptosis. Because cancer requires several mutations, it often begins with damage to mutation-preventing mechanisms. As such, cancers often have many mutations not required for their growth, and those often change the structure of some surface proteins.

The modified surface proteins of cancer cells are called "neoantigens". An approach to cancer treatment that's currently being researched is to identify some specific neoantigens of a patient's cancer, and create a personalized vaccine to cause their immune system to recognize them. Such vaccines would use either mRNA or synthetic long peptides. The steps required are as follows:

  1. The cancer must develop neoantigens that are sufficiently distinct from human surface proteins and consistent across the cancer.
  2. Cancer cells must be isolated and have their surface proteins characterized.
  3. A surface protein must be found that the immune system can recognize well without (much) cross-reactivity to normal human proteins.
  4. A vaccine that contains that neoantigen or its RNA sequence must be produced.
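
As a rough illustration of step 3, here's a minimal Python sketch of screening candidate peptides against normal proteins. The function names, the `min_mismatches` threshold, and the use of Hamming distance as a stand-in for cross-reactivity are all assumptions made for illustration; real immune recognition depends on much more than sequence mismatch counts.

```python
def hamming(a, b):
    """Number of positions at which two equal-length peptides differ."""
    if len(a) != len(b):
        raise ValueError("peptides must have equal length")
    return sum(x != y for x, y in zip(a, b))


def windows(protein, k):
    """All length-k peptides contained in a protein sequence."""
    return {protein[i:i + k] for i in range(len(protein) - k + 1)}


def distinct_candidates(candidates, normal_proteins, min_mismatches=2):
    """Keep candidates that differ from every same-length normal peptide
    by at least min_mismatches residues (a crude, assumed proxy for
    low cross-reactivity)."""
    kept = []
    for pep in candidates:
        self_peptides = set()
        for protein in normal_proteins:
            self_peptides |= windows(protein, len(pep))
        if all(hamming(pep, s) >= min_mismatches for s in self_peptides):
            kept.append(pep)
    return kept
```

A candidate identical to a normal peptide, or only one residue away from one, gets dropped; anything further away is kept for closer study.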

Most drugs are mass-produced, but with cancer vaccines that target neoantigens, all those steps must be done for every patient, which is expensive.

protein characterization

The current methods for (2) are DNA sequencing and mass spectrometry.


DNA sequencing

DNA sequencing is now good enough to sequence the full genome of cancer cells. That sequence can be compared to the DNA of the patient's normal cells, and algorithms can then find the differences that correspond to mutant proteins. However, guessing how DNA will be transcribed, how proteins will be modified, and which proteins will be displayed on the surface is difficult.
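
To make the tumor-versus-normal comparison concrete, here's a minimal Python sketch that translates two coding sequences with the standard genetic code and reports single-residue differences plus the surrounding tumor peptide. The function names and the `flank` parameter are hypothetical, and the sketch assumes already-aligned, in-frame coding sequences with substitutions only (no indels, splicing changes, or frameshifts), which is a far simpler setting than real variant calling.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def missense_peptides(normal_cds, tumor_cds, flank=4):
    """Return (mutation label, tumor peptide) pairs for each residue that
    differs between the translated normal and tumor sequences."""
    normal = translate(normal_cds)
    tumor = translate(tumor_cds)
    results = []
    for pos, (n, t) in enumerate(zip(normal, tumor)):
        if n != t:
            start = max(0, pos - flank)
            label = f"{n}{pos + 1}{t}"  # e.g. G2D: glycine 2 became aspartate
            results.append((label, tumor[start:pos + flank + 1]))
    return results
```

The output is only a list of candidate mutant peptides; it says nothing about whether they're expressed, modified, or displayed on the surface, which is the hard part noted above.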

Practical nanopore sequencing has been a long time coming, but it's recently become a good option for sequencing cancer cell DNA.

MHC mass spec

Protein fragments are often bound to an MHC molecule for presentation on the surface; those complexes can be isolated, and the bound peptides can then be identified by mass spectrometry. You then know that those peptides can be displayed on the cell surface. However...

  • It's currently hard to guess which of those MHC-bound proteins could have a good immune response.
  • This requires more cells than sequencing.
  • This doesn't find all the mutant surface proteins.
  • Peptide sequencing is necessary, and it's not easy.

comments on AlphaFold

I've seen a lot of comments on AlphaFold by people who don't really understand how it works or what it can do, so I thought I'd explain that.

AlphaFold (and similar systems) input the amino acid sequence of a protein to a neural network, using a typical Transformer design. That NN predicts relative positions of atoms, which is possible because:

  • Some sequences form common types of local structures, and relative positions within those structures can be predicted.
  • Some distant pairs of sequences tend to bind to each other.
  • AlphaFold training included evolutionary history, and multiple mutations that happen at the same time tend to be near each other.
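
The third point can be illustrated with a classic covariation measure. This is a minimal Python sketch that computes mutual information between two columns of a multiple sequence alignment; it is not how AlphaFold itself works internally (the network consumes the alignment directly), and the function names and toy alignment format are assumptions for illustration.

```python
from collections import Counter
from math import log2


def column(msa, i):
    """Residues at alignment position i, one per sequence."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.
    High values mean the two positions tend to mutate together."""
    n = len(msa)
    col_i, col_j = column(msa, i), column(msa, j)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi
```

In a toy alignment like `["AKLE", "AKLE", "DRLE", "DRLE"]`, the first two columns change together and score high, while a fully conserved column carries no covariation signal at all. That also shows why this kind of information is missing when a protein has no evolutionary relatives to align.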

The positions predicted by the neural network are not used directly; they're an initial guess for a protein force field model. What neural networks provide is a better initialization than previous approaches.

The above points indicate some limitations that AlphaFold-type approaches have, such as:

  • They're not as good for prions or otherwise "unnatural" proteins.
  • They don't predict protein functions from structure, or vice-versa.
  • They're not as good when evolutionary history isn't available.

While this approach is more limited than some people seem to think, it's still effective enough that, if a surface protein can be sequenced, its structure can probably be determined well enough to design affimers for it.


cryo-EM

Cryo-EM is relatively new. It's one of the most powerful techniques for protein characterization, and it has produced many interesting results, such as the structure of bacterial flagellar motors, so I feel practically obligated to mention it at every opportunity.

Cryo-EM can produce structures from small crystals. If a protein can be isolated, "single particle cryo-EM" can even produce structures without crystallizing them at all. Still, it's currently easier to determine protein sequences with mass spectrometry, and I think nanopore approaches have more chance of reducing costs for this application.

nanopore protein analysis

The same basic approach used for current nanopore DNA sequencing can be used to detect protein post-translational modifications.

Because such nanopore sequencing detects changes in ion flow through the nanopore, it's obviously better at detecting something like phosphorylation or glycosylation than the (smaller) differences between amino acids. But it should be fairly good at detecting charged groups - which does provide some data about protein sequences that could be combined with mass spec data.
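
Here's a minimal Python sketch of how charge data could narrow down mass spec results. It assumes an idealized per-residue charge readout, which is a hypothetical simplification and not something current nanopore instruments are known to provide; the names are illustrative, and histidine is treated as neutral for simplicity. The motivating case is that lysine and glutamine have nearly the same mass, so they're hard to tell apart by mass spectrometry, but only lysine is positively charged.

```python
# Simplified side-chain charges near neutral pH (assumption: His is neutral).
CHARGE = {"K": "+", "R": "+", "D": "-", "E": "-"}


def charge_pattern(peptide):
    """Reduce a peptide to a string of '+', '-' and '0' per residue."""
    return "".join(CHARGE.get(aa, "0") for aa in peptide)


def consistent_candidates(candidates, observed_pattern):
    """Keep mass-spec candidate sequences whose charge pattern matches
    the (idealized) pattern observed through the nanopore."""
    return [p for p in candidates if charge_pattern(p) == observed_pattern]
```

For example, if mass spec can't distinguish `AKGE` from `AQGE`, an observed pattern of `0+0-` would leave only `AKGE`.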

monoclonal antibodies

Rather than inducing production of antibodies that target cancer neoantigens, it's also possible to produce those antibodies directly and inject them.

There are already monoclonal antibody treatments for cancer, such as Nivolumab, but they're not individualized. Surface proteins that are common in cancers across different people are normal human receptors that are overexpressed by the cancer cells, not neoantigens that don't occur in normal cells. So, there are serious side effects to drugs targeting them.

Monoclonal antibody treatments are expensive, but individualized treatments targeting cancer-specific proteins would be much more expensive.

Wikipedia also has a decent page on cancer immunotherapy in general.


affimers

Instead of creating antibodies that bind to a target and directly signal immune cells with the crystallizable region, it's possible to create smaller proteins that bind to a target and expose a native antigen that natural antibodies bind to. In other words, the cancer neoantigens (that don't trigger an immune response) would get covered by something that does trigger an immune response, producing a similar effect with a smaller protein. On the other hand, such affimers could also get bound to antibodies and captured by immune cells before they bind to cancer cells.


aptamers

Instead of using synthetic proteins that bind to neoantigens, it's sometimes possible to use synthetic DNA sequences; those are sometimes called aptamers. DNA has less versatility than proteins in terms of the structures it can form and bind to, but DNA sequences can be easily amplified by PCR, which could make them cheaper to produce than proteins.

other cancer treatments

To evaluate the merits of research on a cancer treatment approach, we have to briefly consider how it compares to other promising approaches.

replication disruptors

Cancer treatments involve targeting some difference between cancer cells and normal cells. The most obvious such difference is that cancer cells replicate more, and the most common cancer treatments (besides surgery) target that. Mitosis is a complex process that can be disrupted in many ways; a notable example of a drug that does so is cisplatin, which crosslinks DNA.

Obviously, cell replication is normally important, and even if replicating cells could be targeted without any effect on other cells, disrupting it results in serious side effects. This puts some limits on how good this approach could theoretically be, but currently the bigger problem is that cancers tend to find a way to replicate anyway.

mitochondria-mediated apoptosis

Normal cells have several safeguards that cause apoptosis before they'd become cancerous. One of the most important is mitochondria-mediated apoptosis, so cancer cells often disrupt normal mitochondrial function. Targeting this difference is the basis of most of my own thoughts on potential cancer treatments.

There are 2 basic approaches to this: reactivating mitochondria-mediated apoptosis, and disrupting mitochondria-independent metabolism to prevent cancer ATP generation. I consider both approaches worth pursuing, but details are beyond the scope of this post.

vaccine production

DNA can be amplified by PCR, but RNA amplification is somewhat more complex; mRNA for vaccines is currently produced with "in vitro transcription".

Directly synthesizing polypeptides (using native chemical ligation) isn't harder than directly synthesizing mRNA. If direct synthesis is used, synthetic long peptides seem better to me than mRNA, because the immune response works somewhat better, but details are beyond the scope of this post.

The immune system often recognizes non-human proteins; the mRNA vaccines for COVID don't need adjuvants because the COVID spike protein is recognized as foreign. However, if cancer neoantigens provoke an immune response, the immune system kills those cells, so remaining cancer neoantigens wouldn't be recognized on their own. This also means that cancers are strongly selected to have fewer and more human-like neoantigens, which makes it harder to produce vaccines for them, and makes cross-reactivity with normal surface proteins more likely. Also, cancers can mutate ("tumor antigen loss") such that they stop producing some surface proteins.

Synthesized cancer neoantigens can be directly attached to a native antigen, and when the immune system recognizes the native antigen it will produce antibodies for the neoantigen part. With mRNA vaccines, either a native antigen would be added to the sequence (in a way that doesn't interfere with the neoantigen structure) or adjuvants would be added. Typically, adjuvants kill some cells, causing release of human double-stranded DNA fragments that indicate to immune cells that something killed some cells nearby, and triggering production of antibodies to everything foreign in the area. But that obviously only works if the neoantigens are recognizable enough that an indication that "something bad is in this area" is sufficient.


Individualized cancer vaccines are not yet practical, but I consider them a promising possibility for significantly better cancer treatments. I think research on that should prioritize:

  • combining mass spectrometry and nanopore data for protein characterization
  • continued development of nanopore sequencing
  • continued surveying of cancer genomes, such as TCGA
  • developing lower-cost methods for isolation of cell surface proteins
  • developing equipment and methods for lower-cost production of long polypeptides

Useful post. I can expand on one point and make a minor correction. Single Particle Cryo-EM is indeed a new(ish) powerful method of protein structure elucidation starting to make an impact in drug design. It is especially useful when a protein cannot easily be crystallised to allow more straightforward X-Ray structure determination. This is usually the case with transmembrane proteins for example. However it is actually best if the protein molecules are completely unaligned in any preferred direction as the simplest application of the refinement software assumes a perfectly random 3D orientation of the many thousands of protein copies imaged on the grid. In practice this is not so easy to achieve and corrections for unwanted preferred orientation need to be made.

That's true; I misremembered that part when I wrote it. I'll just remove that.

When you say it's not yet practical, are we missing some key steps, or could it be done at high enough cost with current technology but can't scale?

I imagine a startup which cured rich people's cancers on a case by case basis would have a lot of customers, which would help drive prices down as the technology improved.

There are a few issues with that.

  1. The cost would probably be a significant fraction of the development of a new monoclonal antibody treatment, making this currently probably limited to billionaires.

  2. Personalized drug development on that scale isn't something that can simply be purchased, and if billionaires tried, governments would probably block them, because voters would consider that unfair, and because cancer researchers can't simply be increased in proportion to budgets. There are only so many people with the relevant skills and inclinations.

  3. Better methods for development wouldn't just reduce costs, but would also reduce time taken. There would need to be a billionaire, diagnosed with cancer without a good treatment, who would clearly die from it, but not for a couple years.

  4. Better methods and understanding wouldn't just reduce costs and time, they'd also reduce risks. Targeting a receptor that's actually important normally but wasn't fully understood could kill someone before the cancer.

Makes sense, thanks!

I imagine a startup of this ilk could be based in Prospera, which wouldn't be a problem for the wealthy few to travel there for personalised treatment.

I also imagine that with a lighter regulatory regime, no need to scale up production, and no need for lengthy trials, developing a monoclonal antibody would be much quicker and cheaper. Consider how quickly COVID vaccines were found compared to when they were ready for use.

The other hurdles sound significant though.


It doesn't have to be made on-demand for a specific person. It can be off-the-shelf precision medicine. Say there are 20 different antigens targeted. A company can produce those vaccines for 20 types, then the doctor can order the correct one(s) based on the tumor's genetic profile. This is how precision medicine is expected to be done. In some ways, it's already done in various areas, like blood types.

Promoted to curated: Cancer vaccines are cool. I didn't quite realize how cool they were before this post, and this post is a quite accessible intro to them.

The cancer must develop neoantigens that are sufficiently distinct from human surface proteins and consistent across the cancer.

This is not necessarily true. There are some proteins that get produced by embryos and not by adult humans. Sometimes cancer mutates in a way that those proteins get produced by cancer cells.

While the vaccines that target a single one of those embryonic proteins did not do enough in clinical trials, I don't see a reason why we should completely ignore those proteins.

Cancer cells must be isolated and have their surface proteins characterized.

Given that cancer cells undergo necrosis much more than regular cells, you don't need to isolate cancer cells to get the DNA of the cancer. ctDNA can in theory be used to sequence the cancer. Using ctDNA might even be better because it gives you a better idea of whether a mutation is present in most of the cancer cells or only in the section from which you removed cells.

Individualized cancer vaccines are not yet practical

Moderna has just put an individualized cancer vaccine into a phase III trial after positive results from a phase 2b trial:

Hirawat was referring to Moderna and Merck’s December announcement of positive results from a 157-patient phase 2b trial dubbed KEYNOTE-942. The pair said a combination of their personalized mRNA cancer vaccine, coded mRNA-4157 or V940, and the PD-1 inhibitor Keytruda slashed the risk of tumor recurrence or death by 44% compared with Keytruda alone when used as an adjuvant therapy in stage 3/4 melanoma following complete surgical resection.

Do you think that trial is a bad idea?

ctDNA can in theory be used to sequence the cancer.

I don't think that's a good idea. It's not the same and would be hard to separate from other cfDNA.

Moderna has just put an individualized cancer vaccine into a phase III trial [...] Do you think that trial is a bad idea?

What Moderna is doing is sequencing cancer cells and healthy cells, and using some algorithm to guess what mRNA vaccine would work. I think they're not quite there yet: the hazard ratio with a checkpoint inhibitor isn't much better than the checkpoint inhibitor alone, and I always discount the reported performance in small trials a bit. (And looking at the stock price, Wall Street seems to agree.) Note also that it's specifically for certain types of melanoma, and comes after several failed cancer mRNA vaccine trials. That limitation to melanoma types indicates to me that their algorithm and its personalization are probably rather limited.

It's not that I'm opposed to Moderna doing their trial per se, but I am a bit concerned that their patents could ultimately result in a net reduction in progress.


Following up on the current market state discussion related to Moderna, any thoughts on the Amgen treatment that the FDA just approved today? It seems to be a much more targeted treatment, but the general approach of targeting specific mutations seems to suggest a "family of drugs" that targets a number of different mutations. If that can cover the mutations that cause 90% of cancers, it seems like it would be a huge win. (But I'm not sure if things work that way!)

That new Amgen drug targets a human protein that's mostly only used during embryonic development. I think it's expressed by most cancer cells in maybe around 0.2% of cancer cases. In many of those cases, some of the cancer cells will stop producing it.

Most potential targets have worse side effects and/or are less common.

Interestingly, many cancer "neoantigens" (for example, MAGEB1) are also expressed in meiotic cells in the testis. This is because they're usually epigenetically suppressed in healthy tissues, but cancer cells have messed up epigenomes. See:

Also, I would disagree that synthesizing long polypeptides is easier than synthesizing long mRNAs. With polypeptides you have 20 amino acids to work with, and some require special treatment (protecting groups on sidechains). With mRNAs the chemistry is much simpler.

Interestingly, many cancer "neoantigens" (for example, MAGEB1) are also expressed in meiotic cells in the testis. This is because they're usually epigenetically suppressed in healthy tissues, but cancer cells have messed up epigenomes.

That's very true.

I would disagree that synthesizing long polypeptides is easier than synthesizing long mRNAs

That's not what I said: I said it's not harder and seems better. I'm aware of the chemistry involved and stand by that. Contrary to your implication, oligonucleotide synthesis also requires protecting groups.

Question: would it be possible to use retroviruses to selectively target cancer cells and insert a gene that expresses a target protein, and then do monoclonal antibody treatment on that? Would the cancer's accelerated metabolism make this any good?

Not an expert here, but it seems to me that if you can make a virus that preferentially infects cancer cells you might as well make the virus kill the infected cancer cells directly.

Fair, depends how hard it is to do that though, I assumed inserting a target gene would be easier than triggering death in a cell that has probably hopelessly broken its apoptosis mechanism.

What's the difference between a virus that preferentially infects cancer cells and a virus that kills infected cancer cells directly?

  • Does "preferentially" mean that the virus also attacks non-cancer cells? Or does it mean that it just doesn't hit cancer cells as hard?
  • "A virus that kills infected cancer cells": does this mean the virus kills cells infected with the virus mentioned in the first part of the question or is this just badly phrased?



Shifting the focus here a little. One might think of cancer as a disease that needs a cure/vaccination, or think of it as a symptom of aging and cellular process breakdown/wearout. If you agree with the latter, do you see the potential of these cancer vaccines as a first step in ending aging damage as a whole, and thus a step on the path to extending healthspan (if not really lifespan)?