A profile in courage: On DNA computation and escaping a local maximum

by Metacelsus
9th Sep 2025
Linkpost from denovo.substack.com
5 min read
Recently, a paper in Nature on building neural networks with DNA-based switches caught my attention.[1] The authors, Kevin Cherry and Lulu Qian, developed an interesting system that uses DNA base pairing to implement a trainable neural network. However, what I found even more interesting was a story hidden in the Methods section and Supplementary Information.

How can DNA learn?

As I discussed on this blog back in 2023, DNA base pairing can be used to perform computations. Back then, I was writing about relatively simple computational systems, for example ones that could add two 6-bit numbers. These were built from logic gates, such as an AND gate that releases a piece of single-stranded DNA only when two other “input” strands are present.
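
To give a rough sense of what such a gate computes, here is a minimal sketch (mine, not from either paper) that treats an idealized strand-displacement AND gate purely at the level of concentrations: the output strand can only be released in proportion to the scarcer of the two inputs.

```python
# Idealized AND gate from strand displacement (my simplification, not the
# paper's design): the output strand is released only to the extent that both
# input strands are present, so its concentration is capped by the scarcer one.

def and_gate(conc_a: float, conc_b: float, gate_conc: float = 100.0) -> float:
    """Output strand released (nM) by an idealized two-input AND gate."""
    return min(conc_a, conc_b, gate_conc)

def to_bit(conc: float, threshold: float = 50.0) -> int:
    """Read a concentration out as a logical 0 or 1."""
    return int(conc >= threshold)

# Both inputs present -> output high; one input missing -> output low.
print(to_bit(and_gate(100.0, 100.0)))  # 1
print(to_bit(and_gate(100.0, 0.0)))    # 0
```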

The current paper goes far beyond this. The authors came up with a clever system for making DNA logic gates that don’t interfere with each other. They scaled this system up to 100 input bits (implemented in about 1200 unique pieces of DNA),[2] then showed that it could classify handwritten numerals from the popular MNIST dataset when they were presented as a 10x10 bit “image”. Most impressively, this classification was not hard-coded; instead, their chemical system actually learned it from exposure to example inputs. And once “learned”, the memory of patterns was stable for at least several days (or months if the DNA was frozen).
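
As a loose computational analogy (my own sketch, not the authors’ chemistry), “learning by exposure” here is something like accumulating per-class weights from training patterns and then classifying a new 100-bit image with a weighted sum followed by winner-take-all. The class count, noise level, and prototypes below are invented for illustration.

```python
import numpy as np

# Toy model (my sketch, not the paper's mechanism) of "learning by exposure":
# each training image adds to a per-class weight vector, as if example strands
# accumulated into memory concentrations; classification is a weighted sum
# of the 100 input bits followed by winner-take-all over the class outputs.

rng = np.random.default_rng(0)
n_bits, n_classes = 100, 2          # 10x10 binary "images", two digit classes

def train(images, labels):
    weights = np.zeros((n_classes, n_bits))
    for x, y in zip(images, labels):
        weights[y] += x                 # exposure accumulates "memory"
    return weights / weights.sum(axis=1, keepdims=True)  # normalize per class

def classify(weights, image):
    activations = weights @ image       # weighted sums of the input bits
    return int(np.argmax(activations))  # winner-take-all picks the class

# Two noisy prototype patterns stand in for handwritten digits.
protos = rng.integers(0, 2, size=(n_classes, n_bits))
train_x = np.array([np.abs(p - (rng.random(n_bits) < 0.1))
                    for p in protos for _ in range(20)])
train_y = np.repeat(np.arange(n_classes), 20)

w = train(train_x, train_y)
test = np.abs(protos[1] - (rng.random(n_bits) < 0.1))
print(classify(w, test))  # should print 1: a noisy copy of the second prototype
```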

The basic structure of the DNA logic gates is a “toehold switch”, in which one strand of DNA displaces another through sequence hybridization.

Toehold switch diagram (CC-BY). The green strand displaces the blue strand from the red strand due to higher binding affinity.
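
A crude way to see why the green strand wins is to count complementary bases: the invader can pair with the exposed toehold plus the duplex region, while the incumbent only pairs with the duplex region. The toy below (my simplification, with made-up sequences; real strand displacement is a reversible, kinetic branch-migration process) captures just that comparison.

```python
# Toy model of toehold-mediated strand displacement (my simplification). The
# invading strand "wins" because it can pair with more of the target: the
# exposed toehold plus the region the incumbent already occupies.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def max_pairing(strand: str, target: str) -> int:
    """Longest contiguous run of Watson-Crick pairs over all alignments."""
    best = 0
    for offset in range(len(target)):
        run = 0
        for s, t in zip(strand, target[offset:]):
            if COMPLEMENT[s] == t:
                run += 1
                best = max(best, run)
            else:
                run = 0
    return best

def displaces(invader: str, incumbent: str, target: str) -> bool:
    """Invader displaces incumbent if it pairs with strictly more of the target."""
    return max_pairing(invader, target) > max_pairing(incumbent, target)

# Hypothetical sequences: a 6-base toehold (TTTTTT) left exposed by the incumbent.
target    = "TTTTTTACGTACGTACGT"
incumbent = "TGCATGCATGCA"        # pairs with the 12-base duplex region only
invader   = "AAAAAATGCATGCATGCA"  # pairs with toehold + duplex region (18 bases)
print(displaces(invader, incumbent, target))  # True
```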

The current paper chained hundreds of these switches together, building on previous work but also inventing two new kinds of DNA logic gates (called an “activatable amplification gate” and an “activatable transformation gate”). These activatable gates require a threshold concentration of an “activator” DNA strand before turning on, thus implementing the kind of nonlinear effects required for neural-network-based classification.
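
The key computational point is the threshold: below a certain activator concentration the gate produces essentially nothing, and above it the gate turns on and amplifies its input. A minimal sketch of that step-like behaviour (the threshold and gain values here are invented for illustration, not taken from the paper):

```python
# Sketch of the nonlinearity an "activatable" gate provides (idealized; the
# threshold and gain are made-up numbers, not from the paper): the gate stays
# off until the activator strand exceeds a threshold concentration, then
# amplifies its input -- the step/ReLU-like behaviour a neural network needs.

def activatable_amplification_gate(input_conc: float,
                                   activator_conc: float,
                                   threshold: float = 50.0,
                                   gain: float = 4.0) -> float:
    """Output strand concentration (nM) of an idealized thresholded amplifier."""
    if activator_conc < threshold:
        return 0.0                      # below threshold: gate never opens
    return gain * input_conc            # above threshold: amplify the input

for act in (10, 40, 60, 90):
    print(act, activatable_amplification_gate(input_conc=25.0, activator_conc=act))
```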

I am not sure how many actual applications this will have, since there’s no easy way to integrate this kind of short single-stranded DNA with living systems (which generally operate with different kinds of signaling). And for everything else, electronics are vastly superior. Notably, the classification accuracy of this system was relatively low, just 53 to 81 percent (depending on the digit). Worse, the system can’t be reset after use, which means it can only be used once. This also means it doesn’t support backpropagation, so training is very limited. Still, this system is an interesting proof of concept for information processing using chemical binding events.

How did the authors learn from DNA?


The more interesting thing about this paper is the story behind it. Building such a complicated system of DNA logic gates was not easy! As any veteran Factorio player knows, what starts as a reasonably simple and logical system can easily become a mess of spaghetti as more and more features are added. In this case, the DNA strands might end up looking a bit like literal spaghetti!

The authors started out on this project with an initial logic gate design, but kept running into problems. Every time they fixed one, another appeared. By the third year of research, they were probably starting to go a bit crazy. The situation was described candidly in the Supplementary Information:

We learned an important lesson from extensive rounds of design revisions discussed above (Supplementary Notes 4.1 to 4.8): Identifying challenges one by one and coming up with solutions for each challenge may lead to limited success. For complex molecular systems, a solution for one problem could give rise to another problem somewhere else in the system. This phenomenon could further cascade, and in the worst case scenario, forming a deadlock in a cycle.

At this point, they took a step back and identified seven interrelated problems with their design, any one of which might be solved on its own, but at the cost of making the others worse. So in the third year of research, they realized they had to throw everything out and start over.

Moving to Design 2 was a major decision – it meant that we must throw out three years of work on Design 1 and start over from scratch. This decision eventually led to a successful demonstration of learning in DNA-based neural networks. Beyond the science presented in this paper, we cannot over emphasize the philosophy that all challenges must be considered as a whole and that the most extraordinary courage is needed at the most desperate time.

With the benefit of three years of experience, they came up with a new design with extra features[3] that resolved the problems of the first design. Fortunately, this ended up working a lot better.

Comparison of the structures of Design 1 and Design 2 logic gates (from the Supplementary Information)

The authors summarized their lessons as follows:

Two important lessons that we learned for engineering complex molecular systems are as follows. First, a failure mode of the debugging strategy is to focus on individual challenges. A solution for one problem may give rise to another problem somewhere else in the system. With further cascading, in the worst-case scenario, this debugging strategy may form a deadlock in a cycle. After understanding the failure mode, we arrived at an alternative strategy where all challenges are considered as a whole and solutions are devised to address the entire body of challenges simultaneously (Supplementary Note 4.9). Second, a waste of energy may occur if there is no approach to differentiate fabrication problems from design problems. For example, we discovered severe and uneven sample evaporation in source plates for a liquid handler, resulting in wildly inaccurate concentrations that directly affect the computation of the molecular system. Instead of just relying on a better sample storage method, we established a systematic approach to regularly evaluate the sample quality and reorder new strands whenever needed (Supplementary Note 5.13).

Besides the lesson about getting stuck trying to optimize an inadequate design, the authors also mentioned the importance of distinguishing design issues from execution issues. Often, execution issues can mask design issues, so ensuring accurate execution is critical for evaluating different designs. On the other hand, a design that is perfect in theory may be extremely difficult to actually execute.

What can we learn from this story?


A common failure mode in research, especially biology, is to get stuck in a local optimum. I’ve experienced this several times myself: I have a protocol that works a little bit, and I spend a few months trying to make it better without realizing that it has fundamental limitations that make it unsuitable for what I’m trying to do. I’m just glad I haven’t spent three years this way!

The main lesson from the paper is: learn to recognize when you’re stuck, and have the courage to backtrack and start again. This can be quite difficult when you’ve invested a lot of time and effort. Your research can seem like “just one more experiment” away from finally working, but if you’ve been at that spot for months, perhaps it’s time to try a completely new approach instead of a variation on your existing one. Consider what experiment could address the fundamental problems as a whole, rather than trying to play whack-a-mole with individual bugs.

This requires both humility to admit your current approach might be wrong, and courage to abandon what seems safe and try something new. Of course, you shouldn’t give up too early either! As my friend Devon wrote, knowing exactly when to give up is one of the most important traits of a good scientist. But if you have a suspicion things aren’t going well with your project, it’s worth taking the time to consider the problems as a whole, and ask whether your current approach is really the best one.

  1. ^

    Thanks to Eryney Marrogi for sharing this with me! Notably for a Nature paper, it has just two authors.

  2. ^

    The sequences are all listed in the supporting information, but this is in PDF format, and it’s not trivial for me to count the exact number that were used in the final system. I might update this article if I get that information.

  3. ^

    Like a bulge, and additional hybridization sites. They also removed the need for a separate “drain” sequence that had been causing trouble. The details of how this works are in the supplementary information if you’re interested.