"A Whole-Cell Computational Model Predicts Phenotype from Genotype" by Jonathan Karr et al.

This paper appeared a few days ago in Cell, and describes a computational simulation of the bacterium Mycoplasma genitalium, conducted at this lab. The paper is behind a paywall, but is blogged about here. The simulation software is freely available from the project web site.

From the abstract: "Here, we present a ‘‘whole-cell’’ model of the bacterium Mycoplasma genitalium, a human urogenital parasite whose genome contains 525 genes. Our model attempts to: (1) describe the life cycle of a single cell from the level of individual molecules and their interactions; (2) account for the specific function of every annotated gene product; and (3) accurately predict a wide range of observable cellular behaviors."

According to an editorial commentary in the same issue, this is the first simulation of a complete free-living microbe.

[-]gwern190

Fulltext

It's interesting that this high-level modeling of a single cell runs at near realtime on a single core, while being written in Matlab.

You beat me to it! Have the full text in some other places anyway. (1), (2), (3).

If you can get that kind of performance out of Matlab, then you should be able to simulate every cell of C. elegans in real time with < 1,000 cores with a rewrite in C. Some researchers at the University of Glasgow already have a working 1,000 core processor (using FPGAs). I'm going to have to do some updating on this.

The metabolism, anyway. This is a high-level model of the metabolism of a simple bacterium, and I'm not sure how close one could consider it to a neuron which was part of a functioning neural network, for example.

I believe you, but where does it say that in the paper? They mention it running on a 128 core cluster, but my brief skimming missed them saying anything about run times.

C-f for '128'; the cluster was being used to simulate 128+ cells by my reading, so that's a core or less per cell.
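
The core-count estimate in this sub-thread can be written out as a rough sketch. The cluster size and cell count are the figures quoted above; the count of 959 somatic cells for the adult C. elegans hermaphrodite is an assumption brought in from outside the thread, and treating a worm cell as no more expensive than an M. genitalium cell is a deliberately naive simplification.

```python
# Figures from the thread: a 128-core cluster simulating 128 or more cells.
CLUSTER_CORES = 128
CELLS_SIMULATED = 128  # "128+", so this is a lower bound on the cell count

# Assumption (not from the thread): somatic cell count of the adult
# C. elegans hermaphrodite.
C_ELEGANS_CELLS = 959

# Upper bound on cores per simulated cell.
cores_per_cell = CLUSTER_CORES / CELLS_SIMULATED

# Naive extrapolation: treat every worm cell as costing the same as one
# M. genitalium cell, which is almost certainly too optimistic.
cores_needed = C_ELEGANS_CELLS * cores_per_cell

print(f"cores per cell (upper bound): {cores_per_cell:.2f}")
print(f"cores for {C_ELEGANS_CELLS} cells: {cores_needed:.0f}")
```

Under these assumptions the result comes in just under the 1,000-core figure, even before any speedup from a rewrite in C.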

I am amazed by the current pace of the research into formalizing and simulating biology. Here is another fresh example, a borg-like jellyfish.

So closely does the medusoid mimic the movement of the real organism that it even creates vortices in the water like the ones living jellyfish create to waft food into their mouths.

The medusoid took four years to build, and the scientists have already begun working on another more complex artificial marine creature. "The jellyfish is really simple, and we're going to do one that's a bit harder, and then a bit harder still, and so on, with our long-term goal to build a heart," said Parker.

How much harder would it be to simulate various human brain cells?

Mycoplasma genitalium has fewer than 600 genes. We have something like 30,000. So a ballpark answer might be "at least 50 times harder". I expect it would be very much more than that, as a free-living microbe has much simpler interactions with everything around it, while a neuron can have connections to thousands of other neurons. Neurons are also much bigger, with more physically complex stuff.

Thinking in terms of uploads, it might not be necessary to simulate all that in order to duplicate whatever is important about its function. If you don't know what is important about its function, then you may have to brute-force it at the highest level of detail you can manage, at least until you discover what is important.

ETA: Also, neurons are faster. The time step of their simulation was 1 second. For neurons transmitting electrical signals, you'd need somewhere below 1 millisecond resolution. So there's another factor of at least 1000.
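
The two ballpark factors in this comment can be spelled out as below. The gene counts and time steps are the rough figures quoted above; multiplying the two factors together is an added assumption (that they are independent), not something the comment claims.

```python
# Rough figures quoted in the comment above.
MYCOPLASMA_GENES = 600       # upper bound; the abstract gives 525
HUMAN_GENES = 30_000         # "something like 30,000"

MODEL_TIME_STEP_MS = 1_000   # the whole-cell model steps in 1 second
NEURON_TIME_STEP_MS = 1      # "somewhere below 1 millisecond"

# Factor from gene count alone.
gene_factor = HUMAN_GENES // MYCOPLASMA_GENES

# Factor from the finer time resolution alone.
time_factor = MODEL_TIME_STEP_MS // NEURON_TIME_STEP_MS

# Assumption: the two factors are independent and simply multiply.
combined_factor = gene_factor * time_factor

print(f"gene-count factor: {gene_factor}x")
print(f"time-step factor:  {time_factor}x")
print(f"combined (if independent): {combined_factor}x")
```

Both inputs are lower bounds in the comment's own framing, so the combined figure is best read as a floor rather than an estimate.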

[-]Cyan20

So a ballpark answer might be "at least 50 times harder".

The "at least" part seems wrong to me. Cellular differentiation works by deactivating some genes more-or-less permanently and by sequestering deactivated genes in densely packed regions of chromatin that are inaccessible to transcription complexes. (This is a one-sentence summary of an absurdly complex biological process. You have been warned.) Understanding the functional molecular biology of a highly differentiated cell type like a neuron won't require the understanding of 30K interacting genes.

Good point. Is anything known about what proportion of genes might be turned off in a differentiated cell?

[-]Cyan10

Lots, but not by me at this time.

I don't know if this is at all accurate, but I might expect genes to add complexity non-linearly; like each new gene gives four new possibilities, so 50 times as many genes would make the simulation up to 4^50 times as hard.

I don't think that works. It would have Mycoplasma genitalium's 525 genes making it 4^525 times as hard to simulate as water.

I agree that when I think about the number 4^525 I don't think it is reasonable for describing anything ever.
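
For a sense of scale, the two exponentials from this exchange can be sized with Python's arbitrary-precision integers. This is only a sketch of the magnitudes; the "four possibilities per gene" rule is the hypothetical from the comment above, not a claim from the paper.

```python
# Hypothetical scaling rule from the thread: each gene multiplies the
# difficulty by 4.
POSSIBILITIES_PER_GENE = 4

# 50 is the gene-count ratio discussed above; 525 is the gene count of
# Mycoplasma genitalium from the abstract.
ratio_cost = POSSIBILITIES_PER_GENE ** 50
mycoplasma_cost = POSSIBILITIES_PER_GENE ** 525

# Count decimal digits to compare orders of magnitude.
ratio_digits = len(str(ratio_cost))
mycoplasma_digits = len(str(mycoplasma_cost))

print(f"4^50  has {ratio_digits} decimal digits")
print(f"4^525 has {mycoplasma_digits} decimal digits")
```

The second number is too large to fit in a 64-bit float, which tops out near 1.8 × 10^308, and that gives some intuition for why the multiplicative rule cannot be right as stated.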

In order to understand the nervous system, it is necessary to know the synaptic connections between the neurons, yet to date, only the wiring diagram of the adult hermaphrodite of the nematode Caenorhabditis elegans has been determined. Here, we present the wiring diagram of the posterior nervous system of the C. elegans adult male, reconstructed from serial electron micrograph sections. This region of the male nervous system contains the sexually dimorphic circuits for mating. The synaptic connections, both chemical and gap junctional, form a neural network with four striking features: multiple, parallel, short synaptic pathways directly connecting sensory neurons to end organs; recurrent and reciprocal connectivity among sensory neurons; modular substructure; and interneurons acting in feedforward loops. These features help to explain how the network robustly and rapidly selects and executes the steps of a behavioral program on the basis of the inputs from multiple sensory neurons.

"The Connectome of a Decision-Making Neural Network"; fulltext

[-]Clippy-30

This seems like a really roundabout way to research manufacturing processes. There are much simpler factory designs than a biological cell, which have a higher efficiency (as measured by useful output per useful unit input). Those are what should be modeled, researched, and optimized, not these labyrinthine mechanisms.