Summary

We have [relatively] recently scanned the whole fruit fly brain, simulated it, and confirmed that its activity is pretty strongly constrained by morphology alone. Other groups have been working on optical techniques and genetic methods to make the scanning process faster and simulations more accurate.
Fruit Flies When You’re Having Fun
The Seung Lab famously mapped the fruit fly connectome using serial section electron microscopy. What is underappreciated is that another group used this information to create a whole brain emulation of the fruit fly. Now, the emulation used leaky integrate-and-fire neurons and did not model the body of the fly, but it is still a huge technical achievement. The first author has gone off to work at Eon Systems, which is very explicitly aimed at human whole brain emulation.
They did some cool things in the simulation. One is that they shuffled the synaptic weights to see how much that changed the neural activity. Turns out, quite a bit. This is a good thing: if randomly placed weights behaved the same, the anatomy wouldn't be telling us anything, so it means they're probably right about how synaptic weight manifests in morphology.
Although modelling using the correct connectome results in robust activation of MN9 in 100% of simulations when sugar-sensing neurons are activated at 100 Hz, only 1 of 100 shuffled simulations did (Supplementary Table 1d). Therefore, the predictive accuracy of our computational model depends on the actual connectivity weights of the fly connectome.
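To make that shuffle control concrete, here is a minimal toy sketch of the idea. This is not their actual model (which is vastly larger, with weights derived from synapse counts); every number below is invented for illustration. Drive a template LIF network with its true weight matrix, then with the same weights randomly reassigned, and compare activity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a connectome: N neurons with sparse signed weights.
N = 500
W = rng.normal(0.0, 0.3, (N, N)) * (rng.random((N, N)) < 0.05)

def run_lif(W, steps=1000, dt=1.0, tau=20.0, v_thresh=1.0, v_reset=0.0):
    """One shared leaky integrate-and-fire template for every neuron."""
    n = W.shape[0]
    v = np.zeros(n)
    spikes = np.zeros(n, dtype=bool)
    total = 0
    for _ in range(steps):
        # Leak plus synaptic input from last step's spikes.
        v += dt / tau * (-v) + W @ spikes.astype(float)
        v[:25] += 0.06  # constant drive to the "sensory" neurons
        spikes = v >= v_thresh
        v[spikes] = v_reset
        total += int(spikes.sum())
    return total

# Shuffle control: identical weight values, random placement. If activity
# changes a lot, the specific wiring (not the weight distribution) matters.
W_shuffled = rng.permutation(W.ravel()).reshape(N, N)
print("true wiring:    ", run_lif(W))
print("shuffled wiring:", run_lif(W_shuffled))
```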
I would recommend reading the whole paper. I think I would do it a disservice by giving an intermediate level of detail in a summary. They just got mind-blowingly good results for such a simple model and it really gives me hope that the actual simulation aspect is a much more tractable problem than I once thought[1].
Connectome Tracing Now And The Near Future
The two biggest issues with connectome tracing right now are speed and accuracy. It takes a long time to image all the samples, and parallelizing the process is very costly because electron microscopes are expensive. As for accuracy, it seems like it would be unreasonable to ask for more resolution than an electron microscope offers. That is true, but because everything is grayscale, segmentation becomes hugely challenging. One of the biggest bottlenecks in the pipeline is human proofreading of the segmentation. We have a good algorithm for the first pass, but it requires substantial human effort afterwards. The whole fruit fly brain took ~33 person-years of proofreading to complete, and accuracy stays around 90% in the most optimistic case without human involvement. A naïve extrapolation from the fly → mouse brain would be on the order of ~10,000 person-years of proofreading, which is suboptimal. Additionally, many of the proofreaders were trained in neuroanatomy, which further increases the difficulty of scaling up a human workforce for this process. So yeah, I really want people to work on this problem; it seems very important to me.
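For a sense of where that extrapolation comes from, here is the back-of-the-envelope version. The neuron counts are rough figures I am supplying (~140k for the fly, tens of millions for a mouse), not numbers from the sources above.

```python
fly_person_years = 33    # proofreading for the whole fly brain
fly_neurons = 1.4e5      # ~140k neurons (rough)
mouse_neurons = 7.0e7    # ~70 million neurons (rough)

# Naive linear scaling by neuron count; scaling by synapse count or total
# wiring length would give a different (likely worse) multiplier.
print(fly_person_years * mouse_neurons / fly_neurons)  # ~16,500 person-years
```

The exact multiplier depends on what you scale by, but it lands at the same order of magnitude as the ~10,000 figure, which is the point: without much better automation, proofreading alone sinks the mouse.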
I am of the opinion that electron microscopy is not the way forward, because of these factors and others discussed later. Still, it is the only proven technology, and there may be a place for a hybrid approach, with optical microscopy providing some information via traditional stains and electron microscopy providing the highest possible resolution.
There are also issues of sample preparation and the exact kind of electron microscope you use. Samples must be sectioned very thin in the axial direction, as scanning microscopes can't see subsurface detail and transmission microscopes have limited penetration depth. If the samples are cut mechanically, they generally carry artifacts from the cutting that make segmentation across the boundary more challenging. Alternatively, samples can be destroyed with ion milling or treated such that they photodegrade; this leaves a much cleaner surface for the next imaged section, but destroying the sample makes multiple imaging passes challenging.
For much, much more detail I recommend reading this projection of what it would take to image a whole mouse brain.
Multimodal Data Analysis
There are two big, obvious limitations with the fruit fly simulation. The first is that it does not even attempt to model the rest of the fly's body. I'm comfortable with this; people have been trying to simulate C. elegans for decades now and still don't have a complete biophysical model. This is a big challenge, but not my chief interest. The second limitation is the cell model itself. They used a leaky integrate-and-fire model that was identical for each neuron. I understand why they did this, and I don't think they could have done much better with the data they had, but they also openly admit it is a limitation. Well, there is some recent progress that addresses this gap.
Neurons are inhomogeneous in many ways. One is electrical activity: two neurons will spike differently when given the same current stimulus. Another is gene expression. A lot of genes are known to govern the ion channels that determine a neuron's electrical activity. It is very natural to ask whether you can predict the electrophysiological properties of a neuron from its gene expression. Well, one recent paper sets out to answer just that. I would say that the big conclusion relevant to this is
… that the variation in gene expression levels can be used to predict electrophysiology accurately on a family level but less so on the cell-type level.
Despite this, I am still confident that this technique is viable for generating models of individual neurons. Why? Because the technique they used to measure gene expression is known to be inaccurate. Other methods of measuring gene expression (I am a proponent of MERFISH[2]) are comparable or perhaps even better. In the event that these techniques remain inaccurate or are insufficient by themselves, it seems likely that traditional antibody staining could allow for direct measurement of ion channel density[3].
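For concreteness, here is the shape of that prediction problem as I understand it, on synthetic data. This is a generic ridge-regression sketch, not their pipeline, and every number in it is made up.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

# Synthetic stand-in for Patch-seq-style data: expression counts for
# n cells x g genes, plus one electrophysiology feature per cell.
n_cells, n_genes = 300, 1000
X = np.log1p(rng.poisson(1.0, (n_cells, n_genes)).astype(float))

# Pretend a handful of ion-channel genes drive the feature (say, input
# resistance), with measurement noise on top.
w_true = np.zeros(n_genes)
w_true[:20] = rng.normal(0.0, 1.0, 20)
y = X @ w_true + rng.normal(0.0, 2.0, n_cells)

model = RidgeCV(alphas=np.logspace(-2, 3, 20))
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print("cross-validated R^2:", scores.mean().round(2))
```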
I also want to make it very clear that I have extreme admiration for the work they did. I personally tried using some of the same data to achieve the gene-expression-to-biophysical-model transformation and can attest that it is quite challenging. Their paper has a lot of stuff in it I wish I had tried, and is quite readable in my opinion. Specifically, I applaud them for trying to fit a relatively simple model. One of my biggest frustrations when reading neuroscience papers is people trying to answer questions they clearly didn't lay the groundwork for.
Now, even once that is achieved, we will still be missing some important factors, like how hormones or peptides influence the activity of a neuron. But this is a step in the right direction. Knowing the connectome with weights gets you a long way; making specific cell models gets you closer; the effects of hormones, peptides, blood flow, whatever glial cells are doing, etc. matter, but might have a collectively smaller effect than the first two factors. I am not sure how confident I am in that statement; biology is a bottomless well of complexity, and some of those higher-order effects could be much more important than I appreciate. But all this is really just dancing around my main opinion, which I endorse quite strongly: we need a model more specific than a single template leaky integrate-and-fire neuron for most of the neurons in the brain, and we can likely achieve this with current-generation imaging techniques.
E11 Bio
E11 Bio is a focused research organization that is, well, focused on researching connectome mapping. They have a cool technique combining expansion microscopy and genetic barcoding to make tracing neurons much, much easier.
I discussed the limitations of electron microscopy above. Well, expansion microscopy is a cool way to get around them. The sample is permeated by a hydrogel that swells, causing the whole thing to expand roughly homogeneously. This can be up to 10x in a single step iirc, but the cool thing is that you can do it multiple times if you really want. E11 Bio is doing 5x, which I trust is sufficient for their needs. The genetic barcoding is a way to have functionally infinite color channels, such that you can uniquely identify each neuron. I'm not natively a genetics guy so I might summarize this wrong, but my understanding is that each neuron is infected by a random subset of viruses that are injected into the brain. Each virus codes for a specific protein that can be bound by antibody stains. By sequentially staining and then washing away antibodies bound to fluorescent probes, you can image the sample once for each possible virus. Each neuron will either be infected or not for each given virus, and so it will either fluoresce or not in each given stain/image/wash cycle. This gives each neuron a unique bit string to identify it, even across long projections. All in all, very cool, and computationally much simpler than trying to segment grayscale images. It only marginally improves automatic segmentation accuracy (~5x fewer errors) and would still rely heavily on human proofreading[4]. But still, a very obvious step in the right direction, and I am glad to hear it is being worked on.
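To get intuition for why a handful of binary stains is enough, here is a toy calculation. The number of rounds, the infection probability, and the neuron count are all invented for illustration, not E11's actual parameters.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented parameters: K stain/image/wash rounds, each virus infecting a
# neuron independently with probability p.
K, p, n_neurons = 20, 0.5, 100_000

barcodes = rng.random((n_neurons, K)) < p  # one K-bit string per neuron
codes = barcodes.astype(np.int64) @ (1 << np.arange(K))  # pack bits into ints

n_unique = len(np.unique(codes))
print(f"{n_neurons - n_unique} neurons collide with another neuron's barcode")

# Birthday-style estimate of colliding pairs for p = 0.5: n^2 / (2 * 2^K).
print("expected ~", round(n_neurons**2 / (2 * 2**K)))
```

With 20 rounds there are 2^20 ≈ 10^6 possible codes, so collisions are rare but not negligible at brain scale; each extra round halves them.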
You do start to get issues with distortion if you expand too much, but then it becomes an engineering trade-off. Would you rather the computer have to correct for these distortions, or deal with the numerous physical and computational challenges EM data introduces? I'll admit I'm biased here, but the technology is really cool and opens up a huge range of microscopy techniques, with potential OOMs of improvement in imaging and post-processing speed. If you are interested in connectome tracing feasibility, I would recommend this paper comparing expansion microscopy to electron microscopy. Their most optimistic timeline for a mouse is ~5 days, but ~30 years for a human brain. 30 years is a long time to wait around; improvements in speed and cost will allow more work to be done in parallel, but it is unclear if imaging a whole human brain at sufficient resolution will be feasible any time soon.
What I Would Work On
Based on the above, there are several key problems that I think need to be addressed if we want to do whole brain emulation. This is by no means an exhaustive list; these are just things I can point to as clearly identified gaps.
- High-throughput imaging with sufficient detail, ideally less than 10nm in all directions[5]
  - Improve mechanical or destructive sectioning to gather all necessary information while minimizing artifacts at the boundaries
  - Speed is the biggest consideration; this can be achieved by bringing cost down so more microscopes can operate in parallel, or by making each one faster without increasing cost proportionally
- More subcellular detail
  - Find the density of ion channels for a particular neuron
  - Identify gap junctions between neurons
  - Identify the neurotransmitters used by each neuron more accurately[6]
- A way to extract information relevant to neuromodulation; this is impossible or extremely hard with EM data
- Improve automated segmentation; eliminating the need for any human proofreading is ideal
- More advanced modeling of cells, with verification that the subcellular details listed above recreate the electrical and chemical activity accurately (see the sketch after this list)
- This is a lot of data; you need a lot of storage and fast transfers to avoid bottlenecking the microscopes[7]
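On that modeling point, here is a minimal sketch of the kind of pipeline step I mean: hypothetical channel densities (the sort of thing quantitative antibody staining might give you[3]) converted into maximal conductances for a single-compartment Hodgkin-Huxley-style model. The densities and single-channel conductance are invented round numbers chosen to land on the classic squid-axon parameters; nothing here comes from the papers above.

```python
import numpy as np

# Hypothetical step: measured channel densities -> maximal conductances
# -> a conductance-based neuron model.
density = {"Nav": 60.0, "Kv": 18.0}  # channels per um^2 (assumed)
gamma_pS = 20.0                      # single-channel conductance in pS (assumed)

# Unit conversion: 1 pS/um^2 = 0.1 mS/cm^2.
g_na = density["Nav"] * gamma_pS * 0.1   # -> 120 mS/cm^2
g_k = density["Kv"] * gamma_pS * 0.1     # -> 36 mS/cm^2
g_l, e_na, e_k, e_l, c_m = 0.3, 50.0, -77.0, -54.4, 1.0  # mS/cm^2, mV, uF/cm^2

def simulate(i_ext=10.0, t_ms=50.0, dt=0.01):
    """Current-clamp one compartment (uA/cm^2, ms); return the spike count."""
    v, m, h, n = -65.0, 0.05, 0.6, 0.32  # resting state
    spikes, above = 0, False
    for _ in range(int(t_ms / dt)):
        # Standard HH gating kinetics (rates in 1/ms, voltages in mV).
        am = 0.1 * (v + 40.0) / (1.0 - np.exp(-(v + 40.0) / 10.0))
        bm = 4.0 * np.exp(-(v + 65.0) / 18.0)
        ah = 0.07 * np.exp(-(v + 65.0) / 20.0)
        bh = 1.0 / (1.0 + np.exp(-(v + 35.0) / 10.0))
        an = 0.01 * (v + 55.0) / (1.0 - np.exp(-(v + 55.0) / 10.0))
        bn = 0.125 * np.exp(-(v + 65.0) / 80.0)
        m += dt * (am * (1.0 - m) - bm * m)
        h += dt * (ah * (1.0 - h) - bh * h)
        n += dt * (an * (1.0 - n) - bn * n)
        i_ion = (g_na * m**3 * h * (v - e_na)
                 + g_k * n**4 * (v - e_k)
                 + g_l * (v - e_l))
        v += dt * (i_ext - i_ion) / c_m
        if v > 0.0 and not above:  # count upward crossings of 0 mV
            spikes += 1
        above = v > 0.0
    return spikes

print("spikes in 50 ms at 10 uA/cm^2:", simulate())
```

The verification step would then be comparing this model's spiking to patch-clamp recordings from the same cell type, which is roughly the loop the gene-expression paper above is trying to close.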
[1] I am still really worried about biological learning rules; I don't think anyone understands those well enough that we could make a WBE of a mouse and have it memorize a maze or something. This is a drum I beat frequently, but this is not the time to go into the gory details, and honestly I should know more than I do before making such sweeping claims.
[2] MERFISH can measure a specific subset of genes optically. It requires multiple rounds of attaching and detaching fluorescent probes, but because it is optical it can be done in parallel with large-FOV microscopes. I am unsure if it can be combined with E11's PRISM, but if it could I think that would be super neat, and it should not add any time.
[3] As far as I know, nobody has used antibody staining to measure ion channel densities and create a corresponding, accurate biophysical model. If such a thing exists, this section is largely moot, but I would be really happy to read that paper.
[4] I'm not doing the "accuracy" metric justice in that sentence or this footnote. It breaks down into a few sub-problems: there is identifying which cell is which, and there is identifying which cells are connected. There are cells falsely being split apart, leaving fragments hanging around unassigned, and parts falsely being merged into the wrong cell. Bottom line: if you know how to do computer vision, you should work on this problem; it is important and cool!
[5] As said previously, expansion microscopy lets you get away with a microscope whose native resolution is not that high. With a 10x expansion factor, a native resolution of 100nm gives you an effective 10nm. The fruit fly brain was mapped with 4x4x40 nm voxels.
[6] It is often assumed that each neuron only uses one neurotransmitter; this is called Dale's Law, and it is not 100% accurate. It is unclear to me how important the second or third most common neurotransmitter is to a particular neuron or to the computation at large.
[7] I hesitate to put this here because it feels like a problem that will be solved by the normal computer industry well before it becomes a real issue for WBE, but it was mentioned as a serious problem, and exabytes of data are no joke.