Since the responses to my recent inquiry were positive, I've rolled up my sleeves and gotten started.  Special thanks to badger for eir comment in that thread, as it inspired the framework used here.  

My intent in the upcoming posts is to offer a practical overview of biological topics of both broad-scale importance and particular interest to the Less Wrong community.  This will by no means be exhaustive (else I’d be writing a textbook instead, or more likely, you’d be reading one); instead I am going to attempt to sketch what amounts to a map of several parts of the discipline – where they stand in relation to other fields, where we are in the progress of their development, and their boundaries and frontiers.  I’d like this to be a continually improving project as well, so I would very much welcome input on content relevance and clarity for any and all posts. 

I will list relevant/useful references for more in-depth reading at the end of each post.  The majority of in-text links will be used to provide a quick explanation of terms that may not be familiar or phenomena that may not be obvious.  If the terms are familiar to you, you probably do not need to worry about those links.  A significant minority of in-text links may or may not be purely for amusement.

It is a popular half-joke that biology is applied chemistry is applied physics is applied math.  While it’s certainly necessary to apply all the usual considerations for a chemical system to a biological system or problem, there are some overall complications and themes that specifically (though not uniquely) apply to biological problems, and it is useful to keep them in mind. 

1.  Biological processes are stochastic.

Cellular-scale chemistry is an event-dense environment, but the abundance of most reactants is generally quite low.  (Exceptions typically include oxygen, carbon dioxide, water, and small ions.)  Beyond the basic consideration of abundance, there are other layers of regulation that determine whether a given entity, usually a protein, can actually react at any given time, and complicated geometries involved that further decrease the frequency of a given reaction.  We therefore must consider the majority of reactions as discrete, and model them stochastically. 
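To make the discrete, stochastic picture concrete, here is a minimal sketch of the Gillespie direct method for a single molecular species that is produced at a constant rate and degraded in proportion to its copy number. The function name, the rate constants `k_prod` and `k_deg`, and the example values are illustrative assumptions, not parameters from any real pathway.

```python
import random


def gillespie_birth_death(k_prod, k_deg, n0, t_end, seed=0):
    """Simulate one species with constant production (propensity k_prod)
    and first-order degradation (propensity k_deg * n).

    Returns the event times and the copy number after each event.
    All parameter values are hypothetical and chosen for illustration.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod          # propensity of a production event
        a_deg = k_deg * n        # propensity of a degradation event
        a_total = a_prod + a_deg
        if a_total == 0:
            break                # nothing can happen any more
        # Waiting time to the next event is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which event fires, weighted by its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


# Example run with hypothetical rates; the long-run average copy number
# should hover around k_prod / k_deg, with visible random fluctuations.
times, counts = gillespie_birth_death(k_prod=10.0, k_deg=1.0, n0=0, t_end=50.0)
print(f"{len(times) - 1} reaction events, final copy number {counts[-1]}")
```

With only about ten molecules present on average, each single event changes the state by roughly ten percent, which is why a smooth, deterministic rate equation is a poor description at this scale.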

If we take a step up to the scale of cells in culture – to give a ballpark idea, we’re talking on the order of 10^6 to 10^9 cells/mL for yeast or bacteria – the overall flux of nutrients, waste, and other metabolites becomes much more stable across samples under uniform conditions.  However, even with relatively well-behaved cells in liquid culture, sample variability is such that it is necessary to take replicate samples for each condition when using these in an experiment. 

Taking another (large) step up the complexity hierarchy to consider multicellular organisms (and let’s make them genetically identical, as in laboratory strains of fruit flies or worms), we now have aggregates of individual stochastic processes, which themselves exhibit stochasticity at the organism level.  And, as you might expect, the same holds for bigger, more complicated, and genetically non-identical organisms. 

The functional implications of this behavior are:

  • If we wish to model biochemical processes on a molecular scale, we must account for stochastic behavior.
  • Careful statistical analysis is intrinsically necessary to studying biological systems at all levels. 
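As a small illustration of why replicates matter, the sketch below summarizes a set of replicate measurements by their mean, sample standard deviation, and standard error of the mean. The function name and the example readings are hypothetical; a real analysis would choose statistics suited to the experimental design.

```python
import math
import statistics


def summarize_replicates(values):
    """Return (mean, sample standard deviation, standard error of the mean)
    for replicate measurements of a single condition."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates to estimate variability")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample standard deviation (n - 1 denominator)
    sem = sd / math.sqrt(n)         # uncertainty of the mean shrinks with sqrt(n)
    return mean, sd, sem


# Hypothetical optical-density readings from three replicate cultures.
mean, sd, sem = summarize_replicates([0.42, 0.47, 0.39])
print(f"mean={mean:.3f}, sd={sd:.3f}, sem={sem:.3f}")
```

A single sample gives no estimate of the spread at all, which is why the function refuses fewer than two values.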

2. Biological systems are complex networks, and variable deconvolution is nontrivial. 

Stochastic processes aren’t especially difficult to model if you have access to accurate probability predictions for the set of possible events…  but getting those accurate predictions is a process that in biological systems is painstaking at best. 

If you’re trying to model a single node in a biochemical pathway – say, an enzyme catalyzing the cyclization of a linear cholesterol precursor - you have to consider the upstream and downstream reactions’ effect on that node, the interaction of your whole pathway with other biochemical pathways in the cell, and then the chemical environment of that cell, which could be as simple as a uniform flask of culture medium or as complex as a human tissue, which is within a human body, which… you get the idea.  (This is somewhat hyperbolic for purposes of illustration.  In reality, you can probably get away with assuming either a near-constant culture medium or some known dynamic cycle of states in a multicellular tissue, depending on the application you’re looking at.) 

In other words, there are a lot of variables, and almost none of them are fully independent.  It is therefore necessary to expend a great deal of time and resources to characterize these variables, relative to non-biological systems.  Even for our most-studied, favorite single-celled organisms, whose genomes we’ve sequenced and whose metabolisms we’ve begun to model, there are still huge blank areas in our lists of variable definitions. 

I’ll be discussing experimental paradigms and difficulties with specific systems in later posts, as well as the methods used to deal with them. 

3. Due to (1) and (2), modeling efforts are heavily limited by computing power. 

This is fairly self-explanatory, but one specific consequence is worth noting: because the branches of the system are so interconnected, events occur on a vast range of time scales.  This time scale diversity makes system-level modeling a very stiff endeavor: the fastest processes tend to dictate the simulation step size even when the slow behavior is what we care about. 
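One way to read "stiff" here is in the numerical sense. The sketch below applies two standard integration schemes to simple exponential decay, dy/dt = -k y. The explicit (forward) Euler scheme is only stable when the step size is below 2/k, so a single fast process forces tiny steps on the whole model; the implicit (backward) Euler scheme stays stable at any step size. The function names and rate values are illustrative assumptions.

```python
def explicit_euler_decay(k, y0, h, steps):
    """Forward Euler for dy/dt = -k * y: y_next = y + h * (-k * y)."""
    y = y0
    for _ in range(steps):
        y = y + h * (-k * y)
    return y


def implicit_euler_decay(k, y0, h, steps):
    """Backward Euler for dy/dt = -k * y: y_next = y / (1 + k * h)."""
    y = y0
    for _ in range(steps):
        y = y / (1.0 + k * h)
    return y


# Hypothetical fast process (k = 1000 per unit time) integrated with a step
# that would be perfectly reasonable for a slow process (k = 1).
k_fast, h = 1000.0, 0.01
print("explicit:", explicit_euler_decay(k_fast, 1.0, h, 10))   # grows in magnitude
print("implicit:", implicit_euler_decay(k_fast, 1.0, h, 10))   # decays toward zero
```

The true solution decays smoothly to zero, so the growing explicit result is purely a numerical artifact. Implicit schemes avoid it, but for large coupled systems each implicit step requires solving a system of equations, which is one place the computing cost comes from.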

I’ll go into some of the more successful and potentially successful modeling strategies for various systems in later posts. 

4. Information transfer is error-prone in biological systems. 

Genes are replicated by a sequential polymerization reaction that constructs a new strand of DNA using the old one as a template.  Each time a monomer is added to the new strand, there is a small* chance that the incorrect type of monomer will be added, and a smaller chance that the error will not be recognized and corrected by proofreading mechanisms.  Since genes are on the order of thousands of monomer units (‘bases’) long, in aggregate these mutations have a large chance of at least some occurrence over the course of the cell’s lifetime.  (Mutation probabilities differ by organism, by gene, and according to a host of other factors.) 
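The aggregation argument above can be sketched numerically. If each base independently escapes correction with probability p per replication, the chance of at least one surviving error in a gene of L bases over d replications is 1 - (1 - p)^(L·d). Independence and a uniform rate are simplifying assumptions, and the function name and example numbers below are hypothetical, not measured values.

```python
import math


def prob_at_least_one_error(per_base_error, gene_length, replications=1):
    """Probability of at least one uncorrected error, assuming each base
    is copied independently with the same error probability."""
    if not 0.0 <= per_base_error <= 1.0:
        raise ValueError("per_base_error must be a probability")
    if gene_length < 0 or replications < 0:
        raise ValueError("gene_length and replications must be non-negative")
    trials = gene_length * replications
    if trials == 0 or per_base_error == 0.0:
        return 0.0
    if per_base_error == 1.0:
        return 1.0
    # 1 - (1 - p)**trials, computed with log1p/expm1 so that very small
    # probabilities are not lost to floating-point rounding.
    return -math.expm1(trials * math.log1p(-per_base_error))


# Hypothetical numbers: a tiny per-base error chance still adds up
# over a long gene and many rounds of replication.
for d in (1, 1_000, 1_000_000):
    print(d, prob_at_least_one_error(1e-9, 3_000, d))
```

The point of the sketch is the scaling rather than the specific values: the per-base chance can be minuscule while the aggregate chance is not.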

Aside from this familiar form of error in the genetic code itself, abnormalities can also occur in how DNA is partitioned between new units in a dividing cell, and on a less permanent level, there are many phenomena that amount to ‘miscommunication’ between parts of a cell, or whole cells or organs. 

Functionally speaking, this rather patchy scheme of information fidelity gives rise to the phenomenon of evolution, a great many useful experimental methods oriented around inducing mutations, and the occasional thorn in the side of a researcher who has suddenly found eir cell line has lost a trait that it needed to have or gained one it didn’t.  On the broad scale, it also makes the study of biological systems something of a moving target, particularly in mutation-prone systems such as human pathogens. 

* This chance has actually been estimated for various situations, and tables of these estimates are used heavily in mapping evolution.  (NB: This inference is descriptive only; it is NOT predictive.) 

5. Biological processes are limited to life-sustaining conditions. 

…Or, in chemical engineering terms, it is not advisable to blow up your reactor. 

The kinds of chemistry we can convince cells to do for us are those that are not toxic to the cell, and that do not completely overwhelm the cell’s ability to handle its own biochemical needs.  There are ways to partially circumvent this – you can sometimes get away with slightly toxic products in an engineered metabolic pathway, or if you have to completely hijack some part of the cell’s essential machinery you can sometimes provide it with whatever it’s missing externally – but it’s a rule that can only be bent so far before you’ve got dead cells on your hands.  (And chances are, unless they were cancer cells, that’s not what you wanted.) 

Biological systems also exhibit a high degree of organization, allowing the partitioning of microenvironments necessary to support the full spectrum of biochemical reactions.  The maintenance of this organization is just as vital as temperature and pH homeostasis, and the avoidance of toxin buildup. 

6. Biological processes are transport-constrained. 

For reasons similar to those in (1), but more complex, chemical transport is a Big Deal in biology from the cellular level all the way to a clinical setting.  Due to (5), we can’t just put everything in a blender and assume perfect mixing (though I’m sure some people would be happy to try), so to a certain extent on a cellular level (depending on the complexity of your cells and what you’re trying to make them do), and to an all-consuming extent on an organism level, biological problems contain transport problems. 

The easiest illustrative example of this is to consider cancer, and conventional chemotherapy treatment.  You’ve got a patient with a tumor, and they’re receiving chemotherapy.  The chemotherapeutic chemicals are injected into the bloodstream, which they then ride through the body to the tumor, and get to work.  Except… since they took the scenic route getting there, they’ve also come into contact with a lot of otherwise healthy tissue, which they have also attacked, producing the host of nasty side effects that comes with chemo.  You could inject the drugs directly into the tumor or the tissue surrounding it, but then you’ve got to hope the drugs manage to diffuse far enough into the tumor to do some good despite the fact that they aren’t riding any blood vessels.  This is the sort of transport-focused engineering problem that is necessary to solve in some capacity for nearly all clinical applications. 
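To get a feel for why relying on diffusion alone is a gamble, here is a rough sketch based on the one-dimensional relation between mean squared displacement and time, <x^2> = 2·D·t, which gives a characteristic time t ≈ x^2 / (2·D) to diffuse a distance x. The function name and the diffusion coefficient are illustrative assumptions, and the estimate ignores binding, uptake, degradation, and fluid flow.

```python
def diffusion_time(distance_m, diffusion_coeff_m2_per_s):
    """Characteristic 1-D diffusion time t = x**2 / (2 * D), in seconds."""
    if distance_m < 0 or diffusion_coeff_m2_per_s <= 0:
        raise ValueError("distance must be >= 0 and D must be > 0")
    return distance_m ** 2 / (2.0 * diffusion_coeff_m2_per_s)


# Hypothetical diffusion coefficient for a small molecule in tissue.
D = 1e-10  # m^2/s, assumed for illustration only
for label, x in (("10 um", 1e-5), ("100 um", 1e-4), ("1 mm", 1e-3)):
    print(label, diffusion_time(x, D), "s")
```

Because the time grows with the square of the distance, a tenfold deeper target takes a hundred times longer to reach, which is why distance from the nearest blood vessel matters so much.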

Given these considerations, much of our most productive, ground-breaking research** in biology and bioengineering today is focused on:

  • Finding new ways to model systems of interest
  • Finding more efficient ways to update our existing models and understanding (more efficient variable characterization)
  • Designing streamlined, well-behaved systems based on our emerging understanding of how all these processes work individually and link together (‘plug and play’ biology)

**Reflects my engineering-slanted opinion on the future of biology, as well as that of most people who are doing bioinformatics and are excited about it, but could be open to contrary opinion.  ‘Fastest-developing’ would perhaps be a better description. 

 

Useful/interesting references consulted for this section:

Lehninger Principles of Biochemistry, 4th ed., by David L. Nelson and Michael M. Cox

Molecular Biology of the Cell, 4th ed., by Alberts, Johnson, Lewis, Raff, Roberts, and Walter

Receptors: Models for Binding, Trafficking, and Signaling, by Douglas A. Lauffenburger and Jennifer J. Linderman


Um... I may be a bit prejudiced, here, especially considering that this has been highly upvoted, but I have to admit I have two major problems with promoting this post:

1) It says "Constraining Anticipation" in the title, and after reading it, I cannot think of anything I expect to see happen.

2) More importantly, I don't feel I know anything about rationality which I didn't know earlier. I'm not sure LW should be in the business of fully general scientific intros.

This comment might have been more appropriate as a response to this request for expressions of interest or disinterest.

But since it wasn't made there, let's discuss the issues it raises.

I agree that LW should not be in the business of producing scientific intros. But we do have the precedent of the QM sequence. It was justified as relevant to various "demon-exorcising" (my term) tasks relevant to rationality; I don't dispute this justification. But surely an intro to systems biology (including topics of the simulation of biological systems) is relevant in a group which frequently discusses the prospects for "uploading" as a path to AI, brain scanning as a path to inferring preferences, and simulation as a technique for cryonic resuscitation.

After reading this posting, I find it very difficult to expect this kind of simulated biology to be possible soon, and I also find it less easy to anticipate that synthetic life research will deliver anything particularly interesting in terms of nanotechnology soon. In other words, my anticipations have been constrained by this excellent article.

[anonymous]

But surely an intro to systems biology (including topics of the simulation of biological systems) is relevant in a group which frequently discusses the prospects for "uploading" as a path to AI, brain scanning as a path to inferring preferences, and simulation as a technique for cryonic resuscitation.

Indeed; it's quite relevant, yet a superficial glance at the site doesn't show much sign it's been touched on at all -- biology does not seem to be a common field of expertise or even interest here (though that is merely an informal impression on my part); even the interest taken in it around intelligence, cryonics, uploading and brain scanning appears to be of secondary importance to the interest in those topics themselves.

Your reaction to this post is not a comfortable one for the bulk of users here to contemplate, I suspect. If cryonics turned out to be overrated in the LW consensus, or brain uploading turned out to be infeasible (somewhat different from "impossible in principle"), it would dramatically constrain many of the scenarios and ideas that a large number of LW users are hoping to benefit from and/or contribute to.

It's nearly impossible to discuss those topics meaningfully, however, without delving into the specifics of biology -- and that apparent dearth of specific information may cause some users to overinflate (or underinflate!) their priors about various relevant issues.

biology does not seem to be a common field of expertise or even interest here

Evolution, evolutionary psychology, cognitive science (i.e. neurology), and diet seem to be significant interests. That is, I agree with you that it's quite relevant and that we don't seem to have much expertise here, but I think we have the interest.

[anonymous]

nods True enough where those are concerned -- I may be expressing this unclearly.

What I mean to indicate is that there's some obvious interest in the bits of biology that bump up against rationality philosophy and this community's aggregate, extrapolated desirable "FOOM" scenarios, but it comes across as very topical interest. Like focusing on the pretty flower, considering pretty flowers relevant to their interests, but ignoring the branch, and only dimly interested in the tree as the thing-that-holds-flowers.

Biology as its own field isn't going to be of interest to everybody, of course, but it's troubling to see that a lot of the discussion about biology that goes on around here seems, well, incomplete and backward. Relying heavily on pop-science and "celebrity" biologists to speak for the entire field (much of Eliezer's writing about the topic), reasoning from first principles about stuff that's too embedded in context for the resulting, logically-valid ideas to apply soundly to real biology, and a tendency to oversimplify the subject matter or just ignore relevant bits, either because of a lack of knowledge or a very limited one -- and a tendency to reason forward from there, leading to what seem like GIGO issues in the resulting model.

(This is a long-winded way of agreeing with your statement that there's interest but no expertise; I just think the distinction's important enough to make between superficial interest in obvious, surface-level attractors and a deeper focus on the body of knowledge giving rise to them. It doesn't fly here to talk like this about economics, logic, or philosophy -- if your knowledge is that limited, you'll be directed to the sequences.)

On the title - the idea was, for this post specifically, to sketch the general principles that define both the space of reasonable approaches and likely outcomes in biological problems. I do think I did an underwhelming job demonstrating that link, and if that is what you mean or close to it, then I agree with you and will take it as a reminder to work on cohesion/full clarity of purpose in future posts. (If it's not, I invite further clarification.)

As for whether it's appropriate for LW... well, since I have a fairly good idea of what I'm going to write on the subject in the future, I think it is, because I intend to keep it targeted and relevant to issues the community has interest in - offering either another angle from which to consider them, or more background information from which to evaluate them, or ideally both. As I've said before, I've no desire to write a textbook, and there's plenty of other places on the internet we could go if we wanted to read the equivalent of one.

However, if you don't think that is enough to be relevant here, I would very much like to hear what, if anything, would make such a set of posts relevant to you (not trying to shift the reference frame - I mean relevant to you in the context of LW). The large positive response I received previously and in this post indicates to me that it's worth continuing in some form.

As for whether it's appropriate for LW... well, since I have a fairly good idea of what I'm going to write on the subject in the future, I think it is, because I intend to keep it targeted and relevant to issues the community has interest in - offering either another angle from which to consider them, or more background information from which to evaluate them, or ideally both. As I've said before, I've no desire to write a textbook, and there's plenty of other places on the internet we could go if we wanted to read the equivalent of one.

To me it seems sufficiently relevant for a front page post. Just not a promoted front page post. :)

I agree. I didn't actually expect it to get promoted, since it doesn't fit the pattern of things I've seen on the very front. I'll show how new I am here and ask, though - Eliezer's comment read like he had been presented with some expectation that this be promoted. Is that because posts that get upvoted this far typically (or always) are?

Since I didn't ask, or state that I thought it should be, it seemed a bit out-of-the-blue, which did then and is still causing me to try to figure out whether his objection was only to the idea of promotion, or if he objected to promotion because he thought it shouldn't be here at all.

Nice article, just a minor quibble:

Unlike what the title and headers imply, this article seems to be much more about biochemistry than biology - is that because my perception of the place of biochemistry in biology is flawed? Is it because this is a building block for later posts that will broach other subjects in biology?

It's a foundation - it's easiest to illustrate the patterns I'm describing on a molecular/cellular level, but they apply across the board. My current intent for the actual series is to start with a group of posts on molecular/cellular systems, both because a basic understanding of genetics and metabolism is extremely useful to understanding everything else, and because it's the area I'm most familiar with.

However, recognizing that about half the interest expressed in the suggestions thread was for topics above the molecular level, I'm trying to figure out how to do some posts on them earlier without making things disjointed/difficult to follow. I might settle for weaving in short bits about how molecular topics will apply to macroscale ones later.

Unlike what the title and headers imply, this article seems to be much more about biochemistry than biology

This is definitely not Biology 101.

I can't really argue with that. I've been going back and forth with myself over whether I should call it something different. Suggestions?

I'm afraid I'm not sure what you like to call stuff within your field. But if I was going with the university subject metaphor and pulling something out of thin air it'd be:

BIO253: Modelling Cellular Systems

A second year Bio subject with a prereq of BIO101 and two semesters of maths and stats. :)

(Note: If I was actually within the field I expect I would cringe at the inaccuracy.)

Hah, no, that does sound like a real course title, although usually they call it "cellular engineering" to sucker in more people who would be turned off by an explicit mention of math in the title.

(I kid. Mostly.)

It is only a small subset of what I want to cover, though. I shall continue to think on it.

How about "LW Biology 101 Introduction: Bases of Biochemistry"?

I guess it depends on what you're going to talk about in the rest of the sequence.

This is great. I'm looking forward to future posts. My background is in economics, and now I'm intrigued whether your thoughts on modeling biological systems might have some relevance for economic systems. Keep up the good work.

Cyan

I really like this article, but I have a background in the subject material. How do folks who don't have prior familiarity with the subject matter find the inferential distance?

I'm a physicist who took a biology class once, and I found it easy to follow.

I'm hoping that I'll be able to keep the posts within the realm of reasonable understanding for most people on this site by focusing on principles, patterns, and analogies to other fields; however, if at any point I'm failing to do so, I will ardently welcome that being pointed out.

The assumptions I made when constructing my tentative post outline were that readers here were likely to have some general scientific background, and at least a high school level of chemistry. I recognize that the latter might not be a good assumption.

(If you, or anyone else has suggestions at any point on how to improve the usefulness of these posts for those without a background in related fields, please let me know!)

I haven't done much biology since my high school days (apart from reading gnxp and the occasional article on the 'net / Wikipedia, and hanging out with biology students in university), and didn't find it too hard to follow.

I noticed the texts cited at the bottom (in particular Molecular Biology of the Cell, as that is something I am currently reading and enjoying).

Do you have any particular texts on biology you might want to recommend?

Yes! Thank you for linking that thread; I hadn't seen it.

Slight criticism: I would have liked even more examples - particularly of the ground-breaking research at the end.

I'm compelled to link to these posters on biochemistry. I find it amusing to browse them (although I am not an expert by any means).

Interesting!

Reflects my engineering-slanted opinion on the future of biology

I have a non-engineering-slanted preference for the future of biology; this is all quite scary. Given the direction that biology is going, dangerous things will soon be widely accessible. FAI is hard, but at least AI is too. As a member of the field, what are your opinions on this?

If you mean my opinion on whether it's worth being afraid of - I don't think it is. Any powerful new technology/capability should be implemented with caution and an eye to anticipating risk, but I don't view bioengineering in a different capacity than any other scientific frontier in terms of risk.

On a practical level, the oversight on manipulation of organisms beyond your run-of-the-mill, single-celled lab workhorses (bacteria, yeast) is massive. In the not-too-distant past, it was an uphill climb just to be able to do genetic engineering research at all.

I got a lot of questions about 'bacteria FOOM,' if you will, around the time the synthetic bacterium paper came out. The short version of my answer then is worth repeating - if we want to make super-germs or other nasty things, nature/Azathoth does it quite well already (ebola, smallpox, plague, HIV...). Beyond that, this sort of research is exceptionally time- and resource-consuming; the funding bottleneck reduces the chances of the lone mad scientist creating a monster essentially to nil. Beyond even that, putting some DNA in a cell is not hard, but designing an idealized, intelligent organism on the level of strong AI is at least as hard as just designing the AI.

So my stance is one of.... let's call it exuberant caution. Or possibly cautious exuberance. Probably both.

I have updated based on this evidence.

One follow up question:

On a practical level, the oversight on manipulation of organisms beyond your run-of-the-mill, single-celled lab workhorses (bacteria, yeast) is massive.

Is this sort of thing not changing?

To the best of my knowledge - and that deserves a disclaimer, since I'm a grad student in science and not yet completely versed in the legal gymnastics - it is changing, but any loosening of policy restrictions only comes with exceptional evidence that current norms are grossly unnecessary. In a general sense, bioengineering and tech started out immersed in a climate of fear and overblown, Crichton-esque 'what-if' scenarios with little or no basis in fact, and that climate is slowly receding to more informed levels of caution.

Policy also assuredly changes in the other direction as new frontiers are reached, to account for increased abilities of researchers to manipulate these systems.

Thanks for the reply.