[ Question ]

What's holding back outsourcing to cloud labs?

by ChristianKl · 1 min read · 28th Oct 2020 · 5 comments


World Modeling

I would have expected Emerald Cloud Lab or similar competitors to grow a lot and be successful over the last five years. As far as I know, Emerald Cloud Lab has had only modest growth, and no competitors have grown strongly. Outsourcing to cloud labs seems to give a laboratory benefits of scale and virtualization that drive down costs, and it seems easier than working in a wet lab. Is there something holding back this trend that I'm not seeing? Alternatively, what's going on?


1 Answer

Disclaimer: I do not work in a lab, and never have beyond a short stint as a research assistant in undergrad.

That being said, I can think of several reasons. In no particular order:

  • Habit: researchers are used to what they have been doing, and do not want to change.
  • Control: some researchers are hands-on scientists who like to roll up their sleeves.
  • Secrecy: if a cloud lab does the experiment, then a cloud lab has the data.
  • Not Free: why spend money on a cloud lab when they have already-paid-for equipment and interns in their lab?
  • Training: most of the actual routine work is done by grad students; this is a critical part of their training as scientists. If experiments are outsourced, how will they learn to use the equipment and to design experiments of their own?
  • Cognitive burden: another tool chain to remember? It's not even a Python script!
  • Publication bias: I have read several accounts of papers being rejected because they used the wrong code in their analysis; the reviewers preferred R or Python. Do any journals accept a description of a proprietary software workflow from a single company in the methods section?
  • Experimental design: things have improved from a replication standpoint since the replication crisis, but I don't see much movement on the bias towards novel results. The scalability argument Emerald Cloud Lab is making doesn't appeal as much if the goal of an experimenter is to design the most novel possible experiment.
  • Inadequate discovery equilibrium: this is essentially another facet of the previous point, but researchers may assume that, because the experiments are so easy to replicate and scale, their efforts will not be sufficiently rewarded, even if they can think of good experiments to run.
  • Too few non-academic researchers: business investment in R&D has plummeted from its previous levels, as most corporations moved to shift investment into shorter-payoff projects. They are likely not even evaluating this kind of product anymore.
  • Competition from computers: a significant chunk of the big data/machine learning revolution is going into producing better models and simulations; this directly competes with the repeatability and scalability pitch that cloud labs are making. Come to think of it, the best use might be validating or building a model or simulation.
4 · ChristianKl · 1mo
My expectation is that nothing coming out of big data/machine learning models at the moment is going to be trusted directly; it needs to be verified in an actual experiment. Do you believe differently?
2 · ryan_b · 1mo
Only slightly, and that is a matter of emphasis. In my view the crux of the matter is that the relationship between modelling and the traditional lab is very similar to the relationship between a cloud lab and the traditional lab; both add value by improving scale and repetition.

Weighing against my point, it does appear to me that the areas where modelling is emphasized the most are areas where experiments are very difficult or impossible, like nuclear fusion or climate science. I do not see anywhere on Emerald Cloud Lab's website claims that they offer experiments which cannot be achieved in a traditional lab. This leads me to suspect that the feedback loop between modelling and the traditional lab is better than that between a cloud lab and a traditional lab, because in spite of the similar value-add pitch, it remains the case that the cloud lab is primarily a substitute for the traditional lab, and modelling is primarily a complement.

Another detail I thought of: we remain stuck very much in the mode of hypothesis->experiment->data being a package deal. If it became popular to disentangle them, like through likelihood functions [https://arbital.com/p/likelihoods_not_pvalues/] or through one of the compression paradigms [https://www.lesswrong.com/posts/hAvGi9YAPZAnnjZNY/prediction-compression-transcript-1], then bulk data generation becomes independently valuable and it would make a lot of sense to run lots of permutations of the same basic experiment, without even a specific hypothesis in mind.
2 · ChristianKl · 1mo
Yes, it's more about being able to do experiments more efficiently than about making new kinds of experiments. The problem with modeling is that the modeling results are not the real world. If you care about which molecule binds to which protein, you can model a lot of different reactions to find good candidates to validate in real experiments. The cloud lab actually gives you the real experiment.
1 comment

Possibly-relevant subquestion: how do grantmakers feel about grantees using cloud labs?