Hello, Less Wrong!

This seems like a community with a relatively high density of people who have worked in labs, so I'm posting here.

I recently finished the first draft of something I'm calling "The Hapless Undergraduate's Guide to Research" (HUGR). (Yes, "HUGS" would be a good acronym, but "science" isn't specific enough.) Not sure if it will ever be released, or what the final format will be, but I'll need more things to put in it whatever happens.

Basically, this is meant to be an ever-growing collection of mistakes that new researchers (grad or undergrad) have made while working in labs. Hundreds of thousands of students around the English-speaking world do lab work, and based on my own experiences in a neuroscience lab, it seems like things can easily go wrong, especially when rookie researchers are involved. There's nothing wrong with making mistakes, but it would be nice to have a source of information around that people (especially students) might read, and which might help them watch out for some of the problems with the biggest pain-to-ease-of-avoidance ratios.

Since my experience is specifically in neuroscience, and even more specifically in "phone screening and research and data entry", I'd like to draw from a broad collection of perspectives. And, come to think of it, there's no reason to limit this to research assistants--all scientists, from CS to anthropology, are welcome!

So--what are some science mistakes you have made? What should you have done to prevent them, in terms of "simple habits/heuristics other people can apply"? Feel free to mention mistakes from other people that you've seen, as long as you're not naming names in a damaging way. Thanks for any help you can provide!

 

And here are a couple of examples of mistakes I've gathered so far:

--Research done with elderly subjects. On a snowy day, the sidewalk froze, so subjects couldn't be screened for a day, because no one thought to salt the sidewalks in advance. Lots of scheduling chaos.

--Data entry being done for papers with certain characteristics. Research assistants and the principal investigator were not on the same page regarding which data was worth collecting. Each paper had to be read 7 or 8 times by the time all was said and done, and constructing the database took six extra weeks.

--A research assistant clamped a special glass tube too tight, broke it, and found that replacements would take weeks to come in... well, there may not be much of a lesson in that, but maybe knowing that equipment is hard to replace could subconsciously induce more care.


One key meta-mistake you see a LOT in computational biology is people not seeking out the proper expertise they need. I and countless other people have wasted months reinventing existing tools because I had no idea they existed, which in turn was because there were no experienced researchers around me with the relevant expertise to tell me.

Indeed! I found this to be an extremely helpful resource w/r/t seeking out "meta-expertise":

http://faculty.chicagobooth.edu/jesse.shapiro/research/CodeAndData.pdf

Key quote: "Here is a good rule of thumb: If you are trying to solve a problem, and there are multi-billion-dollar firms whose entire business model depends on solving the same problem, and there are whole courses at your university devoted to how to solve that problem, you might want to figure out what the experts do and see if you can't learn something from it."

If I had only had this advice at the beginning of my PhD, I would have saved myself a lot of hassle....

Also, the above advice would suggest, for instance, that we should use SAP's ridiculous, bloated crapware to manage human resources, etc. Sometimes the multi-billion-dollar companies fail.

Well, "learn from it" and "use the crapware" can mean different things. I've found useful the rule of thumb that "someone else once had your problem and you should find out what they did, even if they failed to solve it".

A friend blew up his lab and put himself and another student in the hospital with flying-glass wounds. They had been flooding a glass vessel with gas under too much pressure, possibly with other complicating details, like the glass being made extra brittle by Joule-Thomson cooling.

General note: in industry, lengthy safety-engineering reviews are undertaken before anything is actually built, but research labs are basically like magic in HPMOR: deadly dangerous, and with few of the proper warning labels.

This applies to engineering labs, too, which sometimes do a better job of teaching the "hubris leading to downfall" trope than some literature departments. Our permanent staff knew everything inside and out, and worked there for decades with little incident. The undergraduates had a semester class to hammer in safety rules, and so my subsequent robot design project only left me with a tiny scar from when I stupidly assumed a rule wouldn't apply. The graduate students were expected to already know what they were doing, and during my senior year one lost a thumb when that expectation proved false.

Not measuring twice before I cut (machining), directly after my adviser told me this.

Before that happened:

I'm not sure if I can count this as a mistake, since it was mostly his fault, but a month or so after I started, my adviser came up to me and asked why I had been drawing all these plans instead of using the specs. No one had told me that someone else had already designed the laser I was supposed to be building.

Wasting time fixing a problem rather than starting afresh. In biology/genetics, rather than just ordering in new primers (oligos) and reagents (Taq, nucleotides, etc.), I wasted a few weeks trying to find out which specific component was faulty. None of the components are very expensive, while time tends to be, especially on a scholarship.

I once crashed the scanning tip of a scanning electron microscope into the sample when my attention wandered for a few seconds while I was adjusting the focus. The lab techs had made it very, very clear to me beforehand that I was never to let the tip and sample get less than a few centimeters (I forget the exact value, but it was specified) apart, because the scanning tip was very expensive and fragile. My moment of inattention ended up costing the lab $10,000, and me any possible friendships with the lab staff.

One lesson is, "Be careful!" but that is tough to actually put into practice. It's precisely when you're not being careful that you need the advice the most. A more actionable piece of advice might be, "Regard scanning and thinking as two separate tasks. Plan out where you're going to scan, then stop thinking. Then scan. Then think again. Do not think and scan at the same time."

The 'scanning tip' of an SEM? Do you mean the pole piece? It's not at all a 'scanning tip' in the same sense as an AFM or STM. Like, there's no reason for an SEM to get closer than a few millimeters from the sample.

A crash like that could only happen if you're moving the stage... were you trying to focus by moving the piece around instead of adjusting the electron optics?

I guess every SEM I've used has had very specific, easy-to-follow instructions on how to avoid crashing.

Yes, my mistake, it was indeed the pole piece. Not something that's supposed to be in close proximity like with an AFM. If I had broken an AFM tip it would've been less of a problem, because those are expected to wear out every so often.

It was a few years ago, but I remember that we were doing e-beam lithography, and that did make it necessary to move the stage around. I think the idea was that our circuit was pre-drawn using software, after which we could just put the diagram into the SEM computer and it would scan around and draw the pattern we wanted. But in order to set this up, it was necessary to precisely locate the initial position of the stage in (x, y, z) so that our pattern would be drawn at the correct location on the silicon. And this meant we had to actually move the stage around, instead of just using the optics to focus on different parts. And due to things like differences in the wafer housing thickness, and other users who had moved the stage, that included moving it up and down.

ETA: All this was done before turning on the electron beam itself, since that would've started burning up the resist. The initial setup was done using a low-power optical microscope inside the SEM.

Upvoted especially for noticing that "Be careful!" is unhelpfully vague, and bothering to think of a usably specific piece of advice.

The best technique I use for "being careful" is to imagine the ways something could go wrong (e.g., my fingers slip and I drop something, I trip on my feet/cord/stairs, I get distracted for second, etc.). By imagining the specific ways something can go wrong, I feel much less likely to make a mistake.

In the HUGR, I've included the advice "learn the sad stories of your lab as soon as possible" -- the most painful mistakes that lab members, past and present, have made in the course of their work. It's helpful as a specific "ways things can go wrong" list.


Here are a few that I've seen happen time and time again in psychology labs - the main solutions are related to paying specific attention to data management issues that are not typically part of research psychology instruction (but should be).

Failing to keep redundant copies of original data. Even if automatic backup procedures are in place, it is important to make sure they are doing what you think they are doing.
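
A minimal sketch of what that verification can look like, assuming Python and entirely hypothetical directory names; it compares checksums of each original file against its backup copy:

    import hashlib
    from pathlib import Path

    def sha256(path):
        # Hash the file in chunks so large data files don't exhaust memory.
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(8192), b""):
                h.update(chunk)
        return h.hexdigest()

    def verify_backup(original_dir, backup_dir):
        # Report files that the backup is missing or has silently corrupted.
        for original in Path(original_dir).rglob("*"):
            if not original.is_file():
                continue
            backup = Path(backup_dir) / original.relative_to(original_dir)
            if not backup.exists():
                print("MISSING:", backup)
            elif sha256(original) != sha256(backup):
                print("DIFFERS:", backup)

    verify_backup("data/raw", "/mnt/backup/data/raw")  # hypothetical paths

Running something like this occasionally is much cheaper than discovering months later that the "backup" was an empty directory.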

Failing to document original data so that someone else can understand it. I don't know how many files I've come across with experimental conditions labelled "A,B,C" or similar, that can no longer be reconstructed. Including data generated by my past self, I'm sorry to say.
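
One cheap habit that guards against this: write a small machine-readable codebook next to the data file on the day you create it. A sketch, with entirely made-up condition labels and filenames:

    import json

    # The point is that the A/B/C mapping lives next to the data file,
    # not in the memory of whoever ran the sessions.
    codebook = {
        "file": "experiment1_results.csv",
        "conditions": {
            "A": "control, no feedback",
            "B": "immediate feedback",
            "C": "feedback delayed 24 hours",
        },
        "units": {"reaction_time": "milliseconds"},
    }

    with open("experiment1_results.codebook.json", "w") as f:
        json.dump(codebook, f, indent=2)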

Failing to retain a clear and replicable sequence from original data to "results". If an error creeps in, much harder to figure out where it came from.
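
A minimal sketch of the "replicable sequence" idea in Python (column and file names are hypothetical): keep the entire path from original data to results in one script, so that when a number looks wrong you can point to the step that produced it.

    import pandas as pd

    # Step 1: read the untouched original; never edit this file by hand.
    raw = pd.read_csv("data/raw/experiment1_results.csv")

    # Step 2: every cleaning decision is a named, inspectable step.
    cleaned = raw.dropna(subset=["reaction_time"])
    cleaned = cleaned[cleaned["reaction_time"] < 5000]  # exclusion rule lives here

    # Step 3: results are regenerated from scratch on every run.
    summary = cleaned.groupby("condition")["reaction_time"].agg(["mean", "sem"])
    summary.to_csv("results/summary.csv")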

Not taking account of multiplicity.

Ideally you should plan exactly how you're going to analyse the data before the experiment, but in reality students muddle through a bit.

Analyzing data in multiple ways is a big no-no if you're just hunting for that elusive 0.05 p-value to get it published.

It's stupid and causes statisticians to tear their hair out (both the arbitrary requirement a lot of journals set and the bad stats by researchers), but it's the reality in a lot of research.

Doing that can be compensated for as long as you keep track of what you tried and make that data available.

It's even worse because often people, including experienced professors, delude themselves with bad stats and waste time and money chasing statistical phantoms because they went significance mining.
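
For readers who haven't met the term: "multiplicity" means that if you run many tests at the 0.05 level, some will come up "significant" by chance alone. A sketch of the bluntest standard correction (Bonferroni), with made-up p-values:

    # With 20 independent tests at alpha = 0.05 you expect about one false
    # positive even when no effect is real. Bonferroni divides alpha by the
    # number of tests; it is conservative but honest.
    p_values = [0.003, 0.04, 0.20, 0.01, 0.65]  # made-up examples
    alpha = 0.05
    corrected = alpha / len(p_values)

    for i, p in enumerate(p_values):
        verdict = "significant" if p < corrected else "not significant"
        print(f"test {i}: p = {p:.3f} -> {verdict} (corrected alpha = {corrected:.3f})")

Less conservative procedures exist (Holm, Benjamini-Hochberg), but any of them beats silently testing twenty things and reporting the one that "worked".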

Here's one thing from when I did undergraduate research:

I expected grad students to know what they were doing and practice safety. I worked in a lab that had a powerful UV light that we kept off to the side under a cardboard box. The light was used to accelerate the curing of UV-activated glue, and it was powerful enough to give your eyes the equivalent of a sunburn if you even glanced at it. When we used the light, we made sure to announce its use and cover it up before plugging it back in. One student in our lab moved the lamp to the middle of the room and turned it on right in front of my face as I was walking through the room. I said "I saw that" and he shook it off as if it were nothing. It was definitely not nothing. Within about 8 hours I was essentially blind, because my eyes had become far more sensitive to light, and I spent about two days that way. I don't recall discussing this with the professor I was working under, but it did help me decide to stop working for that professor. I should have discussed it with them. Thankfully, there was no lasting damage to my vision. At this point I use all the PPE available and am a stickler when it comes to safety. Worse, my work is now entirely computational!

Organic Chemistry lab --

Label everything, especially when two subsequent steps of your reaction look very similar.

If you're going to leave something stirring overnight, make sure there's a backup power supply, especially if your area has a history of power failures.

Not mine, but -- If the temperature of your oil bath seems to be going up much more slowly than usual, check that the thermometer is working properly. Don't just turn the heat up until the thermometer reads the value you expect. One of the people in my lab managed to cook his compound at 280 °C because the tip of the thermometer was slightly above the surface of the oil bath.


Things take longer than you would expect.

On average, twice as long.

Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.

This doesn't always apply. It can, for example, leave you with an hour to kill at a train station, because you decided it would be really embarrassing to show up late for your ride to a CFAR workshop because of the planning fallacy.

Yes, it doesn't apply when the time is blocked off in advance and can't be reclaimed if it takes less time than you expect.

After reading this, I became incapable of giving finite time estimates for anything. :/

CS grad student here. Some mistakes I made:

  • Not documenting code.
  • Naming figures/datafiles poorly, so that you have no idea what they are in two weeks. It's best to have an automated way of labeling files if you'll be creating a lot of them (a sketch follows this list).
  • Storing data in an inefficient way (very bad if you're generating large amounts of data).
  • Not using version control.
  • Diving right into implementing an algorithm without first thinking about whether that's the best way to solve the problem.
  • Being intimidated by tasks that looked difficult (they were rarely as hard as I thought they would be).
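
On the automated-labeling point above, a minimal sketch; the parameter names are hypothetical, but the idea is that a filename built from the run's parameters and a timestamp still describes itself two weeks later:

    from datetime import datetime

    def output_name(prefix, params, ext="png"):
        # Encode the parameters and the creation time directly in the filename.
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        tag = "_".join(f"{k}{v}" for k, v in sorted(params.items()))
        return f"{prefix}_{tag}_{stamp}.{ext}"

    print(output_name("convergence", {"lr": 0.01, "layers": 4}))
    # e.g. convergence_layers4_lr0.01_20140312-153045.png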

Using idioms when talking to my advisor.

For example, in my language there is an idiom, "somebody broke the bank," which means that we've run out of something. So one day I told my advisor that I couldn't start a new culture (some eukaryotic cells) because somebody broke the bank, meaning that there were no samples left in the big freezer. And she understood it as: somebody had managed to literally break the liquid-nitrogen cell bank. That was a loooong day for me.

Oh, and when your supervisor tells you to do something the fast way, by omitting some safety step: don't. A friend of mine got poisoned with benzoic acid this way.

I had it right in the actual program where it mattered, but when I was giving my talk on it, I was asked about the time-step in a Bortz-Kalos-Lebowitz Monte Carlo simulation, and I said it was the inverse of the occurrence rate of the selected event instead of the inverse of the sum of the occurrence rates for all events.

To be more concrete - if there were 100 things that could happen, and each one would happen on average every 5 seconds, the BKL algorithm would pick one and then advance time by 5/100 seconds. I said it would advance time by 5 seconds.

For some reason, people took me at my word and concluded that the algorithm was prone to occasionally having nothing at all happen for extended periods of time (which is precisely what it's made not to do), not to mention scaling improperly with system size.
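
For the curious, here is a minimal sketch of the bookkeeping being described, with made-up rates; the key line is that time advances by a draw whose mean is the inverse of the *sum* of the rates, not the inverse of the chosen event's rate:

    import math
    import random

    # 100 possible events, each occurring on average every 5 seconds,
    # i.e. each with rate 1/5 per second (the example above).
    rates = [1.0 / 5.0] * 100
    total_rate = sum(rates)

    t = 0.0
    for _ in range(10):
        # Pick an event with probability proportional to its rate.
        event = random.choices(range(len(rates)), weights=rates)[0]
        # Advance time by an exponential draw with mean 1/total_rate
        # (about 5/100 s here) -- NOT by 1/rates[event] (5 s).
        t += -math.log(1.0 - random.random()) / total_rate
        # ...apply the chosen event and update the rates here...
    print(f"time after 10 events: {t:.3f} s")

Because every step applies some event, the algorithm never sits idle, which is exactly the property the audience mistakenly concluded it lacked.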

Advice from a math grad student.

Before you invest substantial time in a research project, make triply sure nobody has solved the problem already. Often there's an easy but time-consuming way to check this: if there's a paper that such a result is guaranteed to cite (such as the paper in which the problem was first posed), then read the abstracts of all 100+ papers citing it on Google Scholar or something. If there are a few such guaranteed citations, do this for all of them. If something looks vaguely related (most of the papers won't be), then skim through the whole paper.

I've been bitten by this 1.5 times. The first time was my very first research project in graduate school; I assumed that the professor posing the problem had already performed this check. Once we had actually proven a result, we discovered that the same result had been proven 14 years earlier. Moreover, a few years after that, the same author had solved the extension we were going to pursue next.

The second time around, I read a Wikipedia article which stated that no improvements to an upper bound had been found. This was semi-legitimate, in that you had to put together two existing results in a trivial way to see that a better upper bound already existed. By the time I realized this, we were already at the writing-a-paper stage; fortunately, our upper bound was better than the existing one, so we were saved.

I won't be able to respond individually to everyone, but thank you all for your contributions! If anything else comes to mind, please leave more quotes -- I'll check back periodically.


I once spilled lactic acid because I just assumed it flowed as fast as water. It would have been a simple mistake, but I had specifically obtained that bottle from my twin sister, who had prepared a similar mounting medium with it AND warned me to use it sparingly because buying it takes half a workday, ..., and I had listened and promised to be careful. I could have had her show me how to do it, but she was time-pressed and irritable because her supervisor had demanded she present her thesis before the institute (something like a preliminary hearing). Looking back now, I realize that asking for her help then would have taken only five minutes, and there were literally no people who would have been more willing and able to go out of their way to do it.

I think asking yourself 'does it really cost so much to approach this person for advice?' is a good heuristic to have.

It's not quite what you are asking for, but speaking of mistakes in labs I recommend http://pipeline.corante.com/archives/things_i_wont_work_with/ :-)

http://www.buzzfeed.com/kmallikarjuna/how-to-science-as-told-by-17-overly-honest-scientists

I'm not sure how much of this is mistakes, exactly, but it's definitely about sub-optimal experiments. Sometimes way sub-optimal.

Drawback: there is no way to verify this material.