Epistemic Status

Preregistering an experiment, not very confident in the design.

Because it is important to decide how the data will be analyzed before collecting it, I'm posting my planned experiment here. I may modify the design based on feedback (I'll post an update if I do). Then I will run the experiment and post the results when it's finished.

All feedback is appreciated!

Introduction and Goals

I often suffer from fatigue in the early evening that leads to anxiety later at night, which makes it difficult to sleep. The purpose of this experiment is to determine whether eating a small amount of chocolate to counteract fatigue in the early evening affects my sleep, and if so, how. I could easily imagine the effect on sleep being either beneficial (counteract the pattern of fatigue and anxiety that makes it hard to sleep) or harmful (the theobromine might keep me awake).

Because I'd rather not add the fat and sugar that come with chocolate to my diet, the effect on sleep needs to be appreciable for it to be worthwhile. I have fairly arbitrarily decided that the chocolate is worth it iff it causes me to fall asleep at least 15 minutes earlier than I otherwise would.

Experimental Protocol

Every work day evening when I go directly home from work and do not have a social event planned, I will get home, make any necessary adjustments to the air conditioning (temperature affects my sleep a lot, and I want to make sure I don't accidentally bias the results by setting the AC differently depending on whether I have chocolate), and then with probability 1/2 (decided by a coin flip or die roll) eat a square of dark chocolate.

I will use a spreadsheet to track which nights I did and did not eat a square of chocolate, and I will use sleep data from a Fitbit Blaze smartwatch to record what time I fall asleep each night. The Blaze uses heart rate and movement data to decide what time I fall asleep (I don't need to actively tell it when I go to bed). In my experience, the time it says I fell asleep generally matches my subjective memory of when I fell asleep (not when I went to bed).

Statistical Analysis

I care about the magnitude of the effect, not just the direction, and my goal is to make the correct decision rather than to get published, so testing whether the observed effect is statistically significantly different from 0 is not very useful. While it would be a mistake to keep going until I get the result I want, if the results are close (in either direction), it will be worth gathering more data.

Because I have arbitrarily picked an absolute effect size of -15 minutes (I fall asleep 15 minutes sooner than I otherwise would), I will be focusing on determining which side of that cut-off the effect is. To do so, I plan on analyzing the data as follows:

1. Run the experiment until I have used up one bag of chocolate (15 individually-wrapped squares).
2. Calculate the standard error of the difference of sample means (I am not assuming equal variance in the two samples).
3. If the observed effect is at least one standard error away from the -15 minute cut-off (my sleep is inconsistent enough that I expect this condition probably won't be met at this point), stop the experiment.
4. Otherwise, keep going until either the stop condition from step 3 is met or the standard error is less than 5 minutes. If I end with a standard error less than 5 minutes and the observed effect is within 5 minutes of -15 minutes, I will consider the experiment inconclusive.
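
The stopping rule in steps 2 through 4 can be sketched in Python as follows. This is only an illustration under stated assumptions: sleep onset is recorded as minutes after some fixed evening reference time (so that times past midnight stay comparable), and the names `welch_se` and `decide` are hypothetical, not part of any existing tool. The standard error uses the unequal-variance form, as step 2 specifies.

```python
import math
from statistics import mean, variance

CUTOFF_MIN = -15.0    # from the post: worth it only if sleep onset is >= 15 min earlier
SE_TARGET_MIN = 5.0   # from step 4: stop once the standard error drops below this

def welch_se(a, b):
    """Standard error of the difference of sample means, unequal variances."""
    return math.sqrt(variance(a) / len(a) + variance(b) / len(b))

def decide(choc, control):
    """Apply steps 2-4 to sleep-onset times (in minutes) for each kind of night.

    Assumption: times are minutes after a fixed reference, e.g. 9 pm.
    Returns (observed effect, standard error, verdict).
    """
    effect = mean(choc) - mean(control)  # negative means earlier sleep with chocolate
    se = welch_se(choc, control)
    if abs(effect - CUTOFF_MIN) >= se:
        # Step 3: at least one standard error away from the cut-off.
        verdict = "stop: worth it" if effect <= CUTOFF_MIN else "stop: not worth it"
    elif se < SE_TARGET_MIN:
        # Step 4: precise enough, but still too close to the cut-off to call.
        verdict = "stop: inconclusive"
    else:
        verdict = "continue"
    return effect, se, verdict
```

The function does not enforce step 1; it is meant to be called for the first time only after the first bag of 15 squares is used up, and again as more nights are added.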

8 comments

I might suggest doing all of the chocolate days in a row, since it might be the case that consistently eating chocolate before bed has an effect that inconsistently eating chocolate does not have.

Then again, that might open the experiment up to a skew -- maybe it will be rainy for a week that's contained entirely in the no-chocolate timezone and that will throw off the results.

I welcome any thoughts on this.

Edit (after a conversation below): By the way, I love that you've thought to a) do this experiment and b) pre-register it publicly! When I saw this in my feed I thought "aha, that person is being a good scientist".

There are a lot of mechanisms one could hypothesize and test if there is, indeed, any significant effect to explain/refine. Doing an initial simple experiment to show that something is here and it's worth diving deeper (to figure out timing, dosage, multi-day effects, etc.) is the right thing, and I applaud the poster for writing down their plan before starting.

This comment reads a bit like you think I was attacking the poster. Did I come off that way?

Edit: In particular, my comment was a response to:

All feedback is appreciated!

and it's possible you saw in my comment the phrase:

[...] and that will throw off the results.

but didn't notice this (or didn't parse it the way I intended it to be parsed):

maybe it will be rainy for a week that's contained entirely in the no-chocolate timezone and that will throw off the results.

Yes, I did take this a bit too harshly. Maybe not an attack, but a criticism/objection that I feared would be discouraging more than helpful. I may have over-indexed on it a bit because it's something I am working on for myself, and I may have projected it onto you. My natural tendency is to focus on complexity and difficulty that will have to be resolved, rather than supporting and reinforcing the good parts, and I apologize if I misattributed your suggestion.

I doubly apologize that I did exactly the thing to you that I feared you were doing to OP. Your suggestion that consistency might be important for the effect is a good one, and so is the warning that losing the randomization may let more outside correlations creep in.

No worries! I think I also need to work on my tone, as I sometimes point out tiny little details that I think could be improved without pointing out my overall positive feelings. I've done this pretty extremely here, so I'm going to go back and edit my original comment so that it more accurately reflects how I feel. Thanks!

Hi, did anything come out of this experiment?

You already have lots of Fitbit sleep data, right?

You should look at that data first and use it to guide the experiment. Should you study other metrics of sleep quality?

In particular, you should determine the standard deviation in onset times and do a power analysis to see how long the experiment has to run. I guess there's a problem that you might not have labels indicating which days were straight-home-from-work days. You should try to remember which ones were. Even without that, a simple filter like Mondays might be a good choice.
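
As a rough illustration of that calculation (not a full power analysis), the number of nights needed to reach the post's stopping threshold of a 5-minute standard error can be estimated from the historical standard deviation. The sketch below assumes both kinds of night have the same standard deviation and the same number of nights, so the standard error of the difference is sd * sqrt(2 / n). The function name and the 30-minute figure in the usage note are hypothetical, not values from the post.

```python
import math

def nights_per_arm(sd_min, se_target_min=5.0):
    """Smallest n per arm with sd * sqrt(2 / n) strictly below the target.

    Assumes equal standard deviation and equal n for chocolate and
    no-chocolate nights (an illustrative simplification).
    """
    return math.floor(2 * sd_min ** 2 / se_target_min ** 2) + 1
```

For example, a hypothetical standard deviation of 30 minutes in sleep onset gives `nights_per_arm(30)`, which is 73 nights of each kind, far more than one bag of 15 squares.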

This is pretty interesting. Any outcome you can share? (I'll bug you about this next time I see you in person so you can just tell me then rather than responding, if you'd like)

Good idea to just use the time you fall asleep rather than the sleep stage tracking, which isn't very accurate. I think the most interesting metric is just boring old total sleep time (unfortunately sleep trackers in my experience are really bad at actually capturing sleep quality, but I suppose if there's a sleep quality score you have found useful, that might be interesting to look at also). Something else I've noticed is that by looking at the heart rate you can often get a more accurate idea of when you fell asleep and woke up.