A long, long time ago, towards the end of 2024, two friends and I ran a self-experiment on the impact of glycine on sleep quality. Life has been busy, but the time has finally come to publish the results. “Science is hard” continues to be the motto of this endeavor: the long and short of it is that we weren’t able to measure any effect.
I feel quite cautious about publishing biometric data, but if there is some cool science you want to do with it, email me and maybe we can figure something out.
Previously, on Science is Hard
The full experimental setup is described here. TLDR: three of us were on the same schedule of alternating runs of glycine (3g) and control, which we took before bed. Each day, we measured a set of quality of life (QoL) metrics.
The hypothesis was that glycine improves sleep quality, which we’d be able to capture with the QoL metrics. We did runs in order to see if there were any tolerance effects.
Data sanitation is a pain
Even though there were only three of us, getting the data into a nice format for analysis was surprisingly gnarly, for the following reasons:
We tracked “time to bed” on the morning of the next day, so that we could estimate when we actually fell asleep (rather than using the phone right before falling asleep, which wouldn’t even work). Other metrics, like “restedness in the morning”, we tracked in the moment. So any given day had a mix of datapoints for that day and the day before. Also, we sometimes went to bed past midnight, which essentially meant that I had to write a custom date resolver.
One of us traveled into a different timezone during the experiment, but the “time to bed” and “time out of bed” fields did not store a timezone.
We used different data tracking apps between ourselves, to reduce friction during capture (and increase the number of datapoints captured). Exports from these apps had to be transformed separately (despite my best efforts at standardization).
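To make the date problem concrete, here is a minimal sketch of what such a date resolver could look like. It is not the code used for the analysis: the function names and the noon cutoff are assumptions for illustration, and it assumes “time out of bed” is logged on the same morning as “time to bed”. Timezones are ignored entirely.

```python
from datetime import date, datetime, time, timedelta

# Assumption: any "time to bed" at or after noon means the evening BEFORE
# the log entry; anything earlier means we went to bed past midnight.
CUTOFF = time(12, 0)


def resolve_bedtime(logged_on: date, time_to_bed: time) -> datetime:
    """Turn a clock time recorded the next morning into a full datetime."""
    if time_to_bed >= CUTOFF:
        # Evening bedtime: belongs to the calendar day before the log entry.
        return datetime.combine(logged_on - timedelta(days=1), time_to_bed)
    # Past-midnight bedtime: same calendar day as the log entry.
    return datetime.combine(logged_on, time_to_bed)


def hours_slept(logged_on: date, time_to_bed: time, time_out_of_bed: time) -> float:
    """Hours between going to bed and getting up on the logging day."""
    went_to_bed = resolve_bedtime(logged_on, time_to_bed)
    got_up = datetime.combine(logged_on, time_out_of_bed)
    return (got_up - went_to_bed).total_seconds() / 3600


# Logged on the morning of 5 Nov: one bedtime before midnight, one after.
early = hours_slept(date(2024, 11, 5), time(23, 30), time(7, 0))
late = hours_slept(date(2024, 11, 5), time(0, 45), time(7, 15))
```

A fixed cutoff breaks for naps or for anyone going to bed around noon, so it is a convenience rather than a general solution.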
QoL metrics
First, let’s take a look at the QoL metrics themselves and see whether they make sense. Ignoring the impact of glycine on sleep, let’s just look at the impact of sleep on everything else:
Already, we see a few issues:
Even combining data from all three participants, the confidence intervals on correlation are pretty wide.
The QoL metrics, which were meant to serve as a proxy for sleep quality, mostly don’t robustly correlate with hours slept! The only exception is “Restedness upon waking”.
Looking at everyone’s data individually, we see that Errormargin’s data follows the “insomniac pattern”: productivity decreases with hours slept. Randomator’s data, OTOH, follows the “hulk pattern”: irritability increases with hours slept. Not very science, but fun.
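For a feel of why the intervals get so wide with this little data, here is a sketch of a Pearson correlation with an approximate 95% confidence interval via the Fisher z-transform. This is one standard approach, not necessarily how the intervals in the plots were computed, and the sample sizes in the usage lines are made up.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate CI for a correlation r from n pairs (needs n > 3, |r| < 1)."""
    z = math.atanh(r)             # Fisher z-transform
    se = 1 / math.sqrt(n - 3)     # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical sample sizes: the same r is far less certain with fewer days.
narrow = fisher_ci(0.2, 300)
wide = fisher_ci(0.2, 30)
```

The interval width depends on the sample size only through `1 / sqrt(n - 3)`, so halving it takes roughly four times as many days of data.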
If we look at the correlations between the metrics, we see that there isn’t much.
The strongest correlation is between how rested we felt and how productive we felt (0.34). The correlation between restedness upon waking and restedness throughout the day was 0.2, lower than I expected[1].
My biggest takeaway here is updating on how unreliable self-report metrics are. The following analysis will have to disregard everything except “restedness upon waking”, unfortunately. I’ll still show all the metrics[2].
What impact does glycine have on QoL metrics?
In analysing these data, I ran two experiments on myself. First, I didn’t initially reveal to myself which set of data was glycine and which one was control. Second, I initially grouped the data into two groups entirely randomly, to get a sense for how different the real results felt.
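Both tricks are cheap to set up. The sketch below shows one way to do it, with hypothetical function names and condition labels: the first function hides the real condition names behind neutral codes, the second shuffles the labels across days while keeping the group sizes.

```python
import random


def blind_labels(conditions, seed=None):
    """Replace real condition names with neutral codes (A, B, ...).

    Returns the blinded labels and the key, which you only look at
    after you have stared at the plots.
    """
    rng = random.Random(seed)
    names = sorted(set(conditions))
    codes = [chr(ord("A") + i) for i in range(len(names))]
    rng.shuffle(codes)
    key = dict(zip(names, codes))
    return [key[c] for c in conditions], key


def random_grouping(conditions, seed=None):
    """Shuffle the labels across days, keeping the group sizes intact."""
    rng = random.Random(seed)
    shuffled = list(conditions)
    rng.shuffle(shuffled)
    return shuffled


# Hypothetical schedule: runs of glycine and control days.
schedule = ["glycine"] * 4 + ["control"] * 4 + ["glycine"] * 4
blinded, key = blind_labels(schedule)
placebo_grouping = random_grouping(schedule, seed=1)
```

Plotting the metrics against `placebo_grouping` gives a baseline for how much apparent structure pure noise produces.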
Here is the data according to a random grouping:
Here is the data grouped by control and intervention. Can you guess which one is which?
Glycine is blue, but I think that even without going into the statistical tests it’s pretty clear that we weren’t able to measure much. I did run a t-test on the metrics, however.
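p<0.05 for rested_upon_waking! We did real science! Of course, this makes me wary: with several metrics tested at once, one of them crossing p<0.05 by chance alone would not be surprising. Also, I don’t want to trust a statistic if the data itself doesn’t look convincing.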
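One way to sanity-check a lone significant result is sketched below. Note that this is a permutation test on the difference of means, not the t-test used above, plus a Bonferroni-adjusted threshold. The ratings and the number of metrics in the usage lines are made up for illustration.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold when n_tests metrics are tested."""
    return alpha / n_tests


# Made-up 1-10 ratings, and a hypothetical count of five metrics.
glycine = [6, 7, 5, 8, 6, 7]
control = [5, 6, 5, 6, 4, 6]
p = permutation_p_value(glycine, control)
threshold = bonferroni_threshold(0.05, 5)
```

Under this correction a metric would only count as significant if `p < threshold`, which is a much higher bar than a single p<0.05.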
Takeaways
Self-report metrics suck: we struggle to tell through introspection alone why we feel the way we do.
We need to harmonize data collection or I’ll go insane the next time we do this.
Much more data needed, but maybe there’s something there directionally? I’m inclined to not give up on this, given how much success others have had with glycine. But I’m more pessimistic about glycine than I was starting out.
Hopefully we’ll run another study, with more data, in the near future.
Science™ insight: How rested you feel in the evening is not entirely determined by how rested you felt in the morning, but also by things you did throughout the day! Give me that Nobel prize. ↩︎
First, because it’s interesting to see what noisy data can look like once nice regression lines have been drawn through it. Second, because I’ve already made the plots and I want to publish before the end of the year. ↩︎