The brain replays memories stochastically during sleep, and replays them orders of magnitude faster than they were experienced. This "multi-epoch training" lets the brain learn much more from the environment, and replay can prioritise salient experiences.

Alice Wanderland (11mo)

Perhaps? I don't fully understand your point. Could you explain a bit more what I'm missing: how does accounting for sleep and memory replay bear on comparing pretraining dataset sizes between human brains and LLMs? At first glance, my understanding is that adding in sleep seconds would increase the training set size for humans by a third or more. I wanted my estimate to be conservative, so I didn't add in sleep seconds, but I'm sure there's a case for an adjustment that includes them.

Nathan Helm-Burger (11mo)

Yeah, I don't think it makes sense to add sleep if you are estimating "data points", since it's rehearsing remixes of the data from awake times.

On the other hand, if you are estimating "training steps", then it does make sense to count sleep. Just as you'd count additional passes over the same data.
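
As a rough sketch of the difference between the two counts (the 16/8 hour split and the 10x replay speedup are placeholder assumptions for illustration, not measured values, and the function names are made up here):

```python
SECONDS_PER_HOUR = 3600


def unique_data_seconds(waking_hours: float) -> float:
    """'Data points': only waking experience contributes new data."""
    return waking_hours * SECONDS_PER_HOUR


def training_step_seconds(
    waking_hours: float, sleep_hours: float, replay_speedup: float
) -> float:
    """'Training steps': live experience plus experience replayed during sleep.

    replay_speedup is a hypothetical factor for how much faster than
    real time memories are replayed.
    """
    live = waking_hours * SECONDS_PER_HOUR
    replayed = sleep_hours * SECONDS_PER_HOUR * replay_speedup
    return live + replayed


# Assumed daily split and an assumed (illustrative) replay speedup.
waking_hours, sleep_hours = 16, 8
replay_speedup = 10

data = unique_data_seconds(waking_hours)
steps = training_step_seconds(waking_hours, sleep_hours, replay_speedup)
effective_epochs = steps / data

print(f"unique data per day:    {data:,.0f} s of experience")
print(f"training steps per day: {steps:,.0f} s of experience processed")
print(f"effective epochs:       {effective_epochs:.1f}")
```

Under these made-up numbers the dataset size is unchanged by sleep, while the amount of training done on it is several times larger, which is the "additional passes over the same data" picture.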