Even just pairing up and running through these items for the 1-2 most important goals of the month might be quite a big boost. I would be up for trying that / organising something around that (UK time-zone).
I wanted to pick up on the point about heritability.
I've been struggling to explain to others that, despite self-control having a large genetic component (you say a heritability of about 60%), it's still possible and valuable to improve it significantly. Your analogy with strength training was a really useful framing. The heritability of BMI/strength is about the same as for self-control.
I guess the difference is that people don't join clubs specifically to train their self-control for 1h three times a week, with experts to guide them along the way and make sure the difficulty level and progression are right for them. It would be great if this existed, though. I played with the idea with a friend 11 years ago: we had a list of mostly pointless skills that we would train every day, for the sake of training our self-control. It probably wasn't the best format, and I have no idea if it was causal, but what followed was the most productive period of my life. We also ran a much watered-down version at some of the LessWrong London meetups a long while ago, with less (but still some) success.
It makes me want to try something similar again, because I think the benefits of improving self-control are huge. Even if 'all' you end up with is a solid model of your own motivation for a task and how it changes with situational changes (e.g. relative motivation of studying at home vs. the library), that would still be hugely valuable.
 https://onlinelibrary.wiley.com/doi/abs/10.1002/gepi.20308 "Additive genetic factors explained 81% of variation in height, 59% in body mass index and 50–60% in the strength measures ... a study of one million Swedish men"
Good idea - here you go. https://docs.google.com/spreadsheets/d/1CrdJ3KwWWsDomgRqwSOn96Ix58VHUz88YQhAfiGQuBI/edit?usp=sharing.
I've found micro-tracking of my time over a week-long period really beneficial. I don't think the same would be true for diet, as there are so many variables to look at; picking any one thing to look at would be privileging that hypothesis to an unreasonable degree. That said, if there is something specific you are looking for, where you would expect short-term fluctuations to be important, then I could see the value.
I suppose I would also see the value if you were running a trial on yourself. Toss a coin at the start of each week: if heads, do X every day that week; if tails, don't. Measure something you expect to change over such a short period of time (e.g. sleep, subjective energy levels).
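To show what I mean, here's a minimal sketch of how you might analyse such a trial, assuming one averaged outcome number per week (the data, function names and number of weeks are all made up for illustration). Because the coin was tossed per week, the week is the unit you should permute:

```python
import random
from statistics import mean

def mean_difference(treatment_weeks, control_weeks):
    """Difference in mean outcome between 'do X' weeks and 'don't' weeks."""
    return mean(treatment_weeks) - mean(control_weeks)

def permutation_p_value(treatment_weeks, control_weeks, n_iter=10_000, seed=0):
    """Two-sided permutation test at the week level (the unit that was randomised)."""
    rng = random.Random(seed)
    observed = abs(mean_difference(treatment_weeks, control_weeks))
    pooled = treatment_weeks + control_weeks  # new list; inputs are not mutated
    n_t = len(treatment_weeks)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_t]) - mean(pooled[n_t:]))
        if diff >= observed:
            hits += 1
    return hits / n_iter

# Hypothetical weekly averages of subjective energy (1-10 scale):
heads = [6.8, 7.1, 6.5, 7.0]  # weeks where the coin said "do X"
tails = [6.0, 6.2, 6.4, 5.9]  # weeks where it said "don't"
print(mean_difference(heads, tails))   # → 0.725
print(permutation_p_value(heads, tails))
```

With only a handful of weeks the test will have very little power, which is itself a useful thing to know before starting the experiment.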
I find Cochrane reviews to generally be of good quality, even if that means their findings are very often "we have reviewed all available data as of year X and were not able to draw any clear conclusions". Depending on your technical knowledge, it might be useful to point out that they generally include a "plain language summary".
It is important to note that not everyone agrees with their findings (the abstract of https://pubmed.ncbi.nlm.nih.gov/16052203/ is well worth reading, not just as a criticism of Cochrane but as a comment on the field of research in general). I suppose one could reasonably argue that combining a load of crap observational studies or small RCTs (with high drop-out / low protocol adherence) is not going to teach you very much, yet this is what systematic reviews in the field tend to do.
You do occasionally see nice articles like https://pubmed.ncbi.nlm.nih.gov/32144378/ which followed 142 people for 2 years. That's still not a lot of people, and not that long to draw conclusions that hold across many different lifestyles and cultures, the entire human lifespan, and the whole world.
See https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD002128.pub4/full for an example of a Cochrane review, and for a list of advice see https://nutrition.cochrane.org/evidence
Or see this almost comical example, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7427685/ "12,133 records identified, 30 studies met inclusion criteria ... despite the large number of trials included in the review ... evidence for primary prevention on clinical endpoints is limited to one large trial with methodological issues ... [hence] there is still uncertainty regarding the effects of a Mediterranean-style diet on clinical endpoints and cardiovascular disease (CVD)"
I also second people's point about examine.com - it's good.
Also, it's not really directly answering the question, but I want to rant. Calories-in must equal calories-out, but with so many caveats that it's not really a useful measure of anything.
I'm really interested in this too. I have a 1 year old and work in improving engineering education.
https://en.wikipedia.org/wiki/Philosophy_for_Children might be worth checking out.
I'm also a big fan of this; I have got huge mileage out of creating a single-page timeline of 1600-1800. I've got a few books lined up to create 1800-2000 and 1400-1800 versions, but they are unfortunately low on my priority list at the moment. I would highly recommend it - it's eye-opening to see what was happening in the world when the first academic journals were published. And 1600-1800 is such a fascinating time: the scientific and industrial revolutions, the Age of Enlightenment, the colonial empires and world trade.
The other thing I have found a lot of value in is reading through Cochrane/Campbell reviews (high-quality meta-studies with readable summaries). There is a summary list of some useful ones here (I can't remember who I got it from, but thanks, whoever you are!): https://docs.google.com/spreadsheets/d/19D8JUgf95t-f-oUAHqh8Nn2G90KO3gUiua9yAjBSSqI/edit?usp=sharing
Yes, we tracked time, but only in an aggregate way. Our list of work-tasks had very rough size estimates (XS, S, M, L, XL - each being about twice the size of the previous, and XL being just more than we could complete in a 2-week period). When we came to plan our 2 weeks of work we estimated hours using 'planning poker' (which is a bit like the Delphi method: blind estimates by each member of the team, followed by a brief discussion of the reasons for the differences, followed by one more round of blind estimates; then I [as team lead] had the final decision). At the end of the 2 weeks we would talk about each item, which sometimes involved a discussion of the amount of work relative to the estimate (either the initial size, e.g. 'S', or the hours, e.g. 4). In these discussions people would regularly refer back to previous tasks as reference points. We would always talk about our productivity (i.e. the total size of the tasks we completed, where XS=1, S=2, M=4, ...), but this was a balancing act - it would be easy to mess up incentives here.
We spent 4h every 2 weeks planning tasks, but that might involve a small discussion/argument over what should be part of each task, not just the estimation. We would also spend 2h at the end of the 2 weeks reflecting on 1) what we had made and how it impacted the development roadmap and 2) things to increase MPH (motivation, productivity, happiness - I was far too pleased with myself for coming up with that acronym :P ).
Individually, I thought a lot about how to increase development speed and estimation accuracy. That was at least a third of my role in the latter stages, the rest being split between planning the development roadmap and doing actual development.
For a 3h task, most of the time we would spend ~2min listening to one person describe what it was, then <1min for everyone to show their card with their estimate of the number of hours. Often that was in enough agreement that we wouldn't do anything extra. We did have a few cases where one person guessed 3h and another guessed 20h; that often resulted in a 10min discussion, as there was clearly a disagreement on how to do the task 'properly'.
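To make the sizing arithmetic above concrete, here's a tiny sketch in Python. The doubling point scale (XS=1, S=2, M=4, ...) is from my description above; the function names and the 2x-spread threshold for flagging a poker round as "needs discussion" are my own illustration, not something we formally used:

```python
# Doubling point scale: each size is about twice the previous one.
SIZE_POINTS = {"XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16}

def sprint_velocity(completed_sizes):
    """Total points completed in one 2-week period."""
    return sum(SIZE_POINTS[size] for size in completed_sizes)

def needs_discussion(hour_estimates, max_spread=2.0):
    """Flag a poker round where blind estimates disagree by more than max_spread x."""
    return max(hour_estimates) / min(hour_estimates) > max_spread

# Hypothetical sprint: two small tasks, a medium and a large one.
print(sprint_velocity(["S", "S", "M", "L"]))  # → 16

print(needs_discussion([3, 4, 3]))   # → False (close enough, move on)
print(needs_discussion([3, 20, 5]))  # → True  (clearly different task models)
```

The point of the geometric scale is that estimation error tends to grow with task size, so distinguishing 8 points from 9 would be false precision anyway.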
Thanks, I'll try to write up that post in the next couple of weeks.
In my old software dev team we got very good at estimating the time it would take to complete a single work-package (an item on the backlog), but those were at most a couple of days long. What we were not very good at was estimating longer-term progress; in that case we were in a startup, and I think that was unknowable given the speed at which we would change plans based on feedback.