A side question, prompted by an amusing factoid in the Hernan paper: "...we restricted the population to women who had reported plausible energy intakes (2510–14,640 kJ/d)".

In the statistical analysis in this paper, and also as a general practice in medical publications based on questionnaire data, are there adjustments for uncertainty in the questionnaire responses?

When you have a data point that says, for example, that person #12345 reports her caloric intake as 4,000 calories/day, do you take it as a hard precise number, or do you take it as an imprecise estimate with its own error which propagates into the model uncertainty, etc.?

Keyword is "measurement error." People think hard about this. Anders_H knows this paper in a lot more detail than I do, but I expect these particular authors to be careful.
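To make the "propagates into the model" part concrete, here is a toy simulation of what classical measurement error does to a regression slope. This is only a sketch under assumptions of my own (normally distributed true intake, independent additive reporting noise, a linear outcome), not the method used in the Hernan paper; the names `ols_slope` and `simulate_attenuation` are made up for illustration.

```python
import numpy as np


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    x_centered = x - x.mean()
    return float(x_centered @ (y - y.mean()) / (x_centered @ x_centered))


def simulate_attenuation(n=100_000, beta=2.0, sd_true=1.0, sd_report=1.0, seed=0):
    """Compare slopes fitted on true vs. noisily reported exposure.

    Assumption (hypothetical setup): reported = true + independent noise,
    i.e. the classical measurement error model.
    """
    rng = np.random.default_rng(seed)
    true_intake = rng.normal(0.0, sd_true, n)
    # The outcome depends on the true exposure, not on what was reported.
    outcome = beta * true_intake + rng.normal(0.0, 1.0, n)
    # The questionnaire answer is the truth plus reporting noise.
    reported_intake = true_intake + rng.normal(0.0, sd_report, n)
    # Reliability ratio: share of reported variance that is true signal.
    reliability = sd_true**2 / (sd_true**2 + sd_report**2)
    return {
        "slope_true": ols_slope(true_intake, outcome),
        "slope_reported": ols_slope(reported_intake, outcome),
        "slope_predicted": beta * reliability,
    }


result = simulate_attenuation()
print(result)
```

With equal signal and noise variances the reliability ratio is 0.5, so the slope fitted on the reported values should come out at roughly half the true one. Treating the questionnaire answer as a hard number does more than widen the error bars: it biases the estimate toward zero, which is why people correct for it explicitly.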

This issue is also related to "missing data." What you see might differ from the underlying truth in systematic ways, i.e. your data carry systematic bias, and you need to deal with that. This is also related to that causal inference stuff I keep going on about.

Open thread, Dec. 21 - Dec. 27, 2015

by MrMind, 21st Dec 2015


If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.