
# Really thorough statistical analysis of Anki (flashcard app) data

rpubs.com/rain8/1100036 It's a work in progress with only two steps finished. It is not exactly an add-on, because it's written in R rather than Python. So far the project does many little things, like finding bugs in a user's collection, describing the growth of the collection, and text mining. The ultimate goal is to use Anki as a continuous cognitive tester and let users learn about and optimize their memorization process. Instructions to run on your own data: github

I am not sure Anki data could really be used as a continuous cognitive health test. It would probably require removing many artifacts and other influences, and then finding an outside influence that definitely relates to cognition. Lit review.

Relevant quote from Dragonfired by J. Zachary Pike. "Brokers make money by knowing key information; they make fortunes by ensuring that other brokers remain unaware or unsure of the same information until after critical trades."

In ggplot (R statistical language) the defaults include a subtle grid and no axis lines. They also add extra padding between the data and the edges of the plot.

Here is some code in case someone else using R wants to try out things discussed here:

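```r
library(ggplot2)
# Draw axis lines and drop the padding that ggplot adds around the data.
# linewidth needs ggplot2 >= 3.4.0; use size on older versions.
ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point() +
  theme(axis.line.x = element_line(colour = "black", linewidth = 0),
        axis.line.y = element_line(colour = "black", linewidth = 1)) +
  scale_x_continuous(expand = c(0, 0), limits = c(0, 8)) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 36))
```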

Might be able to use Multi-Armed Bandit-like sampling for this, even? Hm…

Effects may lag, and may need time to build up to detectable levels. This is why Winters increased the length of each intervention until they lasted some weeks. If the placebo causes a different self-report rating, then it's a bad placebo and should be blinded out, but if it causes a psychological improvement then why not use it?

so non-X days will be more likely measured as being high in X-effect. But that'd mean that X days are more likely followed by non-X, which with random order is not the case.

Yes, but it will still make the effect size much smaller.

Could you elaborate on this a bit?

Lag and build-up are mentioned above. A training effect is when you get better at something just by doing it, so later interventions look better. At the same time there may be drift in the self-report: a slowly growing change can make the user think there is no change at all. For all these reasons, plot the time series with time on X and results on Y, and color each point by intervention or placebo. Do not connect the dots with lines, but do add a smooth loess-like line. You will be able to see some of these issues if they occur. Some more on all the issues.
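A minimal sketch of the plot described above, in ggplot2. The data are simulated purely for illustration: the column names (`day`, `condition`, `score`), the drift, and the effect size are all assumptions, not anyone's real results.

```r
library(ggplot2)

set.seed(1)  # simulated data, for illustration only
n <- 60
d <- data.frame(
  day       = seq_len(n),
  condition = sample(rep(c("intervention", "placebo"), each = n / 2))
)
# Hypothetical outcome: a slow upward drift plus a small intervention effect
d$score <- 5 + 0.02 * d$day + 0.5 * (d$condition == "intervention") + rnorm(n)

p <- ggplot(d, aes(day, score)) +
  geom_point(aes(colour = condition)) +   # points only, no geom_line()
  geom_smooth(method = "loess", formula = y ~ x,
              se = FALSE, colour = "grey40") +
  labs(x = "Day", y = "Self-reported score", colour = NULL)
print(p)
```

If the smooth line climbs steadily regardless of point color, that suggests a training effect or drift rather than an effect of the intervention.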

The more important an effect is, the stronger it usually is, so starting many of the experiments but running each for a short time might yield results much faster. It may be possible to overlap the non-blinded experiments and run many at the same time, with varying periodicity so that the same interventions do not always land on top of each other.
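One way to picture the overlapping schedule is to toggle each intervention in on/off blocks of a different length. This is only a sketch: the intervention names `A`, `B`, `C` and the block lengths are made up for illustration.

```r
# Hypothetical interventions, each switched on/off in blocks of a different length
block_len <- c(A = 2, B = 3, C = 5)   # days per block (assumed values)
days <- 1:60

# TRUE = intervention active on that day; one column per intervention
schedule <- sapply(block_len, function(len) ((days - 1) %/% len) %% 2 == 0)
rownames(schedule) <- days

# Count how often each on/off combination occurs (e.g. "101" = A on, B off, C on)
combos <- table(apply(schedule, 1, function(r) paste(as.integer(r), collapse = "")))
print(combos)

# Correlation between the schedules; values near 0 mean little overlap bias
print(round(cor(schedule + 0), 2))
```

Checking the combination counts before starting shows whether any two interventions would end up confounded with each other.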

Your statistical method is similar to a two-sample t-test, right? That does not account for several possible issues with time series, such as dependence between data points of one variable. Lag and training effects are examples. So be sure to control for all other possible independent variables, and plot the data as a timeline. When you do, do not connect the data points with lines!
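A rough sketch of how one might check for these problems in base R, again on simulated data. The variable names and the simulated trend are assumptions, and adding a linear time term is just one simple adjustment, not a full time-series model.

```r
set.seed(2)  # simulated data, for illustration only
n <- 80
day <- seq_len(n)
treated <- sample(rep(c(0, 1), each = n / 2))        # randomized order
score <- 5 + 0.03 * day + 0.5 * treated + rnorm(n)   # hypothetical training trend

# Plain two-sample comparison: ignores time entirely
tt <- t.test(score ~ factor(treated))

# Regression with a linear time trend as a covariate
fit <- lm(score ~ treated + day)

# Check the residuals for leftover serial dependence
lag1 <- acf(resid(fit), plot = FALSE)$acf[2]          # lag-1 autocorrelation
lb <- Box.test(resid(fit), lag = 5, type = "Ljung-Box")

print(tt$p.value)
print(coef(summary(fit)))
print(lag1)
print(lb$p.value)
```

A large lag-1 autocorrelation or a small Ljung-Box p-value would suggest the data points are not independent, in which case the t-test's standard errors are not trustworthy.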

In all experiments, I will be using the statistical method detailed here, code for it here, unless someone points out that I'm doing my statistics wrong.