Fri, 04/01/2011 - 17:01

Today on my schedule is "data analysis day", but a lot of it has actually been spent meeting with people. The data analysis part comes down to tracking down the source of a bad result from a linear model our team has been trying to run. We were concerned that the Hauck-Donner effect, which I had never heard of before, might be happening to us, but it turned out to be something simpler: a new variable that had been added to the model is NA except when the dependent variable is equal to "No". So when we added the new variable and ran na.omit(), every row that survived had the same outcome, and the data became impossible to model.
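
For anyone who wants to see the failure in miniature, here is a rough sketch in plain Python rather than the R we actually used. The rows and values are made up for illustration, `None` stands in for NA, and `drop_incomplete` is a hypothetical stand-in for what na.omit() does, not our real code.

```python
# Toy data: (outcome, new_variable). None plays the role of R's NA.
# The new variable is only recorded when the outcome is "No".
rows = [
    ("No", 1.2),
    ("No", 0.7),
    ("Yes", None),
    ("Yes", None),
    ("No", 2.1),
    ("Yes", None),
]


def drop_incomplete(rows):
    """Keep only rows with no missing values (roughly what na.omit() does)."""
    return [r for r in rows if all(v is not None for v in r)]


def outcome_counts(rows):
    """Count how many rows fall under each outcome value."""
    counts = {}
    for outcome, _ in rows:
        counts[outcome] = counts.get(outcome, 0) + 1
    return counts


before = outcome_counts(rows)
after = outcome_counts(drop_incomplete(rows))

print("before:", before)
print("after: ", after)
if len(after) < 2:
    # With a single outcome class there is nothing left to discriminate.
    print("Only one outcome value left; the model cannot be fit.")
```

Tabulating the outcome before and after dropping incomplete rows is a cheap check that would have pointed us at the new variable much sooner.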

An update on Tuesday's post: the cleanup operation isn't actually run after the page is done being served. MediaWiki doesn't support that, though I thought it did. Instead it's run at the end of the process, and on rare occasions some poor user might have to wait for it to sift through everything. Wiki administrators have the option to run it every night from cron instead, but I think it makes sense to provide a default way of doing it that doesn't require extra installation steps.

