
[ODP] Do exercise games increase cognitive ability? A reanalysis of Stanmore et al. (2017)

Stanmore et al. (2017) meta-analyzed 17 studies examining whether electronic games involving exercise increase cognitive ability, estimating the effect size at d = 0.44. The data from their study were reanalyzed, and the evidential basis for the claim was found to be much weaker than what the authors concluded. Taken as a whole, the literature is consistent with an effect size of 0, but it is not possible to be sure because the published studies used samples that were too small.

6 pages.

I plan to add more meta-analytic tests, but it is slow going because I could not find any good implementations, so I have to write my own. I'm building a set of related meta-analytic functions that can then be applied widely to different datasets with ease.

So far I have covered:

- forest plot
- funnel plot
- p-curve + binomial test
- test of insufficient variance
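For readers unfamiliar with the last item: the test of insufficient variance (TIVA, from Schimmack) checks whether the variance of the z-scores behind a set of significant p-values is implausibly small — without selection for significance it should be about 1. A minimal sketch of the idea in Python (my actual functions are in R; the function name here is illustrative):

```python
import numpy as np
from scipy.stats import norm, chi2

def tiva(p_values):
    """Test of Insufficient Variance: convert two-sided p-values to
    z-scores and test whether their variance is significantly below 1."""
    z = norm.isf(np.asarray(p_values) / 2)   # two-sided p -> z-score
    k = len(z)
    var_z = np.var(z, ddof=1)
    # under H0 (variance = 1), (k-1) * s^2 follows chi2 with k-1 df;
    # the left tail flags "too little" variance among the z-scores
    p = chi2.cdf((k - 1) * var_z, df=k - 1)
    return var_z, p
```

A cluster of just-significant p-values (e.g. all between .02 and .05) yields a variance far below 1 and a small TIVA p-value, suggesting selection.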

To be added:

- some more funnel functionality
- more forest functionality e.g. cumulative meta-analysis by year/effect size
- z-curve
- p-uniform
- R-index
- more?
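As a sketch of what cumulative meta-analysis by year amounts to — re-estimating the pooled effect after adding one study at a time in chronological order — here is a fixed-effect version in Python for illustration (the actual functions will be in R and dispatch on rma objects; the function name is hypothetical):

```python
import numpy as np

def cumulative_meta(yi, vi, years):
    """Fixed-effect cumulative meta-analysis: pooled estimate and SE
    after each study is added, ordered by publication year."""
    order = np.argsort(years)
    yi = np.asarray(yi, dtype=float)[order]
    vi = np.asarray(vi, dtype=float)[order]
    w = 1 / vi                                # inverse-variance weights
    cum_est = np.cumsum(w * yi) / np.cumsum(w)
    cum_se = np.sqrt(1 / np.cumsum(w))
    return cum_est, cum_se
```

A declining cumulative estimate over years is one informal sign of early inflated effects (e.g. the decline effect).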

These will be available under the meta_ prefix in my package. Once the functions are reasonably well-tested and stable (e.g. after a year of use), I will move them to their own package.

All of the above work automatically when applied to an rma object from the metafor package. As far as I know, metafor is the most commonly used meta-analytic package in R, and thus the prime target to build on top of.
This being my first review for ODP, it may be a little more formal than other posts. My apologies if I am inadvertently violating a convention here.

There are a lot of reasons to be concerned about the Stanmore meta-analysis. Unfortunately, the reasons presented here in this re-analysis do not touch on the major issues, and misstate some of the other issues. I will start with the major issues with the re-analysis, and then go on to finer points.

Major issues:
I think the major issue here is not with the data, which the authors do a great job reconstructing (as shown by the near-perfect match with the original meta-analytic results, to 2 decimal places). The problem is the analyses chosen and the interpretations thereof.

First, the funnel plot does NOT show a relationship between effect size and standard error (the correlation is non-significant).
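To make the check concrete: funnel-plot asymmetry is typically assessed by correlating effect sizes with their standard errors, or via Egger's regression (a non-zero intercept of the standardized effect on precision indicates asymmetry). A minimal Python sketch with made-up numbers, for illustration only:

```python
import numpy as np
from scipy.stats import pearsonr, linregress

# hypothetical effect sizes (d) and standard errors -- not the Stanmore data
d  = np.array([0.10, 0.30, 0.50, 0.20, 0.60, 0.40])
se = np.array([0.30, 0.25, 0.40, 0.15, 0.45, 0.20])

# naive check: correlation between effect size and standard error
r, p_corr = pearsonr(d, se)

# Egger's regression: intercept of (d / se) on (1 / se);
# an intercept significantly different from 0 suggests asymmetry
egger = linregress(1 / se, d / se)
```

In practice one would run this through metafor's regtest() on the fitted rma object rather than by hand.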
Second, the trim-and-fill: I am glad the authors discussed the issue of the different estimators, but I was saddened they did not go into further detail. The appropriate use of trim-and-fill is as follows (from Duval & Tweedie, 2000; also Duval, 2005): 1) ignore the Q0 estimator; 2) the L0 and R0 estimators can lead to different results and are appropriate in different situations. Unless there is a strong a priori reason for choosing one, run both. If they give the same answer, you are set. In this instance, they do not: the L0 estimator shows no evidence of publication bias, while the R0 estimator does AND imputes 11 missing studies (into the original 17). As this is substantially larger than the 0-25% imputation range the R0 estimator is intended for, the recommended practice is to use the results from the L0 estimator.
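For concreteness, the two estimators work roughly as follows — a single-pass Python sketch of the Duval & Tweedie L0 and R0 estimators (the full procedure iterates trimming and re-estimating, and metafor's trimfill() should be used in practice; the function name here is illustrative):

```python
import numpy as np

def trimfill_estimators(effects, center):
    """One-pass sketch of the Duval & Tweedie L0 and R0 estimators of
    the number of studies suppressed on one side of the funnel."""
    dev = np.asarray(effects, dtype=float) - center
    n = len(dev)
    ranks = np.argsort(np.argsort(np.abs(dev))) + 1   # ranks of |deviation|, 1..n
    t_n = ranks[dev > 0].sum()                        # Wilcoxon-type rank sum
    l0 = (4 * t_n - n * (n + 1)) / (2 * n - 1)
    # gamma*: length of the run of positive deviations among the most
    # extreme |deviations|; R0 = gamma* - 1
    signs_by_rank = np.sign(dev)[np.argsort(ranks)]
    gamma = 0
    for s in signs_by_rank[::-1]:
        if s <= 0:
            break
        gamma += 1
    r0 = gamma - 1
    return max(l0, 0.0), max(r0, 0)
```

With a roughly symmetric funnel both estimators return about 0; when the most extreme deviations are all on one side, R0 jumps quickly — which is why the two can disagree, as they do here.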

This is why the Egger and Schimmack tests also find no significant publication bias: there isn't any.

The p-curve is a nice addition, but it is incomplete. The statistical tests accompanying the p-curve must be reported (both the half and full p-curve tests). Furthermore, the statement "given the small sample size, this cannot be taken as strong evidence of a true effect" is not how p-curve analysis works either. If you wish to include the p-curve, please do so in a systematic way that discusses the properties and statistical results of the p-curve.
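For reference, the simplest of those accompanying tests is the binomial test on the share of significant p-values falling below .025: under a null of no effect the p-curve is flat, so about half the significant p-values should land below .025, while a right-skewed excess indicates evidential value. A minimal Python sketch with hypothetical p-values:

```python
from scipy.stats import binomtest

# hypothetical p-values from the included studies, for illustration only
p_values = [0.003, 0.010, 0.020, 0.040, 0.041]

sig = [p for p in p_values if p < 0.05]      # p-curve uses only p < .05
n_low = sum(p < 0.025 for p in sig)          # the "low" half of the curve
# under a flat (null) p-curve, each significant p has a 50% chance
# of falling below .025
result = binomtest(n_low, n=len(sig), p=0.5, alternative="greater")
```

The full p-curve app additionally reports the continuous (Stouffer) half and full tests, which are more powerful than this binomial version.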

With all of this in mind, the conclusion the authors draw is incorrect. Contrary to the first sentence of the discussion (p. 4), there are no signs of publication bias, and the claim about the p-curve's "evidential base" is neither reported in full nor backed up with statistical evidence.

To be honest, I have looked into each study in the Stanmore meta-analysis. The major problem with it is that none of the constituent studies uses any measure of intelligence. The closest some studies come is a small battery of executive function tests. When studies report "global cognition", in nearly every case the data being analyzed come from the Montreal Cognitive Assessment (MoCA), a screening tool for cognitive impairment and dementia scored on a 30-point scale (with 26-30 considered normal functioning). This is clearly not a meta-analysis of cognitive ability. I would recommend a deeper look into the constituent studies to make a proper counter-argument against the Stanmore results.

Minor issues:
The introduction is a little misleading. The Spitz work cited is about remediating mental retardation and all of its problems; it does not belong as a critique of training regimens in the elderly. A more appropriate critical work to cite would be the cognitive-training review by Simons et al. (2016).

The fadeout effect as described in Protzko (2015) occurs because the intervention group loses its gains, not because the control group catches up. Furthermore, as argued in both Protzko (2015) and Protzko (2016) regarding the fadeout effect, the impermanence of effects says nothing about the validity of the intervention effects.

The statement on p. 2 that starts "For whatever reason, the authors..." borders on a personal attack. I would suggest revising.

Duval, S., & Tweedie, R. (2000). Trim and fill: a simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics, 56(2), 455-463.
Simons, D. J., Boot, W. R., Charness, N., Gathercole, S. E., Chabris, C. F., Hambrick, D. Z., & Stine-Morrow, E. A. (2016). Do “brain-training” programs work?. Psychological Science in the Public Interest, 17(3), 103-186.

Thanks for the review. I will revise.
Withdrawn due to lack of time and less relevant now that more studies have come out reanalyzing these.