Friday, July 28, 2017

Perverse Incentives from the New "p < 0.005" Proposal

Here is a short addendum to my post of yesterday, commenting on why I think that the proposal recently mooted to require a p < 0.005 for "statistical significance" is misguided and likely problematic.

There is another problem with that (admittedly well-intentioned) proposal: widespread implementation would create a perverse incentive that could easily degrade the quality of the scientific literature by much more than the proposal improves it.

As the authors acknowledge, lowering the alpha needed to declare "significance" would require larger sample sizes, and so data collection would entail greater difficulty and expense. They estimate that samples would need to be 70% larger. That may well be true for studies of a single variable, though research that involves interaction effects would likely need a greater increase in the amount of data. Regardless, the manpower and expense would increase notably. (The authors see this as a potential benefit, in that "considerable resources would be saved by not performing future studies based on false premises.")
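To see roughly where a figure like 70% can come from, here is a minimal sketch. It assumes a two-sided z-test and 80% power (my assumptions for illustration, not necessarily the authors' exact calculation), under which the required sample size is proportional to the square of the sum of the two normal quantiles. The function name is hypothetical.

```python
from statistics import NormalDist


def relative_sample_size(alpha_new, alpha_old=0.05, power=0.80):
    """Ratio of required sample sizes when alpha changes, holding the
    effect size and power fixed (two-sided z-test approximation)."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    z_power = z(power)
    # Required n is proportional to (z_{1 - alpha/2} + z_{power})^2.
    n_old = (z(1 - alpha_old / 2) + z_power) ** 2
    n_new = (z(1 - alpha_new / 2) + z_power) ** 2
    return n_new / n_old


ratio = relative_sample_size(0.005)
print(f"Sample size multiplier: {ratio:.2f}")
```

Under these assumptions the multiplier comes out at about 1.7, in line with the authors' estimate. Designs with interaction effects are not covered by this approximation.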

However, this extra work and expense creates a perverse incentive for researchers. The more you pay (in time, effort, money, etc.), the more you want to recoup your costs by producing publishable research. The harder such results are to produce, the harder you will look for them, after you have sunk the costs of getting the data.

Now, this would not be a bad thing if there were no way to cheat unwittingly. (Let's ignore entirely the possibility of fraud.) But if it is still possible (as it is today) to unknowingly p-hack, traipse through the garden of forking paths, HARK (hypothesize after the results are known), etc., raising the cost of gathering a dataset will inevitably lead to a rise in p-hacked, forking-path, and HARKed results.
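A back-of-the-envelope sketch shows how little unwitting searching it takes to undo the stricter threshold. It assumes that each analysis path is an independent test of a true null hypothesis, which is a simplification (real forking paths on one dataset are correlated, so the actual inflation would differ). The function name is hypothetical.

```python
def chance_of_false_positive(alpha, n_paths):
    """Probability that at least one of n_paths independent tests of a
    true null hypothesis yields p < alpha."""
    return 1 - (1 - alpha) ** n_paths


# How the false-positive chance grows as more analysis paths are tried.
for n_paths in (1, 5, 10, 20):
    rate = chance_of_false_positive(0.005, n_paths)
    print(f"{n_paths:2d} paths: {rate:.3f}")
```

Under that independence assumption, trying about ten paths at p < 0.005 gives a false-positive chance of roughly 0.049, which is nearly the 0.05 the proposal set out to replace.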

The very change that was meant to improve the quality and integrity of the research literature will rather act to degrade it.

It seems to me that this underscores the fundamental need to address how incentives interact with methodologies and standards. Otherwise we are, at best, spitting in the wind, and at worst, using gasoline to put out the fire.
