Statistical significance testing and p-values: defending the indefensible? A discussion paper and position statement

Much statistical teaching and many research reports focus on the ‘null hypothesis significance test’. Yet the correct meaning and interpretation of statistical significance tests are elusive. Misinterpretations are both common and persistent, leading many to question whether significance tests should be used at all. While most critics take aim at the arbitrary declaration of p<0.05 as a threshold for determining ‘significance’, others extend the critique to suggest that the ‘p-value’ should be dispensed with entirely. P-values and significance tests are still widely used as if they give a measure of the size and importance of relationships, even though this misunderstanding has been observed and discussed for many years. We argue that they are intrinsically misleading. Point estimates of relationships and confidence intervals give direct information about the effect and the uncertainty of the estimate, without recourse to interpreting how a particular p-value might have arisen or indeed referring to p-values at all. In this paper we briefly outline some of the problems with significance testing, offer a number of examples selected from a recent issue of the International Journal of Nursing Studies and discuss some proposed responses to these problems. Our paper concludes by offering some guidance to authors reporting statistical tests in journals and presents a position statement that has been adopted by the International Journal of Nursing Studies to guide its authors in reporting the results of statistical analyses. While stopping short of calling for an outright ban on reporting p-values and significance tests, we urge authors (and journals) to place more emphasis on measures of effect and estimates of precision/uncertainty and, following the position of the American Statistical Association, emphasise that authors (and readers) should avoid using 0.05 or any other cut-off for a p-value as the basis for a decision about the meaningfulness/importance of an effect.
If point estimates and confidence intervals are used then the p-value may be redundant, and can be omitted from reports. When authors talk about ‘significance’ they need to be explicit when referring to statistical significance and we recommend authors adopt the language of ‘importance’ when talking about effect sizes.
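As an illustration of the reporting style recommended above, the following is a minimal Python sketch that reports a difference in means with a 95% confidence interval and no p-value. It is not taken from the paper: the function name `mean_difference_ci`, the data values and the use of a large-sample normal approximation are all assumptions made for the example. With samples this small, a t-based interval would usually be more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_difference_ci(a, b, level=0.95):
    """Return (estimate, lower, upper) for mean(a) - mean(b).

    Assumption: large-sample normal approximation with unpooled
    (Welch-style) standard error. Hypothetical helper, not from the paper.
    """
    estimate = mean(a) - mean(b)
    # Standard error of the difference between two independent means
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate, estimate - z * se, estimate + z * se


# Hypothetical outcome scores, invented purely for illustration
intervention = [4.1, 5.3, 6.0, 4.8, 5.5, 6.2, 5.0, 4.7]
control = [5.9, 6.4, 5.2, 6.8, 6.1, 5.7, 6.6, 6.0]

est, lower, upper = mean_difference_ci(intervention, control)
print(f"Mean difference: {est:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The printed line states the size of the effect and the precision of the estimate directly, leaving the reader to judge whether a difference of that magnitude is important.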

DOI: 10.1016/j.ijnurstu.2019.07.001

ISSN: 0020-7489

Griffiths, Peter

Needleman, Jack

Griffiths, Peter and Needleman, Jack (2019) Statistical significance testing and p-values: defending the indefensible? A discussion paper and position statement. International Journal of Nursing Studies. (doi:10.1016/j.ijnurstu.2019.07.001).

Record type: Article