# P values and effect size

In this issue, the Research Report entitled "Travel in Adverse
Winter Weather Conditions by Blind Pedestrians: Effect of Cane Tip
Design on Travel on Snow," by Kim, Wall Emerson (the author of this
Statistical Sidebar), and Gaves makes note of something called an effect
size. This concept is increasingly important for experimenters, writers,
and readers to understand. But before we talk about effect sizes, let us
take a step back and look at significance levels. Without getting into
the nitty-gritty of hypothesis testing, what any statistical test is
trying to determine is whether the data at hand are more plausibly
explained by chance alone or by some real underlying effect. It is
arguably extremely difficult to prove absolutely that one thing caused
another thing to happen, but we can test a hypothesis and ask how
surprising our data would be if nothing but chance were at work. In the
social sciences, we typically set the threshold for significance at .05,
which means that we accept a 5% risk of declaring an effect real when
chance alone produced it. How surprising the data are is expressed in
the reporting of a statistical test's significance level, or p value.
If the p value for a given statistical test is less than .05, we say
the result of the test is "statistically significant." A p value below
.05 means that, if there were no real effect, results at least as
extreme as those observed would turn up less than 5% of the time
through random variation alone. Of course, the results might be caused
by some consistent effect that the researcher has neglected to control,
but that is a complication best left for a future Statistical Sidebar.

For many years, authors have reported the results of statistical tests with the accompanying p value and left it at that. However, there is a growing call for authors in all fields to also report effect sizes because the significance level can often be misleading. The p value is a function of the mathematical calculations used to arrive at the result of the statistical test. As such, factors such as how big the sample is, the assumed distribution of the data, and the type of test being done can all affect the p value. Most notably, if everything else is kept the same (including some real difference, however small) but the number of observations or data points included in a statistical test increases, the p value will generally decrease. All things being equal, therefore, if you increase the amount of data being used to run a statistical test, you are more likely to have a statistically significant result. Since the p value is sensitive to sample size in this way, also reporting effect size can contribute to a better understanding of the result of a statistical test.
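The dependence of the p value on sample size can be sketched with a few lines of Python. This is only an illustration under stated assumptions: the mean difference and standard deviation are made-up numbers, the function name `two_sided_p` is hypothetical, and the p value comes from a normal (z) approximation with a known common standard deviation rather than from the exact t test an actual study would likely use.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(mean_diff, sd, n_per_group):
    """Approximate two-sided p value for a difference between two group means.

    Assumption: both groups have the same size and the same known standard
    deviation, so a normal (z) approximation is used instead of a t test.
    """
    standard_error = sd * sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical values: the observed difference and spread never change.
MEAN_DIFF = 2.0
SD = 10.0

# Only the number of observations per group changes from row to row.
for n in (20, 200, 2000):
    p = two_sided_p(MEAN_DIFF, SD, n)
    print(f"n per group = {n:5d}   p = {p:.4f}")
```

With these made-up numbers, the same 2-point difference is nowhere near significant with 20 observations per group, falls just under .05 with 200 per group, and is far below .05 with 2,000 per group.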

Effect size is a measure of how large an effect is revealed in the data being analyzed. Unlike the p value, effect size does not depend directly on sample size, so it does not change systematically when the number of observations in a statistical test increases. Although a p value indicates the statistical significance of a result, the effect size is indicative of the actual magnitude of the effect being tested. One way of looking at it is to ask whether the result of a statistical test reports a finding that is meaningful in the real world. A statistically significant finding might reflect only a tiny difference between two conditions. In such a case, the effect size would probably be small. That tiny difference between the two conditions, while leading to a statistically significant result, does not reflect a large effect on a practical level. Although we do still need to report p values to indicate whether results could plausibly be due to chance, we also need to report effect sizes so that we can correctly interpret those p values and assign any practical meaning to them.
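As a concrete example of an effect size, the sketch below computes Cohen's d, the difference between two group means divided by their pooled standard deviation. Cohen's d is an assumption on my part: it is one common effect size measure, not one this sidebar names, and the scores and the function name `cohens_d` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: standardized difference between two group means.

    The denominator is the pooled standard deviation, built from the
    sample variances (n - 1 denominators) of the two groups.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical scores for two conditions (not data from any study).
condition_a = [12, 15, 11, 14, 13, 16, 12, 15]
condition_b = [11, 14, 12, 13, 12, 15, 11, 14]

d = cohens_d(condition_a, condition_b)
print(f"Cohen's d = {d:.2f}")
```

Because d is expressed in standard deviation units, it describes how far apart the two conditions are rather than how much data was collected. Cohen's widely quoted rules of thumb treat values near 0.2 as small, 0.5 as medium, and 0.8 as large, although these are conventions and not fixed cutoffs.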

Robert Wall Emerson, Ph.D., consulting editor for research, Journal of Visual Impairment & Blindness, and professor, Department of Blindness and Low Vision Studies, Western Michigan University, 1903 West Michigan Avenue, Kalamazoo, MI 49008; e-mail: <robert.wall@wmich.edu>.



| Title Annotation: | Statistical Sidebar |
|---|---|
| Author: | Emerson, Robert Wall |
| Publication: | Journal of Visual Impairment & Blindness |
| Date: | Jan 1, 2016 |
