Statisticians issue warning over misuse of P values

Discussion in 'Science & Society' started by Plazma Inferno!, Mar 8, 2016.

  1. Plazma Inferno! Ding Ding Ding Ding Administrator

    P values are commonly used to test (and dismiss) a 'null hypothesis', which generally states that there is no difference between two groups, or that there is no correlation between a pair of characteristics. The smaller the P value, the less likely an observed set of values would occur by chance — assuming that the null hypothesis is true. A P value of 0.05 or less is generally taken to mean that a finding is statistically significant and warrants publication. But that is not necessarily true.
    Misuse of the P value — a common test for judging the strength of scientific evidence — is contributing to the number of research findings that cannot be reproduced, the American Statistical Association (ASA) warns in a statement released yesterday. The group has taken the unusual step of issuing principles to guide use of the P value, which it says cannot determine whether a hypothesis is true or whether results are important.
    This is the first time that the 177-year-old ASA has made explicit recommendations on such a foundational matter in statistics. The society’s members had become increasingly concerned that the P value was being misapplied in ways that cast doubt on statistics generally.
    In its statement, the ASA advises researchers to avoid drawing scientific conclusions or making policy decisions based on P values alone. Researchers should describe not only the data analyses that produced statistically significant results, the society says, but all statistical tests and choices made in calculations. Otherwise, results may seem falsely robust.
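A rough sketch of why reporting only the significant tests can make results look falsely robust. It assumes every null hypothesis is true and the tests are independent, which is a simplification; the function name and the 0.05 threshold default are illustrative choices, not something taken from the ASA statement.

```python
def familywise_false_positive_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of `num_tests` independent tests comes out
    'significant' at level `alpha` when every null hypothesis is true."""
    # Each test avoids a false positive with probability (1 - alpha);
    # all of them avoid one with probability (1 - alpha) ** num_tests.
    return 1 - (1 - alpha) ** num_tests


for n in (1, 5, 20):
    print(n, round(familywise_false_positive_rate(n), 3))
```

Under those assumptions, running 20 tests and reporting only the one that cleared 0.05 gives a "finding" well over half the time even when nothing is going on.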

    ASA's Paper:
  3. origin In a democracy you deserve the leaders you elect. Valued Senior Member

    Thanks Plazma! This headline scared the crap out of me since I use ANOVA, F-tests and t-tests (which all rely on the p-value) in analyzing experimental data. But actually, the report is pointing out that the p-value is still as useful as ever; there are just some very important 'watch outs' when using these analyses. Good article - I will circulate it around my engineering group.

    As we all know (or should), no experiment can prove your hypothesis; it can only support it or refute it.
  5. Plazma Inferno! Ding Ding Ding Ding Administrator

    Not Even Scientists Can Easily Explain P-values

    P-values, these widely used and commonly misapplied statistics, have been blamed for giving a veneer of legitimacy to dodgy study results, encouraging bad research practices and promoting false-positive study results.
    But the most fundamental problem with p-values is that no one can really say what they are.
    According to this article, most experts on meta-science could tell the technical definition of a p-value — the probability of getting results at least as extreme as the ones you observed, given that the null hypothesis is correct — but almost no one could translate that into something easy to understand.
  7. iceaura Valued Senior Member

    That's true of most statistical concepts. IMHO the problem is that the human brain is poorly set up to handle probability - in my experience, it's the single most difficult arena of human intellectual effort. And I don't think it's just me as a student or teacher - looking at others, we see that it takes the professionals years of high level study of probability to be able to handle stuff at a level that would be cheap limericks in the poetry field, or cartoons in the field of figurative drawing.

    My own attempts to clarify p values eventually settled on multiple specific examples and repetition over time. One example: say your hypothesis is that after long practice you have acquired the skill of flipping a coin so it lands heads. You test your hypothesis by flipping a fair coin, and you succeed in flipping a head five times in a row. You want to know whether that means you have acquired the skill of flipping a coin so it lands heads. Your p value is about .03 - pretty good! Publishable! But that does not mean you can conclude you have any such skill, with a .97 probability of being correct.
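For anyone who wants to check the .03 figure, here is a minimal sketch. It assumes the null hypothesis is "the coin is fair and you have no skill" and uses a one-sided test (results at least as extreme as five heads out of five); the function name is just an illustrative choice.

```python
from math import comb


def one_sided_p_value(heads: int, flips: int, p_null: float = 0.5) -> float:
    """Probability of seeing at least `heads` heads in `flips` flips
    if each flip lands heads with probability `p_null` (the null hypothesis)."""
    return sum(
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(heads, flips + 1)
    )


p = one_sided_p_value(heads=5, flips=5)
print(p)  # 0.03125, i.e. 1/32
```

Note what this number is: the chance of the data given no skill. It is not the chance of no skill given the data.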

    To calculate that probability you need something like Bayes' Theorem - and that means you need two more pieces of information: you need to know the probability of flipping five consecutive heads if you have the skill, and you need to know the overall probability that someone like you in the relevant features has acquired the skill. That latter probability, in this case, is close to zero, and so is the probability that your hypothesis is correct based on the data you have recorded.
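A sketch of that Bayes' Theorem calculation. The two extra inputs are made-up numbers for illustration only: I assume a genuinely skilled flipper gets five heads in a row 90% of the time, and that one person in a million has the skill. Neither figure is measured; change them and see what happens.

```python
def posterior_skill(p_data_given_skill: float,
                    p_data_given_no_skill: float,
                    prior_skill: float) -> float:
    """Bayes' Theorem: probability of having the skill, given the observed data."""
    numerator = p_data_given_skill * prior_skill
    evidence = numerator + p_data_given_no_skill * (1 - prior_skill)
    return numerator / evidence


p_five_heads_no_skill = 0.5 ** 5   # the p value from the coin example, 1/32
p_five_heads_skill = 0.9           # assumed, hypothetical
prior = 1e-6                       # assumed, hypothetical: one in a million

posterior = posterior_skill(p_five_heads_skill, p_five_heads_no_skill, prior)
print(f"{posterior:.6f}")
```

With these assumed inputs the posterior comes out at roughly 3 in 100,000: far larger than the prior, but nowhere near .97.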

    One example like that was seldom sufficient.
  8. wellwisher Banned Banned

    Statistics is a tool that acts like a seeing eye dog. The blind man learns to follow the lead of his seeing eye dog, moving around his neighborhood with the apparent skills of someone with sight, while never really seeing.

    It does not bother the statistician if one day a study finds that coffee is good for you, and the next study says it is bad for you. The blind man simply assumes his trusty seeing eye dog is leading him around obstacles that may change from day to day.

    The man of sight approaches his neighborhood differently. His dog follows his lead and follows him around and over obstacles.
  9. origin In a democracy you deserve the leaders you elect. Valued Senior Member

    If you read the actual studies and not just the popular press you may not be so blind.
