How might one translate this to everyday language?
P<0.05 means that, if there were really no effect, the chance of getting a result at least this extreme by pure fluke is less than 0.05, or 1 in 20. It’s the most common cutoff for calling a result statistically significant, but you’ll also see p<0.01 or smaller numbers when the data shows the odds of the result arising by chance are even lower than 1 in 20, like 1 in 100. The smaller the p value the better, but hitting stricter thresholds generally takes larger data sets, which costs more money out of your experiment budget to recruit subjects, buy equipment, and pay salaries. Gotta make those grant budgets stretch, so researchers will often go with 1 in 20 since it’s the accepted standard.
To expand on the other fella’s explanation:
In psychology especially, and in some other fields, a ‘null hypothesis’ is used. That means the researcher ‘assumes’ there is no effect or difference in whatever they are measuring. If you know that the average person smiles 20 times a day, and you want to check whether someone (person A) making jokes around a person (person B) all day makes person B smile more than average, you assume there will be no change. In other words, the expected outcome is that person B will still smile 20 times a day.
The experiment is performed and data collected: in this example, how many times person B smiled during the day. Do that for a lot of people, and you have your data set. Let’s say the average number of smiles per day turned out to be 25 during the experiment. Using some fancy statistics (not really fancy, but it sure can seem like it), you calculate the probability that you would get an average of 25 smiles a day if the assumption were true that making jokes around a person doesn’t change the 20-per-day average. The more people you experimented on, and the larger the deviation from the assumed average, the lower that probability. If it comes out below 5%, you say p<0.05, and for an experiment like the one described above, that’s probably good enough for your field to pat you on the back, tell you that the ‘null hypothesis’ of there being no effect from your independent variable (the making-jokes thing) is rejected, and let you say that making jokes will cause people to smile more, on average.
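The calculation described above can be sketched as a simple one-sided z-test. This is just an illustration with made-up smile counts, and a real study would more likely use a t-test or dedicated stats software:

```python
import math
import statistics

def one_sided_p_value(sample, null_mean):
    """Probability of seeing a sample mean at least this far above
    null_mean if the null hypothesis (no effect) were true."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error: how much the sample mean wobbles by chance.
    se = statistics.stdev(sample) / math.sqrt(n)
    z = (mean - null_mean) / se
    # Upper-tail area of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))

# Made-up data: 30 people averaging 25 smiles a day.
smiles = [23, 24, 25, 26, 27, 25] * 5
p = one_sided_p_value(smiles, null_mean=20)
print(f"p = {p:.2g}")  # far below 0.05, so the null is rejected
```

With 30 people all clustered around 25 smiles instead of 20, the probability of that happening by chance is tiny, which is exactly the "more people and a bigger deviation means a lower probability" point.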
If you are being more rigorous, or testing multiple independent variables at once, as you might when examining different therapies or drugs, you start making the X in the p<X statement smaller. Good studies predetermine which X they will use, to avoid the mistake of settling on whatever number that fits your data as ‘good enough’.
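One common (if blunt) way to predetermine that smaller X when running several tests at once is the Bonferroni correction, which isn’t named above but fits the idea: divide your overall cutoff by the number of tests. A minimal sketch, with made-up numbers:

```python
def bonferroni_cutoff(alpha, num_tests):
    """Per-test threshold so the overall chance of at least one
    false positive across all tests stays at most alpha."""
    return alpha / num_tests

# Testing 5 different therapies against one control group:
cutoff = bonferroni_cutoff(0.05, 5)
print(cutoff)  # each therapy must now hit p < 0.01 to count
```

The intuition: every extra test is another roll of the dice for a fluke, so each individual test has to clear a stricter bar to keep the overall 1-in-20 guarantee.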