There is a saying that if you look hard enough for something, you will eventually find it. In medical writing we often have to assess the validity of biomedical research, because--as in the saying above--some studies will "find" what they are looking for, even if it is not real. The results can be due to chance.
Whether the results of a study could plausibly be due to chance is assessed with the P value: the probability of seeing a result at least as extreme as the one observed if there were really no difference at all. The smaller the P value, the higher the statistical significance of the study, and the harder it is to explain the results by chance alone.
If the P value is 0.6, a difference at least that large would turn up about 60 percent of the time through chance alone, even if no real difference existed. By convention, a P value of 0.05 or less is considered a “significant difference” because a result that extreme would be expected by chance only five percent of the time (or less).
For example, if you toss a coin twice, the most likely outcome is one head and one tail. If you actually got heads twice, you wouldn’t think there was a problem with the coin or how you tossed it, because that is the result you would expect to find twenty-five percent of the time that you tossed the coin twice. On the other hand, if you tossed it 1,000 times and got heads every time, you would suspect that there was something wrong with the coin.
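To make the arithmetic concrete, here is a minimal Python sketch of the coin-toss numbers. It assumes a fair coin and independent tosses; the function name prob_exact_heads is made up for this illustration.

```python
from math import comb


def prob_exact_heads(n_tosses, n_heads, p_heads=0.5):
    """Probability of exactly n_heads heads in n_tosses independent tosses.

    Assumes every toss lands heads with probability p_heads (0.5 = fair coin).
    """
    n_tails = n_tosses - n_heads
    return comb(n_tosses, n_heads) * p_heads**n_heads * (1 - p_heads)**n_tails


print(prob_exact_heads(2, 1))        # one head, one tail: 0.5
print(prob_exact_heads(2, 2))        # two heads: 0.25
print(prob_exact_heads(1000, 1000))  # 1,000 heads in a row: vanishingly small
```

Two heads in two tosses is unremarkable at 25 percent, while 1,000 heads in a row has a probability so close to zero that luck stops being a believable explanation.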
That’s where P values come in: they estimate how likely it would be to get a result as lopsided as 1,000 heads, instead of the expected 500 or so, by chance alone (luck). When that probability is very small, a real difference (perhaps the coin was heavier on one side) becomes the more believable explanation.
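As a rough illustration of how such a P value could be computed for the coin, the sketch below adds up the probabilities of every outcome at least as heads-heavy as the one observed. It assumes a fair coin under the "nothing is wrong" scenario and a one-sided question (too many heads only); the function name one_sided_p_value is hypothetical, and real studies use tests suited to their own design.

```python
from math import comb


def one_sided_p_value(n_tosses, n_heads):
    """Chance of getting n_heads or more heads in n_tosses of a fair coin.

    Counts the favorable head/tail sequences exactly with integers,
    then divides by the total number of possible sequences.
    """
    favorable = sum(comb(n_tosses, k) for k in range(n_heads, n_tosses + 1))
    return favorable / 2**n_tosses


print(one_sided_p_value(2, 2))    # 0.25
print(one_sided_p_value(10, 8))   # 0.0546875, just above the 0.05 convention
print(one_sided_p_value(10, 9))   # 0.0107421875, below 0.05
```

Under these assumptions, 8 heads in 10 tosses would not count as a "significant difference" by the 0.05 convention, while 9 heads in 10 would.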
Check Tom Lang's article, Twenty Statistical Errors Even YOU Can Find in Biomedical Research Articles (see Error #6), for a discussion of how P values are often misinterpreted.
Also, the book I reviewed here includes an interesting account of how the P value was born.