Paradox of Averages
Paradoxes are statements which may be true, but which go against common sense. Here is one called Simpson's paradox. (Try "Simpson's paradox" in Google!)
Both Paul and John took 5 courses during their first year of college. Here are their grade point averages during their first two semesters:
     | 1st semester | 2nd semester |
Paul | 2.0          | 3.75         |
John | 2.25         | 4.0          |
The grade average was computed as follows: A = 4, B = 3, C = 2, D = 1, F = 0.
Paul wrote home, "I'm doing well. My grade average for this year is 3.4".
John wrote home, "I'm doing far worse than my friend Paul. My grade average is only 2.6".
Could this be true? John did better than Paul during both semesters.
---------------------------------------------------------------------------------
Answer.
It is possible. The grades of Paul and John could have been:
     | 1st semester   | 2nd semester   | whole year average |
Paul | C (2.0)        | B A A A (3.75) | 3.4                |
John | C C C B (2.25) | A (4.0)        | 2.6                |
John got his best grade while taking a light course load, so it didn't count much.
Paul got his only low grade while taking a light course load, so it didn't hurt much.
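The arithmetic in the table above can be checked with a short Python sketch. The names POINTS, gpa, paul and john are made up for this illustration; the grade lists are the ones given in the answer.

```python
# Grade points as defined in the text: A = 4, B = 3, C = 2, D = 1, F = 0.
POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def gpa(grades):
    """Return the average grade points over a list of letter grades."""
    return sum(POINTS[g] for g in grades) / len(grades)


# Letter grades from the answer, one list per semester (illustrative names).
paul = {"1st": ["C"], "2nd": ["B", "A", "A", "A"]}
john = {"1st": ["C", "C", "C", "B"], "2nd": ["A"]}

for name, semesters in (("Paul", paul), ("John", john)):
    # The whole-year average is taken over all 5 courses at once,
    # not as the plain mean of the two semester averages.
    whole_year = semesters["1st"] + semesters["2nd"]
    print(name, gpa(semesters["1st"]), gpa(semesters["2nd"]), gpa(whole_year))
```

John comes out ahead in each semester, yet Paul comes out ahead over the whole year, because each student's five courses are split unevenly between the semesters.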
General formula.
If we have averages A1 and A2 for two different periods of time, then we also need to know the numbers of cases N1 and N2 from which they were computed, in order to compute the total average A.
The formula is
A = A1*N1/(N1 + N2) + A2*N2/(N1 + N2).
If we do not know N1 and N2, then all we can say is that A lies somewhere between the two averages: if A1 < A2 (and both N1 and N2 are positive), then
A1 < A < A2.
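As a minimal sketch, the formula above can be written as a small Python function. The name combined_average and its argument names are chosen here for illustration only; the sample values are Paul's and John's semester averages and course counts from the story.

```python
def combined_average(a1, n1, a2, n2):
    """Total average A from two period averages and their case counts.

    Implements A = A1*N1/(N1 + N2) + A2*N2/(N1 + N2).
    """
    if n1 < 0 or n2 < 0 or n1 + n2 == 0:
        raise ValueError("case counts must be non-negative and not both zero")
    total = n1 + n2
    return a1 * n1 / total + a2 * n2 / total


# Paul: 1 course averaging 2.0, then 4 courses averaging 3.75.
paul_year = combined_average(2.0, 1, 3.75, 4)
# John: 4 courses averaging 2.25, then 1 course averaging 4.0.
john_year = combined_average(2.25, 4, 4.0, 1)
print(round(paul_year, 2), round(john_year, 2))

# Same two averages, different case counts: the total moves between them.
for n1, n2 in [(1, 99), (50, 50), (99, 1)]:
    print(n1, n2, round(combined_average(2.0, n1, 4.0, n2), 2))
```

With the averages held at 2.0 and 4.0, changing only the counts moves the total from near 4.0 to near 2.0, which is why the two averages alone cannot determine A.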
Remark.
This story is usually told as a story of two baseball players. The first player had a better batting average than the second during both the first half and the second half of the season. But the second player had a better average for the whole season. So Simpson's paradox can also be called the Paradox of Percentages. (See also Ross, Ken (2004). A Mathematician at the Ballpark: Odds and Probabilities for Baseball Fans. pp. 12-13. New York: Pi Press.)