Cumulative frequency is a running total of frequencies. You plot it against the upper class boundaries of grouped data to produce a characteristic S-shaped curve, then read off the median, lower quartile, and upper quartile. Together with the minimum and maximum, those values give you a box plot showing the spread of the data.
What is cumulative frequency?
Cumulative frequency at any point is the total number of values up to that point. For the classes in the table below, that is the number of values less than each upper class boundary. You build it by adding each frequency to the running total as you move through the class intervals.
| Time (minutes) | Frequency | Cumulative frequency |
|---|---|---|
| 0 ≤ t < 10 | 4 | 4 |
| 10 ≤ t < 20 | 11 | 15 |
| 20 ≤ t < 30 | 17 | 32 |
| 30 ≤ t < 40 | 13 | 45 |
| 40 ≤ t < 50 | 5 | 50 |
The final cumulative frequency (50) always equals the total number of values.
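If you want to check a cumulative frequency column, the running total is easy to reproduce. The following is a minimal Python sketch using the frequencies from the table; the variable names are chosen here for illustration only.

```python
from itertools import accumulate

# Frequencies from the table above, one per class interval.
frequencies = [4, 11, 17, 13, 5]

# accumulate() yields the running total after each class.
cumulative = list(accumulate(frequencies))

print(cumulative)                          # [4, 15, 32, 45, 50]
print(cumulative[-1] == sum(frequencies))  # True
```

The last line confirms that the final cumulative frequency matches the total number of values.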
How do you draw a cumulative frequency graph?
Step 1 — Add a cumulative frequency column to your table (see above).
Step 2 — Plot each cumulative frequency against the upper class boundary of its interval (not the midpoint). For the table above: (10, 4), (20, 15), (30, 32), (40, 45), (50, 50). Also plot (0, 0) — before any data, the running total is zero.
Step 3 — Join the points with a smooth curve (an S-shape, sometimes called an ogive).
Step 4 — Label both axes: horizontal for the variable (e.g. "Time in minutes"), vertical for "Cumulative frequency."
Aim for a smooth curve through the points rather than ruled straight segments. Check your exam board's mark scheme, since some also accept points joined with straight line segments.
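The plotting points from Step 2 can be generated directly from the class intervals. This is a sketch under the assumption that each class is stored as a (lower boundary, upper boundary, frequency) tuple; the function name `cumulative_points` is made up for this example.

```python
from itertools import accumulate

# Class intervals as (lower boundary, upper boundary, frequency), from the table.
classes = [(0, 10, 4), (10, 20, 11), (20, 30, 17), (30, 40, 13), (40, 50, 5)]

def cumulative_points(classes):
    """Return the (x, cumulative frequency) points to plot."""
    uppers = [upper for _, upper, _ in classes]
    totals = accumulate(freq for _, _, freq in classes)
    # Start at the lowest boundary with a running total of zero.
    start = (classes[0][0], 0)
    return [start] + list(zip(uppers, totals))

points = cumulative_points(classes)
print(points)
# [(0, 0), (10, 4), (20, 15), (30, 32), (40, 45), (50, 50)]
```

Each x-value is an upper class boundary, never a midpoint, and the extra starting point sits at the bottom of the first class.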
How do you find the median and quartiles from the graph?
For a data set of n values:
- Median — read off at n/2 on the cumulative frequency axis.
- Lower quartile (LQ) — read off at n/4.
- Upper quartile (UQ) — read off at 3n/4.
For our example (n = 50):
| Measure | CF value | Time read from graph |
|---|---|---|
| Lower quartile | 50/4 = 12.5 | ≈ 18 min |
| Median | 50/2 = 25 | ≈ 25 min |
| Upper quartile | 3×50/4 = 37.5 | ≈ 34 min |
Interquartile range (IQR) = UQ − LQ = 34 − 18 = 16 min
The IQR measures spread. A larger IQR means the data is more spread out around the median.
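Reading off a graph can be imitated numerically. The sketch below assumes straight-line interpolation between the plotted points, which only approximates a hand-drawn smooth curve, so its answers will differ a little from values read off your own graph. The helper name `read_off` is hypothetical.

```python
# Plotted points (upper class boundary, cumulative frequency), including (0, 0).
points = [(0, 0), (10, 4), (20, 15), (30, 32), (40, 45), (50, 50)]

def read_off(points, target_cf):
    """Estimate the x-value where the cumulative frequency reaches target_cf.

    Uses straight-line interpolation between neighbouring points, which only
    approximates a reading taken from a hand-drawn smooth curve.
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= target_cf <= cf1 and cf1 > cf0:
            return x0 + (target_cf - cf0) / (cf1 - cf0) * (x1 - x0)
    raise ValueError("target_cf is outside the range of the graph")

n = points[-1][1]                  # total frequency, 50
lq = read_off(points, n / 4)       # position 12.5
median = read_off(points, n / 2)   # position 25
uq = read_off(points, 3 * n / 4)   # position 37.5

print(round(lq, 1), round(median, 1), round(uq, 1), round(uq - lq, 1))
# 17.7 25.9 34.2 16.5
```

These estimates sit close to the readings in the table. Small differences are expected because a smooth curve and ruled segments do not coincide exactly, and because graph readings are only accurate to about half a small square.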
How do you draw a box plot from a cumulative frequency graph?
A box plot (also called a box-and-whisker plot) displays five values: the minimum, lower quartile, median, upper quartile, and maximum.
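Step 1 — Read the five values from the graph (or the data set). For grouped data you usually do not know the exact smallest and largest values, so use 0 (the bottom of the first class) as the minimum and 50 (the top of the last class) as the maximum unless the question gives the actual values.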
Step 2 — Draw a number line covering the full range.
Step 3 — Mark the five values with short vertical lines.
Step 4 — Draw a box from LQ to UQ.
Step 5 — Draw a vertical line inside the box at the median.
Step 6 — Draw whiskers from the box out to the minimum and maximum.
Box plot values for the example: min = 0, LQ = 18, median = 25, UQ = 34, max = 50.
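To see how the five values map onto a box plot, here is a rough text-only sketch. It is an illustration rather than a substitute for a ruled diagram: the function name `text_box_plot` and the choice of characters are assumptions made for this example, and each position is rounded to the nearest character column.

```python
def text_box_plot(minimum, lq, median, uq, maximum, width=51):
    """Draw a rough one-line box plot using text characters.

    Each character column stands for an equal slice of the range from
    minimum to maximum, so positions are rounded to the nearest column.
    """
    def column(value):
        return round((value - minimum) / (maximum - minimum) * (width - 1))

    line = [" "] * width
    for i in range(column(minimum), column(maximum) + 1):
        line[i] = "-"                      # whiskers
    for i in range(column(lq), column(uq) + 1):
        line[i] = "="                      # box from LQ to UQ
    line[column(minimum)] = "|"            # minimum
    line[column(maximum)] = "|"            # maximum
    line[column(lq)] = "["                 # lower quartile
    line[column(uq)] = "]"                 # upper quartile
    line[column(median)] = "M"             # median
    return "".join(line)

plot = text_box_plot(0, 18, 25, 34, 50)
print(plot)
```

With the default width of 51 and a range of 0 to 50, each column represents one minute, so the box starts at column 18, the median mark is at column 25, and the box ends at column 34.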
How do you compare two box plots?
Exam questions often show two box plots and ask you to compare the distributions. Always comment on two things: a measure of location (median) and a measure of spread (IQR or range).
Example answer template:
"Group A has a higher median (25 min) than Group B (18 min), so on average Group A took longer. Group A also has a larger IQR (16 min compared with 10 min), so Group A's times were more spread out / less consistent."
Avoid vague statements like "Group A did better." Be specific: state the values you are comparing.
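The structure of a good comparison (one statement about the median, one about the IQR, each with values and an interpretation) can be written out as a small function. This is only a sketch: the dictionary keys and the wording of the interpretations are assumptions, and in an exam you should phrase the interpretation in the context of the question.

```python
def compare_groups(a, b, unit="min"):
    """Return two sentences comparing the median and the IQR of two groups.

    Each group is a dict with the hypothetical keys "name", "median", "iqr".
    """
    def describe(measure, higher_means, lower_means):
        x, y = a[measure], b[measure]
        label = "median" if measure == "median" else "IQR"
        if x == y:
            return f"{a['name']} and {b['name']} have the same {label} ({x} {unit})."
        word, meaning = ("higher", higher_means) if x > y else ("lower", lower_means)
        return (f"{a['name']} has a {word} {label} ({x} {unit}) than "
                f"{b['name']} ({y} {unit}), so {meaning}.")

    return [
        describe("median", "on average its values were larger",
                 "on average its values were smaller"),
        describe("iqr", "its values were more spread out",
                 "its values were more consistent"),
    ]

# Values taken from the example answer above.
group_a = {"name": "Group A", "median": 25, "iqr": 16}
group_b = {"name": "Group B", "median": 18, "iqr": 10}
for line in compare_groups(group_a, group_b):
    print(line)
```

Every sentence it produces names both groups, quotes both values, and ends with what the difference means.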
Frequently asked questions
Why do you plot against the upper class boundary, not the midpoint?
The cumulative frequency for a class counts every value up to the end of that class, so it belongs at the upper boundary. A point plotted at 30 tells you how many values are less than 30. Plotting the same total at the midpoint (25) would place the point half a class width too far left and make every reading from the graph too small.
What if the total is odd — which value gives the median?
For n values, read off at n/2 on the vertical axis, even when n is odd; do not use (n+1)/2 as you would for a list. On a cumulative frequency graph you are estimating from a continuous curve, so n/2 is the position to use. The formula (n+1)/2 applies only when finding the median directly from an ordered list.
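The two rules give slightly different positions, which is easiest to see with numbers. A short sketch, with function names invented for this comparison:

```python
def median_position_on_graph(n):
    """Position on the cumulative frequency axis for grouped data."""
    return n / 2

def median_position_in_list(n):
    """Position of the median in an ordered list of n values."""
    return (n + 1) / 2

for n in (50, 51):
    print(n, median_position_on_graph(n), median_position_in_list(n))
# 50 25.0 25.5
# 51 25.5 26.0
```

A list position of 25.5 means halfway between the 25th and 26th values. On the graph, 25.5 is simply a height on the vertical axis that you read across from.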
What does the interquartile range actually tell you?
The IQR is the range of the middle 50% of the data. A small IQR means that most values are clustered tightly around the median; a large IQR means the data is more spread out. It is often a more reliable measure of spread than the full range because extreme values (outliers) at either end do not affect it.
Can I draw a box plot without a cumulative frequency graph?
Yes, if you have the raw data in a list. Sort the list, then identify the minimum, the maximum, and the three quartiles (lower quartile, median, and upper quartile). Box plots from raw data appear in both KS3 and GCSE, but cumulative frequency graphs are the main route to box plots from grouped data at GCSE.
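For raw data, the five values can be found with Python's standard library. The sketch below uses made-up times, not the data from the table, and assumes the common convention of placing the quartiles at the (n + 1)/4, (n + 1)/2 and 3(n + 1)/4 positions. Other quartile conventions exist and can give slightly different answers.

```python
import statistics

# Made-up times in minutes, purely for illustration (not from the table above).
times = [12, 31, 7, 25, 18, 44, 22, 36, 15, 28, 20]

def five_number_summary(values):
    """Return (minimum, LQ, median, UQ, maximum) for a list of raw values."""
    # method="exclusive" places the quartiles at the (n + 1)/4, (n + 1)/2
    # and 3(n + 1)/4 positions of the sorted list, interpolating if needed.
    lq, median, uq = statistics.quantiles(values, n=4, method="exclusive")
    return min(values), lq, median, uq, max(values)

print(five_number_summary(times))
# (7, 15.0, 22.0, 31.0, 44)
```

With 11 values the quartiles fall exactly on the 3rd, 6th and 9th values of the sorted list, so no interpolation is needed here.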