Mean from a List of Data
Questions designed to stretch thinking, reveal misconceptions, and spark mathematical reasoning.
Convince Me That…
Students must construct a mathematical argument for why each statement is true.
The mean is the sum divided by how many values there are. Adding up: 2 + 3 + 8 + 9 + 8 = 30. There are 5 values, so the mean is 30 ÷ 5 = 6.
Notice that 6 is not even one of the values in the list, and it is very different from the median (the middle value when ordered is 8). This highlights the ‘mean is the middle number’ misconception — the mean and the median are not the same thing. The two lower values (2 and 3) pull the mean down well below the median.
Adding up all six values: 1 + 2 + 3 + 4 + 5 + 0 = 15. There are 6 values (the zero counts as a value), so the mean is 15 ÷ 6 = 2.5.
Two common mistakes arise here. First, the ‘zero doesn’t count’ misconception: some students ignore the 0 and divide by 5 instead of 6, getting 15 ÷ 5 = 3. Zero is a perfectly valid data point and must be included in the count. Second, the ‘mean must be a whole number’ misconception: some students feel the answer “should” be a whole number — but 2.5 is correct and the mean does not have to be a whole number.
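The two calculations above can be compared side by side. This is a minimal sketch in Python; the helper name `mean` and the variable names are illustrative and are not part of the original resource.

```python
def mean(values):
    """Sum of the values divided by how many values there are."""
    return sum(values) / len(values)

data = [1, 2, 3, 4, 5, 0]

# Correct: the zero is a data point, so it counts towards the 6 values.
correct = mean(data)

# Mistaken: dropping the zero leaves the sum at 15 but shrinks the count to 5.
without_zero = [v for v in data if v != 0]
mistaken = mean(without_zero)

print(correct, mistaken)  # 2.5 3.0
```

The sum is the same in both cases; only the divisor changes, which is exactly where the mistake lives.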
Consider the data set: 1, 1, 1, 1, 96. The mean is (1 + 1 + 1 + 1 + 96) ÷ 5 = 100 ÷ 5 = 20. The mean (20) is bigger than four of the five values. One very large value (an outlier) can drag the mean far away from where most of the data sits.
This challenges the ‘mean is always typical’ misconception — the idea that the mean always represents what is “normal” in the data. While the mean always lies between the smallest and largest values, it can be pulled heavily towards extreme values.
A helpful way to visualize this is the “Balance Point” model. If we look at a smaller example with an outlier, like {2, 3, 10}:
The mean of {2, 3, 10} is 15 ÷ 3 = 5. The total distance on the left (3 + 2 = 5) perfectly balances the outlier’s distance on the right (10 − 5 = 5). This is why one large number pulls the mean so far!
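The balance idea can be checked numerically. The sketch below is one possible way to do it, assuming a hypothetical helper `balance_check` that adds up the distances below and above the mean; it is an illustration, not part of the original resource.

```python
def mean(values):
    """Sum of the values divided by how many values there are."""
    return sum(values) / len(values)

def balance_check(values):
    """Return (total distance below the mean, total distance above the mean)."""
    m = mean(values)
    below = sum(m - v for v in values if v < m)
    above = sum(v - m for v in values if v > m)
    return below, above

small = balance_check([2, 3, 10])        # mean is 5
large = balance_check([1, 1, 1, 1, 96])  # mean is 20

print(small)  # (5.0, 5.0)
print(large)  # (76.0, 76.0)
```

In both data sets the two totals match, which is what "balance point" means: a single far-away value has to be balanced by the combined distances of everything on the other side.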
Consider {5, 5, 5, 5, 5} and {1, 3, 5, 7, 9}. Both have a sum of 25, and both have 5 values, so both have a mean of 5. Yet the first set has no spread at all, while the second set is spread from 1 to 9.
The ‘mean tells you everything’ misconception leads students to think that knowing the mean fully describes a data set. In fact, the mean only captures the “balance point” — it tells you nothing about how spread out the values are. This is why we sometimes need additional measures like range to fully describe a data set.
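A short sketch makes the point concrete by computing both the mean and the range for the two data sets above. The function names `mean` and `value_range` are illustrative choices, not taken from the original.

```python
def mean(values):
    """Sum of the values divided by how many values there are."""
    return sum(values) / len(values)

def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

flat = [5, 5, 5, 5, 5]
spread = [1, 3, 5, 7, 9]

print(mean(flat), value_range(flat))      # 5.0 0
print(mean(spread), value_range(spread))  # 5.0 8
```

The means are identical, so only the second number in each row reveals that the two sets look nothing alike.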
Give an Example Of…
Think carefully — the fourth box is a trap! Give a non-example that looks right but isn’t.
Example: 6, 7, 8, 9, 10 — sum is 40, and 40 ÷ 5 = 8 ✓
Another: 4, 6, 8, 10, 12 — sum is 40, and 40 ÷ 5 = 8 ✓
Creative: 1, 1, 1, 1, 36 — sum is 40, and 40 ÷ 5 = 8 ✓ (the values don’t have to be close to the mean)
Trap: 6, 7, 8, 9, 11 — the ‘numbers near the mean are fine’ misconception. A student might choose numbers that “feel” centred on 8, but the sum is 41 and 41 ÷ 5 = 8.2, not 8. The five numbers must add up to exactly 40 (since 8 × 5 = 40).
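One way to test any candidate list is to check its total against mean × count. The sketch below assumes a hypothetical helper `has_mean` and uses the four lists above as input.

```python
def has_mean(values, target):
    """True when the values add up to target x count, so their mean is target."""
    return sum(values) == target * len(values)

candidates = {
    "Example": [6, 7, 8, 9, 10],
    "Another": [4, 6, 8, 10, 12],
    "Creative": [1, 1, 1, 1, 36],
    "Trap": [6, 7, 8, 9, 11],
}

# Check every candidate against a target mean of 8.
results = {label: has_mean(values, 8) for label, values in candidates.items()}

for label, ok in results.items():
    print(label, sum(candidates[label]), ok)
```

Comparing totals rather than dividing avoids any rounding and mirrors the "must add up to exactly 40" reasoning.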
Example: 1, 2 — sum is 3, mean is 3 ÷ 2 = 1.5 ✓
Another: 1, 2, 3, 4 — sum is 10, mean is 10 ÷ 4 = 2.5 ✓
Creative: 0, 0, 0, 0, 1 — sum is 1, mean is 1 ÷ 5 = 0.2 ✓ (the mean can be a decimal even when all values are 0 or 1)
Trap: 2, 4, 6, 8 — the ‘spread means non-whole’ misconception. A student might think “these are spread out, so the mean won’t be whole,” but the sum is 20 and 20 ÷ 4 = 5, which is a whole number. What matters is whether the sum divides evenly by the count.
Example: Add 1 — new sum is 25, new mean is 25 ÷ 5 = 5 ✓ (lower than 6)
Another: Add 4 — new sum is 28, new mean is 28 ÷ 5 = 5.6 ✓ (lower than 6)
Creative: Add 0 — new sum is 24, new mean is 24 ÷ 5 = 4.8 ✓ (adding zero still changes the mean because the count increases)
Trap: Add 6 — the ‘below the biggest values must lower the mean’ misconception. A student might think “6 is below the 7 and the 9, so it must bring the mean down,” but 6 is the current mean (24 ÷ 4 = 6), and adding a value equal to the mean doesn’t change it. New sum 30 ÷ 5 = 6 — no change!
Example: −1, 3, 7 — sum is 9, mean is 9 ÷ 3 = 3 ✓
Another: −2, 4, 7 — sum is 9, mean is 9 ÷ 3 = 3 ✓
Creative: −10, −5, 0, 10, 20 — sum is 15, mean is 15 ÷ 5 = 3 ✓ (a larger set with multiple negatives and a zero)
Trap: −3, 3, 6 — the ‘cancelling pairs’ misconception. A student might think “the −3 and 3 cancel out, leaving 6, so the mean is 3.” But the sum is (−3) + 3 + 6 = 6, and 6 ÷ 3 = 2, not 3. Cancelling pairs doesn’t tell you the mean — you must still divide the total sum by the count. Even when two values cancel to zero in the sum, they still count as two of the values you divide by.
Always, Sometimes, Never
Is the statement always true, sometimes true, or never true? Students should justify their decision with examples.
This is only true when the sum of the values divides exactly by how many values there are. For example, {2, 4, 6} has sum 12 and mean 12 ÷ 3 = 4 (a whole number). But {1, 2} has sum 3 and mean 3 ÷ 2 = 1.5 (not a whole number).
Many students fall for the ‘whole numbers in, whole number out’ misconception, assuming that whole-number inputs guarantee a whole-number mean. Whether the mean is whole depends entirely on whether the sum is divisible by the count.
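The divisibility test can be written directly. This is a minimal sketch that assumes whole-number data; `mean_is_whole` is a hypothetical helper name.

```python
def mean_is_whole(values):
    """For whole-number data: True when the sum divides exactly by the count."""
    return sum(values) % len(values) == 0

print(mean_is_whole([2, 4, 6]))     # sum 12, count 3
print(mean_is_whole([1, 2]))        # sum 3, count 2
print(mean_is_whole([2, 4, 6, 8]))  # sum 20, count 4
```

The check never looks at how spread out the values are, only at the remainder when the sum is divided by the count.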
For any two numbers a and b, the mean is (a + b) ÷ 2, which is exactly the midpoint between them. For example, the mean of 3 and 11 is 14 ÷ 2 = 7, which sits exactly halfway between 3 and 11.
This even works with negatives: the mean of −4 and 10 is 6 ÷ 2 = 3, which is indeed halfway between −4 and 10. Students may think this is only true for “nice” numbers, but the formula guarantees it for any two values.
The mean always lies between (or equal to) the smallest and largest values in the data set. It can never exceed the largest value. For example, in {2, 5, 8}, the mean is 5, which is not larger than 8.
Even if all values are equal (e.g. {4, 4, 4}), the mean equals 4 — which is equal to every value, not larger. The ‘mean can be outside the data range’ misconception leads some students to believe the mean could sit above or below all the data, but mathematically the mean is always “trapped” between the minimum and maximum values.
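For classes ready for algebra, the "trapped" claim can be justified in a few lines. The notation here is an assumption added for illustration: the data values are written x_1, …, x_n, with m the smallest value and M the largest.

```latex
% Every value lies between the minimum m and the maximum M:
%   m \le x_i \le M  for each i = 1, \dots, n.
% Adding these n inequalities and dividing by n gives the result.
\[
  n\,m \;\le\; x_1 + x_2 + \cdots + x_n \;\le\; n\,M
  \quad\Longrightarrow\quad
  m \;\le\; \frac{x_1 + x_2 + \cdots + x_n}{n} \;\le\; M .
\]
```

Equality on either side happens only when every value is the same, as in {4, 4, 4}.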
The data set {2, 4, 6} has mean 4. Adding 10 gives {2, 4, 6, 10} with mean 22 ÷ 4 = 5.5 — the mean increased. But adding 1 gives {1, 2, 4, 6} with mean 13 ÷ 4 = 3.25 — the mean decreased.
The rule is: adding a value above the current mean increases it, adding a value below the current mean decreases it, and adding a value equal to the current mean leaves it unchanged. The ‘more data means bigger mean’ misconception leads students to assume that adding more numbers always makes the mean grow, without considering what value is being added.
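The rule can be derived from the "total = mean × count" relationship. In this sketch the symbols are assumptions introduced for illustration: the original data set has n values with mean x̄, and the added value is a.

```latex
% The old total is n\bar{x}; the new total is n\bar{x} + a, shared among n + 1 values.
\[
  \bar{x}_{\text{new}} = \frac{n\,\bar{x} + a}{n + 1},
  \qquad
  \bar{x}_{\text{new}} - \bar{x}
    = \frac{n\,\bar{x} + a - (n + 1)\,\bar{x}}{n + 1}
    = \frac{a - \bar{x}}{n + 1}.
\]
% The sign of the change is the sign of a - \bar{x}:
% positive if a is above the old mean, negative if below, zero if equal.
```

Checking against the example above: with n = 3, x̄ = 4 and a = 10 the change is (10 − 4) ÷ 4 = 1.5, giving 5.5; with a = 1 the change is (1 − 4) ÷ 4 = −0.75, giving 3.25.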
Let the three consecutive numbers be x − 1, x, and x + 1. The sum is (x − 1) + x + (x + 1) = 3x. Dividing the sum by 3 gives 3x ÷ 3 = x, which is exactly the middle number.
This provides a great introduction to using algebraic reasoning to prove a statistical property, reinforcing that the mean truly acts as the central balance point for perfectly symmetrical data.
Odd One Out
Which is the odd one out? Can you make a case for each one? There’s no single right answer!
Explain the Mistake
Each example contains a deliberate error targeting a common misconception. Can you find where and why the reasoning goes wrong?
Answer: 6.25
Reasoning: “I added them up: 4 + 7 + 5 + 9 = 25. Then I divided by 4 because there are 4 numbers. 25 ÷ 4 = 6.25.”
The student has made the ‘zero doesn’t count’ error — they skipped the 0 entirely, leaving it out of both the sum and the count. Zero is a valid data value. The correct calculation is: (4 + 7 + 0 + 5 + 9) ÷ 5 = 25 ÷ 5 = 5.
The sum is 25 whether or not the 0 is included (adding 0 doesn’t change the total), but there are 5 values in the data set, not 4. Skipping zero gives a count that is too small, which inflates the mean. This misconception often surfaces when students see 0 as “nothing” rather than as a genuine data point.
Answer: 6 ✓
Reasoning: “I put the numbers in order: 2, 4, 6, 8, 10. The one in the middle is 6, so the mean is 6.”
The student has confused the mean with the median. They found the middle value of the ordered list (the median), which happens to be 6. The actual mean calculation is (4 + 6 + 2 + 10 + 8) ÷ 5 = 30 ÷ 5 = 6 — also 6. So the answer is correct, but the method is wrong.
This matters because the mean and median are only guaranteed to be equal in specific cases (like symmetric data). For a different data set, such as {2, 4, 6, 8, 20}, the median is still 6 but the mean is 40 ÷ 5 = 8. The student’s method would give the wrong answer in that case.
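The difference between the two methods shows up as soon as both are run on both data sets. This sketch uses `median` from Python's standard `statistics` module; the helper `mean` and the variable names are illustrative.

```python
from statistics import median

def mean(values):
    """Sum of the values divided by how many values there are."""
    return sum(values) / len(values)

symmetric = [4, 6, 2, 10, 8]
skewed = [2, 4, 6, 8, 20]

# Symmetric data: the middle value happens to match the mean.
print(mean(symmetric), median(symmetric))  # 6.0 6

# Skewed data: the middle value stays at 6 but the mean moves to 8.
print(mean(skewed), median(skewed))        # 8.0 6
```

The "find the middle" method returns 6 both times, so it only agrees with the mean by coincidence of the first data set's symmetry.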
Answer: 5
Reasoning: “The most common number is 5 — it appears three times. So the mean is 5.”
The student has confused the mean with the mode. The mode (most frequent value) is indeed 5, but the mean is calculated by summing all values and dividing: (3 + 5 + 5 + 5 + 12) ÷ 5 = 30 ÷ 5 = 6.
The mean and the mode are different measures of average. In this data set, the outlier value of 12 pulls the mean above the mode. Students should check: does the method involve adding up and dividing? If not, they are not finding the mean.
Answer: 9
Reasoning: “The mean is 9, so the missing number must be 9. That’s what the mean means — it’s the value everything balances at.”
The student has assumed that the ‘missing value must equal the mean’, which is incorrect. To find the missing value, work backwards: if the mean is 9 and there are 5 numbers, the total sum must be 9 × 5 = 45. The four known values add up to 6 + 10 + 12 + 7 = 35. So the fifth number is 45 − 35 = 10, not 9.
The mean is the balance point of the entire data set, not a value that every data point must equal. In fact, in this data set, none of the five values (6, 7, 10, 10, 12) actually equals the mean of 9. Students who understand the “total = mean × count” relationship can work backwards reliably.
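Working backwards from the mean can be sketched as a small function. The name `missing_value` and its parameters are hypothetical, chosen here to mirror the "total = mean × count" steps.

```python
def missing_value(known_values, target_mean, total_count):
    """Find the one missing value, given the mean of the full data set."""
    required_total = target_mean * total_count  # total = mean x count
    return required_total - sum(known_values)   # what is left for the missing value

known = [6, 10, 12, 7]
answer = missing_value(known, target_mean=9, total_count=5)

print(answer)  # 10
```

Substituting the answer back in gives a total of 45 and a mean of 45 ÷ 5 = 9, which confirms the result.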
Answer: 75
Reasoning: “The mean of 60 and 90 is 75. So the overall mean is 75.”
The student has fallen for the classic ‘averaging averages’ misconception. They treated both classes as having equal weight, ignoring the differing sizes of the groups.
Because Class A has twice as many students as Class B, the true mean will be pulled closer to Class A’s score of 60. To find the correct combined mean, calculate the total marks: Class A total is 20 × 60 = 1200, Class B total is 10 × 90 = 900. The overall mean is (1200 + 900) ÷ 30 = 2100 ÷ 30 = 70.
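The combined-mean calculation can be sketched as follows. The function name `combined_mean` and the (size, mean) pair layout are assumptions made for this illustration.

```python
def combined_mean(groups):
    """Overall mean from a list of (group size, group mean) pairs."""
    total_marks = sum(size * group_mean for size, group_mean in groups)
    total_students = sum(size for size, _ in groups)
    return total_marks / total_students

classes = [(20, 60), (10, 90)]  # Class A: 20 students, Class B: 10 students

overall = combined_mean(classes)
naive = (60 + 90) / 2  # the student's method: averaging the two averages

print(overall, naive)  # 70.0 75.0
```

The naive method would only be right if both classes had the same number of students, because then each class mean would carry equal weight.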