An average summarises a group of numbers. There are three main types of averages: mean, median and mode. Each of these will be looked at in turn.
The mean is the most commonly used average. It is calculated by adding up the numbers in a sample and dividing that answer by the sample size. Of the three, it is the only type of average whose value depends on every number in the sample.
Example - Here is a sample of numbers:
The first thing to do is add the numbers up. In this case, the result is 23.
All that is left to do is divide this number by the sample size. In this case, we are dividing 23 by 8 (because there are eight numbers altogether) and we get the result 2.875.
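The two steps above can be sketched in a few lines of Python. The list used here is hypothetical: it is simply eight numbers chosen to add up to 23, and is not necessarily the sample the example refers to. The function name `mean` is likewise just an illustrative choice.

```python
def mean(values):
    """Return the mean: the sum of the values divided by the sample size."""
    if not values:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(values) / len(values)


# Hypothetical sample: eight numbers chosen only so that they add up to 23.
sample = [1, 2, 2, 2, 3, 4, 4, 5]

total = sum(sample)    # step 1: add the numbers up
size = len(sample)     # the sample size
result = mean(sample)  # step 2: divide the total by the sample size

print(total, size, result)  # prints: 23 8 2.875
```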
Potential Problems - Mean averages have two linked problems:
If you have a large number of small values with a few very large values in your sample, mean averages get skewed: the mean ends up nearer to the bigger values even though there are more of the smaller ones. If you have a few small values and a few large values, the mean average can get skewed this way too.
If you have one, or more, outlying values that do not follow the general trend of the numbers in a sample, the mean average can be affected more dramatically than intended.
Example of the Effect of Outliers - In this case, the number 100 has been added to the sample above:
The sum of these numbers is 123. If we divide this by the sample size, 9, we get 13.666... (recurring), which does not represent the earlier numbers.
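As a rough illustration of how far one outlier drags the mean, the sketch below repeats the calculation before and after adding 100. The eight starting values are hypothetical, chosen only to add up to 23.

```python
# Hypothetical sample: eight numbers that add up to 23.
sample = [1, 2, 2, 2, 3, 4, 4, 5]
with_outlier = sample + [100]

mean_before = sum(sample) / len(sample)
mean_after = sum(with_outlier) / len(with_outlier)

# Count how many values sit below the new mean.
below_mean = sum(1 for value in with_outlier if value < mean_after)

print(mean_before)           # 2.875
print(round(mean_after, 4))  # 13.6667
print(below_mean)            # 8
```

With the outlier included, eight of the nine values sit below the mean, which is why it no longer describes the bulk of the sample.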
The median is the middle number in a sample. Finding it requires the numbers to be in order.
Example - Here is a sample of numbers to illustrate median:
To use the median, these numbers must be placed in order, like this:
Here the median is 2.5. The sample size is even, so there is no single middle number. To work out the median, you need to take the two middle numbers, 2 and 3, and get the mean of them, which is 2.5.
With an odd sample it is much easier: you just take the middle number as the median.
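Both cases, even and odd, can be handled by one short function. This is a minimal sketch; the function name `median` and the two lists are illustrative assumptions, not data from the examples above.

```python
def median(values):
    """Return the middle value, or the mean of the two middle values."""
    if not values:
        raise ValueError("the median of an empty sample is undefined")
    ordered = sorted(values)  # the numbers must be in order first
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd sample size: there is a single middle number.
        return ordered[middle]
    # Even sample size: take the mean of the two middle numbers.
    return (ordered[middle - 1] + ordered[middle]) / 2


# Hypothetical samples, deliberately given out of order.
even_sample = [4, 2, 5, 1, 3, 2, 4, 2]  # sorted: 1 2 2 2 3 4 4 5
odd_sample = [4, 2, 5, 1, 3, 2, 4]      # sorted: 1 2 2 3 4 4 5

print(median(even_sample))  # 2.5
print(median(odd_sample))   # 3
```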
Potential Problems - One problem with using median is that it requires the numbers to be put in order first. For a large set of numbers, this task can be extremely laborious.
The mode is the number that occurs the most times in the sample. Where the mean has problems with 'representativeness', the mode focuses on the most common numbers and gives little or no attention to less frequently occurring numbers.
Example - another list of numbers:
The mode here is 2 as it appears three times. Note: if there are two numbers which are equally common in the sample, then you take both as the mode.
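Counting occurrences by hand gets tedious, so here is a small sketch that returns every number tied for most common, in line with the note about two equally common numbers. The helper name `modes` and both lists are hypothetical.

```python
from collections import Counter


def modes(values):
    """Return all values that occur the most times, smallest first."""
    if not values:
        raise ValueError("the mode of an empty sample is undefined")
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


# Hypothetical samples.
single_mode = [1, 2, 2, 2, 3, 4, 4, 5]  # 2 appears three times
tied_modes = [1, 2, 2, 3, 3, 4]         # 2 and 3 each appear twice

print(modes(single_mode))  # [2]
print(modes(tied_modes))   # [2, 3]
```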
Potential Problems - Mode is less useful when you have a lot of values that are close together but have not been rounded to the nearest whole number. Few of the values repeat exactly, so the mode taken may not reflect where the numbers actually cluster. In such a case it is better to round the numbers first before using mode.
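The effect of rounding first can be seen in a short sketch. The measurements below are invented purely for illustration, and the `modes` helper is a hypothetical function that returns every value tied for most common.

```python
from collections import Counter


def modes(values):
    """Return all values that occur the most times, smallest first."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


# Invented measurements: close together, but no two are exactly equal.
measurements = [2.1, 1.9, 2.2, 3.0, 1.8, 3.1]

raw_modes = modes(measurements)
rounded_modes = modes([round(value) for value in measurements])

print(raw_modes)      # every value appears once, so all six tie
print(rounded_modes)  # [2]
```

Without rounding, every value ties for the mode and the answer says nothing. After rounding, four of the six values become 2, which is a much fairer picture of where the numbers cluster.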
A class of 15 students took a test to be marked out of 10. Seven students got 8 marks, four got 7 marks, two got 6 marks, one got 5 marks, and one got 4 marks.
Mean - The total number of marks the students got is 105. Divide that by the number of students, 15, and you get 7. That is the mean number of the students' marks.
Median - Below, the marks the students received are shown, in order from the highest to the lowest:
8, 8, 8, 8, 8, 8, 8, 7, 7, 7, 7, 6, 6, 5, 4
The median number is the middle number, so the median of the students' marks is 7.
Mode - The marks the students received were:
8, 8, 8, 8, 8, 8, 8, 7, 7, 7, 7, 6, 6, 5, 4
The mode number is the number that occurs the most times, so the mode number of the students' marks is 8.
The average mark the students got depends on which average is used.
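The class example can be checked with Python's standard `statistics` module. The list is built directly from the counts given above (seven 8s, four 7s, two 6s, one 5 and one 4). The variable names are just illustrative.

```python
from statistics import mean, median, mode

# Marks for the 15 students, built from the counts in the example.
marks = [8] * 7 + [7] * 4 + [6] * 2 + [5] + [4]

total_marks = sum(marks)
class_mean = mean(marks)
class_median = median(marks)  # median() sorts the data itself
class_mode = mode(marks)

print(len(marks), total_marks)  # 15 105
print(class_mean)               # 7
print(class_median)             # 7
print(class_mode)               # 8
```

Here the mean and median happen to agree, while the mode is one mark higher.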
Another Real-world Application
Telephony offers some interesting statistics that may make the differences between mean, median and mode easier to understand.
- The mean telephone call - normally around two minutes 30 seconds.
- The median telephone call - normally around 40 seconds.
- The modal telephone call - normally less than five seconds (about two seconds is usual).
The interesting thing is why we get these figures for telephone calls. Phone calls split into about three different types:
Very short - A failed call. You phoned, got somebody's voicemail and didn't leave a message, or a fax failed (you put in the person's voice number by mistake). With a relatively high number of failed faxes and conversations being just a few seconds long (ie within a small range of possible values), this causes a mode of around two seconds, maybe three.
Fax - A single page takes between 35 and 50 seconds. This causes a huge block of calls that nearly always falls in the middle of an ordered list of call lengths. There are still fewer of these calls than of the connected but failed calls, and they are spread over a greater range of possible values.
Conversation - Well... people can talk for quite some while. Because some people can talk for a very long time, it skews the mean much higher. The calls are long but infrequent, however, and so they have little or no effect on the median or mode.
A graph of number of calls against time shows a huge peak between one and ten seconds and a slightly smaller peak between 35 and 50 seconds. After that, it tails off.