Betsy teaches college physics, biology, and engineering and has a Ph.D. in Biomedical Engineering
In this lesson, learn about measures of central tendency like mean, median and mode. Learn how to calculate these measures and how to choose which ones to use in any situation.
Measures of Central Tendency
Susan is a third grade teacher, and in her classroom, there's a small library of books that students can check out to read. Susan decides to track the number of books checked out by each student for a month, recording all the data she collects in a table like this:
Once Susan has compiled her data, how can she determine the typical number of books checked out by the children in her class and summarize and present this information? Well, Susan can use one of the common measures of central tendency that represent the central position of a set of data.
Mean, Median & Mode
There are three important measures of central tendency commonly used to summarize a set of data: mean, median, and mode. The mean is the average of all the values. To find the mean of the data shown in this table, you'd add up the total number of books read and divide it by the number of students.
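As a minimal sketch of that calculation in Python, assume a made-up list of monthly checkout counts. The values below are purely illustrative and are not Susan's actual table.

```python
from statistics import mean

# Hypothetical number of books checked out by each of 12 students
# (illustrative values only, not the data from the lesson's table).
books_per_student = [0, 1, 2, 2, 3, 3, 3, 3, 4, 5, 6, 8]

# Mean = total books checked out / number of students.
total_books = sum(books_per_student)
num_students = len(books_per_student)
mean_books = total_books / num_students

# The standard library's mean() should agree with the manual calculation.
assert abs(mean_books - mean(books_per_student)) < 1e-9

print(f"{total_books} books / {num_students} students = {mean_books:.2f}")
```

With these made-up numbers, 40 books divided among 12 students gives a mean of about 3.33 books per student.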
The median is the number that falls in the exact center of all the measurements. To find the median, arrange all of the data points from smallest to largest and find the one that's right in the middle. If there's an even number of measurements, take the value halfway between the two middle values (that is, their average).
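One way to sketch those steps in Python is shown here. The function name and the sample lists are hypothetical, chosen only to show the odd-count and even-count cases.

```python
def median(values):
    """Return the middle value, or the average of the two middle values."""
    if not values:
        raise ValueError("median() needs at least one value")
    ordered = sorted(values)   # arrange from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one value sits in the middle.
        return ordered[mid]
    # Even count: average the two values on either side of the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_count = [4, 1, 3, 0, 2]   # sorted: 0 1 2 3 4, middle value is 2
even_count = [5, 1, 3, 2]     # sorted: 1 2 3 5, halfway between 2 and 3

print(median(odd_count))    # 2
print(median(even_count))   # 2.5
```

Python's built-in statistics module also offers a median function that behaves the same way for these inputs.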
The third measure of central tendency is the mode of the data set, or the value that occurs most frequently. For Susan's data, the mode would be 3, because 4 students read 3 books and this was the most common number of books read. Although there was only one mode in this case, a data set can have more than one mode if two or more values tie for the highest number of occurrences.
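To see how one mode and tied modes come out in practice, here is a small Python sketch. The helper name and both lists are assumptions made up for illustration.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest count, in sorted order."""
    counts = Counter(values)            # how many times each value occurs
    highest = max(counts.values())      # the largest number of occurrences
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical data sets, not the lesson's table.
one_mode = [0, 1, 2, 2, 3, 3, 3, 3, 4, 5, 6, 8]   # 3 appears four times
two_modes = [1, 2, 2, 3, 3, 4]                    # 2 and 3 each appear twice

print(modes(one_mode))    # [3]
print(modes(two_modes))   # [2, 3]
```

Returning a list keeps the tied case and the single-mode case in the same shape, so the caller never has to guess whether there was a tie.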
Measures of Variability
In addition to reporting the mean, median, or mode of a data set, it's often helpful to know something about how spread out the data is as well. There are several ways to measure a quantity known as variability, or the amount of spread in a set of data.
One way to describe the variability in a data set is to calculate the range, or the difference between the highest and lowest values in a data set. In our example, range = 8 - 0 = 8 books.
Another common measure of variability is known as standard deviation, which measures how far the measurements typically fall from the mean. If the standard deviation is low, it means that most of the values fall near the mean, so the variability is low. If a lot of the values are far from the mean, then the variability - and, therefore, the standard deviation - will be high.
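The two measures of spread can be compared with a short Python sketch. Both data sets are invented so that they share the same mean (5) but differ in spread. The sketch uses the population standard deviation (pstdev), which is an assumption, since the lesson doesn't say whether the population or sample version is meant.

```python
from statistics import pstdev


def value_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)


# Hypothetical data sets with the same mean (5) but different spread.
tight = [4, 5, 5, 5, 6]     # values cluster near the mean
wide = [0, 2, 5, 8, 10]     # values sit far from the mean

for name, data in (("tight", tight), ("wide", wide)):
    print(f"{name}: range = {value_range(data)}, std dev = {pstdev(data):.2f}")
```

The tightly clustered set has a range of 2 and a standard deviation of about 0.63, while the spread-out set has a range of 10 and a standard deviation of about 3.69.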
How to Choose a Measure
Now that you know about the common ways to measure central tendency and variability, how do you determine which one is best to use?
If the data follows a normal distribution, the data points are spread symmetrically on either side of the mean, with most values clustered near it.
For normally distributed data, the median and mean will be very similar, so either one can be used; however, using the mean is the more common approach. Sometimes, instead of being normally distributed, the data contains outliers that skew it to one side or the other. Outliers that are far from the mean can shift the mean substantially, so that it no longer reflects the midpoint of the data. If there are outliers and the data doesn't appear to be normally distributed, then the median is a better choice than the mean because it more accurately reflects the true midpoint of your data.
Let's look at another example. Suppose that the owner of a restaurant wants to know how many pizzas he should expect to sell on a typical day. He records the number sold for several days, but on one day, there is a big order and a lot more pizzas than usual are sold. In this case, it would be better to use the median to represent a typical day since that one large order could have a big influence on the mean, but not the median.
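A quick Python sketch shows how much a single big order can move the mean while leaving the median alone. The daily sales figures are hypothetical, with 95 standing in for the day of the big order.

```python
from statistics import mean, median

# Hypothetical pizzas sold per day; 95 represents the one big order.
pizzas_sold = [18, 20, 22, 19, 21, 95, 20]

# The same week with the unusual day left out, for comparison.
typical_days = [n for n in pizzas_sold if n != 95]

mean_with_outlier = mean(pizzas_sold)
median_with_outlier = median(pizzas_sold)
mean_typical = mean(typical_days)
median_typical = median(typical_days)

print(f"With the big order:    mean = {mean_with_outlier:.1f}, median = {median_with_outlier}")
print(f"Without the big order: mean = {mean_typical:.1f}, median = {median_typical}")
```

In this made-up week, the big order pulls the mean from 20 up to about 30.7 pizzas, while the median stays at 20 either way.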
While both the mean and the median work well for numerical data, the mode is the better measure to use when the data is non-numerical. How can data not have numbers? Well, imagine that you own an ice cream shop and you want to know which ice cream flavors are the most popular. In this case, you'd keep track of all of the orders and then determine the mode (the most commonly ordered flavor). Anytime you want to use your data to determine which item is the most popular, the mode is a good choice.
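Finding the mode of non-numerical data can be sketched in Python as a simple tally. The list of orders is invented for illustration.

```python
from collections import Counter

# Hypothetical list of ice cream orders (non-numerical data).
orders = [
    "vanilla", "chocolate", "mint", "chocolate",
    "vanilla", "chocolate", "strawberry",
]

# Tally each flavor, then take the one with the highest count (the mode).
flavor_counts = Counter(orders)
most_popular, times_ordered = flavor_counts.most_common(1)[0]

print(f"Most popular flavor: {most_popular} ({times_ordered} orders)")
```

Notice that a mean or median of these orders wouldn't make sense, since flavors can't be added up or ranked from smallest to largest.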
Lesson Summary
Measures of central tendency represent the central position of a set of data. The mean is the average of all the values. The median is the number that falls in the exact center of all the measurements, while the mode is the value that occurs most frequently.
There are several ways to measure variability, or the amount of spread in a set of data. One way to measure variability, range, is the difference between the highest and lowest values in a data set. Another common measure of variability is known as standard deviation, which measures how far the measurements in a data set typically fall from the mean value.
Numerical data is represented best by the median or the mean. When data is normally distributed with no outliers, then the mean and median can both be used to represent the true center of the data. However, the median is a better measure to use than the mean when there are outliers or the data is skewed away from a normal distribution. Mode is the best measure to use when data is non-numerical, or in any situation where you want to know the most popular option among a group.