# What are Descriptive Statistics, and Why are They Useful?

## What Is Descriptive Statistics?

Statistics has two main branches: descriptive and inferential. The main difference is that descriptive statistics describe a data set as it is, while inferential statistics use the data to make predictions that go beyond the values in the data set.

### What Is the Purpose of Descriptive Statistics?

As mentioned previously, **descriptive statistics** refers to statistical calculations that describe a data set as it appears. That is the meaning of the term, but what is its purpose? One common example from sports is the batting average. It is calculated from every instance of a player taking their place at bat, and it describes the proportion of those times when the player gets a hit. This is a descriptive statistic that most people encounter quite frequently. Another example is the grade point average. It converts letter grades into numerical values and calculates a weighted average based upon the number of credits each course is worth. Descriptive statistics help quantitative research enormously, as they quantify key aspects of the data for direct comparison.

## What Are Descriptive Statistics?

Imagine that you are interested in measuring the level of anxiety of college students during finals week in one of your courses. You have 11 study participants rate their level of anxiety on a scale from 1 to 10, with 1 being 'no anxiety' and 10 being 'extremely anxious.' You collect the ratings and review them. The ratings are 8, 4, 9, 3, 5, 8, 6, 6, 7, 8, and 10. Your teacher asks you for a summary of your findings. How do you summarize this data? One way we could do this is by using descriptive statistics.

**Descriptive statistics** are used to describe or summarize data in ways that are meaningful and useful. For example, it would not be useful to know that all of the participants in our example wore blue shoes. However, it would be useful to know how spread out their anxiety ratings were. Descriptive statistics is at the heart of all quantitative analysis.

So how do we describe data? There are two ways: measures of central tendency and measures of variability, or dispersion.

## Descriptive Statistics Examples

Batting average and grade point average are two familiar descriptive statistics. Here is a more abstract example: consider the data set {2, 5, 7, 6, 8, 9, 5, 7, 10, 4}. What is the mean of this data set?

The **mean** referred to here is the *average* of a data set, and it is calculated by taking the sum of the data set and dividing that sum by the size of the data set. In other words, for a data set containing *n* elements,

$$\overline{x} = \frac{\sum x}{n} $$

where *x* is an element of the data set, and *n* is the number of elements in the data set. This mean is written $\overline{x}$, pronounced "x-bar," and it is one of the most common and useful descriptive statistics.

For this data set, the mean is:

$$\overline{x} = \frac{\sum x}{n} = \frac{63}{10} = 6.3 $$
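The same arithmetic can be checked with a few lines of Python. This is only a minimal sketch of the formula above; the variable names (`data`, `total`, `n`, `mean`) are illustrative choices, not part of the lesson.

```python
# Data set from the example above.
data = [2, 5, 7, 6, 8, 9, 5, 7, 10, 4]

# Mean = (sum of all elements) / (number of elements).
total = sum(data)   # the numerator, sum of x
n = len(data)       # the denominator, number of elements
mean = total / n

print(f"sum = {total}, n = {n}, mean = {mean}")
```

Running it should reproduce the sum of 63 over 10 elements shown in the worked equation.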

## Types of Descriptive Statistics

This lesson focuses on two types of descriptive statistics: *measures of central tendency*, also called *measures of center*, and *measures of dispersion*, also called *measures of variability* or *spread*. The former describe the value(s) about which the data set seems to be clustered, while the latter describe how spread out the data is. By considering the two together, one can determine a "typical" value for the data set and how far away from that typical value a data point is likely to be.

More broadly, there are four kinds of descriptive statistics: **measures of frequency**, **measures of central tendency**, **measures of dispersion**, and **measures of position**. In this article, the focus is mainly on measures of central tendency and dispersion. **Measures of frequency** are concerned with how often values occur in a data set. These statistics include frequency, or counts, and relative frequency, or proportions. **Measures of position** include percentile rank and quartile rank (which is itself a subset of percentile rank).

Now, take a closer look at measures of central tendency and measures of dispersion.

### Measures of Central Tendency

The **measures of central tendency** in statistics refer to the "middle" or "average" of a data set. The three measures of central tendency are:

- Mean - the average of a data set
- Median - the middle of a data set
- Mode - the value which appears most often in a data set

**Example 1**. Consider the data set {2,3,3,4,5,5,6,7,7,7,8}. Find the mean, median, and mode.

The **mean** of the data set is its average:

$$\overline{x} = \frac{\sum x}{n} = \frac{57}{11} \approx 5.18 $$

The **median** of the data set is the value in the middle. The data set is already in numerical order, so all one needs to do is find the middle term. There are 11 elements in the set, so the middle is the one data point with as many terms before it as after it:

$$2,3,3,4,5, \color{red}5, 6,7,7,7,8 $$

The middle value is 5, so the median is 5. Compare this to the mean, which is 5.18. The mean is a little higher. Why?

Let's look at the **mode** for a clue. The mode is the value that appears most often. There are three 7's in the data set, and nothing else appears that often, so the mode is 7.

This could be a factor in pulling the mean a little higher since 7 is larger than 5, but even one large outlier (that is, a value far outside the range of the rest of the data set) could change the mean drastically.

Notice that mean and median rely on the data points having numerical values, so these measures of the center may only be used with *quantitative data.* On the other hand, the mode is the only measure of the center which can be used for *qualitative data.*
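As a quick check of Example 1, the three measures can be computed with Python's standard `statistics` module. This is a sketch rather than the lesson's own method, and the `colors` list is a made-up illustration of qualitative data.

```python
import statistics

# Data set from Example 1.
data = [2, 3, 3, 4, 5, 5, 6, 7, 7, 7, 8]

mean = statistics.mean(data)      # 57 / 11
median = statistics.median(data)  # middle (6th) value of the sorted data
mode = statistics.mode(data)      # most frequent value

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")

# The mode also works for qualitative data, where a mean or median is undefined.
colors = ["brown", "blue", "brown", "green"]  # hypothetical labels, for illustration only
color_mode = statistics.mode(colors)
print(f"most common color: {color_mode}")
```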

### Measures of Dispersion

The measures of dispersion describe how spread out a data set is. These are sometimes also called measures of variability or measures of spread. The simplest measure of dispersion is the **range.**

The **range** of a data set is the difference between the largest and smallest values in the data set, calculated with the simple formula *max - min*.

The **standard deviation** of a data set can be thought of as the *typical distance from the mean* of a data point in the set. It is calculated with this formula:

$$s = \sqrt{\frac{\sum(x - \overline{x})^2}{n-1}} $$

(Note: This formula, in particular, is for calculating the standard deviation of a *sample* data set. To calculate the standard deviation of a *population*, use *n* rather than *n-1* in the denominator.)

This formula, as one may guess, can get a little tedious for larger data sets, so often the standard deviation is calculated using technology.

Finally, the third measure of dispersion is called the **variance**. The variance is the square of the standard deviation, $s^2$, so it can be calculated by simply squaring the standard deviation. Equivalently, the standard deviation is the square root of the variance.

**Example 2** Consider the data set {2,3,3,4,5,5,6,7,7,7,8}. Find the range, standard deviation, and variance.

The **range** is easily calculated: simply subtract the smallest data point from the largest: 8 - 2 = 6.

The **standard deviation**, calculated using either the formula or technology, is about *s* = 1.99.

Finally, the **variance** is the square of the standard deviation: $s^2 \approx 3.964$.

These values give us a better idea of how spread out the data is. The larger the standard deviation, the larger the variance, and the more spread out a data set is.
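Since the standard deviation is tedious by hand, here is a minimal sketch of Example 2 using Python's standard `statistics` module. It assumes the data are treated as a sample, so `stdev` and `variance` divide by *n* - 1, matching the sample formula above.

```python
import statistics

# Data set from Example 2.
data = [2, 3, 3, 4, 5, 5, 6, 7, 7, 7, 8]

data_range = max(data) - min(data)      # largest value minus smallest value
s = statistics.stdev(data)              # sample standard deviation (divides by n - 1)
s_squared = statistics.variance(data)   # sample variance, the square of s

print(f"range = {data_range}, s = {s:.2f}, s^2 = {s_squared:.3f}")
```

For a population rather than a sample, `statistics.pstdev` and `statistics.pvariance` divide by *n* instead.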

## Inferential Statistics

In addition to descriptive statistics, the study of statistics involves **inferential statistics**. This branch of statistics uses descriptive statistics drawn from sample data to make inferences, or predictions, about entire populations.

For example, suppose that a potato chip company claims that its bags contain a mean of 8 oz. of chips. Someone who suspects that the true mean amount is lower can use inferential statistics to test that suspicion. To do so, they begin by taking a sample of the 8 oz. bags and finding the sample mean weight. This sample mean functions as a point estimate which they can compare to the claimed population mean. If the sample mean is low enough to meet certain criteria, they have sufficient reason to doubt the potato chip maker's claim. Inferential statistics is a powerful tool with many real-world applications, but even it relies on descriptive statistics to make its predictions.

## Lesson Summary

**Descriptive statistics** is a term that describes some widely used quantities which can be used to describe data sets. The term "descriptive statistics" is used in counterpoint to **inferential statistics**, which makes predictions about entire populations not necessarily represented within the data set.

Descriptive statistics include **measures of central tendency** and **measures of dispersion**.

The measures of central tendency include:

- The **mean**, or average, calculated with the formula $\overline{x} = \frac{\sum x}{n}$.
- The **median**, or middle value of the data: the value with as many data points above it as below.
- The **mode**, or the data point which appears most often.

The measures of dispersion include:

- The **range**, calculated with *max - min*.
- The **standard deviation**, or typical distance from the mean, calculated with $s = \sqrt{\frac{\sum(x - \overline{x})^2}{n-1}}$ for samples, or $\sigma = \sqrt{\frac{\sum(x - \overline{x})^2}{n}}$ for populations. (Use of technology such as a calculator is recommended.)
- The **variance**, which is the square of the standard deviation.

These descriptive statistics are valuable because they describe a data set just as it is. The measures of central tendency describe the value around which the data points seem to be clustered, whereas the measures of dispersion describe how spread out the data is. By using these two kinds of measures in concert, statisticians can perform a great many more complicated calculations, including those involved in inferential statistics.

## Measures of Central Tendency

You are probably somewhat familiar with the mean, but did you know that it is a measure of central tendency? **Measures of central tendency** use a single value to describe the center of a data set. The mean, median, and mode are the three measures of central tendency.

The **mean**, or average, is calculated by finding the sum of the study data and dividing it by the total number of data points. The **mode** is the number that appears most frequently in the set of data.

The **median** is the middle value in a set of data. It is calculated by first listing the data in numerical order, then locating the value in the middle of the list. When working with an odd number of data points, the median is the middle number. For example, the median in a set of 9 data points is the number in the fifth place. When working with an even number of data points, you find the average of the two middle numbers. For example, in a data set of 10, you would find the average of the numbers in the fifth and sixth places.
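The odd and even cases can be sketched as a small Python function. The function name `median` and the error handling for an empty list are illustrative choices; the two data sets are the 11 anxiety ratings and the 10-element set from earlier in this article.

```python
def median(values):
    """Return the middle value of a data set (average of the two middle values if n is even)."""
    ordered = sorted(values)          # list the data in numerical order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the two middle values


odd_example = [8, 4, 9, 3, 5, 8, 6, 6, 7, 8, 10]   # 11 values: median is in the 6th place
even_example = [2, 5, 7, 6, 8, 9, 5, 7, 10, 4]     # 10 values: average of 5th and 6th places

odd_median = median(odd_example)
even_median = median(even_example)
print(odd_median, even_median)
```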

The mean and median can only be used with numerical data. The mode can be used with both numerical and **nominal data**, or data in the form of names or labels. Eye color, gender, and hair color are all examples of nominal data. The mean is the preferred measure of central tendency since it considers all of the numbers in a data set; however, the mean is extremely sensitive to **outliers**, or extreme values that are much higher or lower than the rest of the values in a data set. The median is preferred in cases where there are outliers, since the median only considers the middle values.
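To see the effect of an outlier, the sketch below adds one extreme value to the anxiety ratings from this lesson. The value 100 is purely hypothetical (it is not even possible on the 1 to 10 scale) and is used only to show how differently the mean and median react.

```python
import statistics

ratings = [8, 4, 9, 3, 5, 8, 6, 6, 7, 8, 10]

# Hypothetical outlier, added only for illustration; it is not part of the study data.
with_outlier = ratings + [100]

mean_before = statistics.mean(ratings)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(ratings)
median_after = statistics.median(with_outlier)

print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")
print(f"median: {median_before} -> {median_after}")
```

The mean roughly doubles, while the median moves by only half a point.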

Knowing what we know, let's calculate the mean, median, and mode using the example from before. Again, the anxiety ratings of your classmates are 8, 4, 9, 3, 5, 8, 6, 6, 7, 8, and 10.

Mean: (8 + 4 + 9 + 3 + 5 + 8 + 6 + 6 + 7 + 8 + 10) / 11 = 74 / 11 ≈ 6.73. The mean is 6.73.

Median: In a data set of 11, the median is the number in the sixth place. 3, 4, 5, 6, 6, **7**, 8, 8, 8, 9, 10. The median is 7.

Mode: The number 8 appears more than any other number. The mode is 8.
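One way to confirm these three results is to count how often each rating occurs. The sketch below uses `collections.Counter` from Python's standard library; the variable names are illustrative.

```python
from collections import Counter

ratings = [8, 4, 9, 3, 5, 8, 6, 6, 7, 8, 10]

# Mean: sum of the ratings divided by how many there are.
mean = sum(ratings) / len(ratings)

# Median: with 11 values, the 6th value of the sorted list (index 5).
median = sorted(ratings)[len(ratings) // 2]

# Mode: the rating with the highest count.
counts = Counter(ratings)
mode, mode_count = counts.most_common(1)[0]

print(f"mean = {mean:.2f}, median = {median}, mode = {mode} (appears {mode_count} times)")
```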

## Measures of Dispersion

We've got some pretty solid numbers on our data now, but let's say that you wanted to look at how spread out the study data are from a central value, i.e. the mean. In this case, you would look at **measures of dispersion**, which include the range, variance, and standard deviation.

The simplest measure of dispersion is the **range**. This tells us how spread out our data is. In order to calculate the range, you subtract the smallest number from the largest number. Just like the mean, the range is very sensitive to outliers.

The **variance** is a measure of the average squared distance of the data from its mean. Because it is expressed in squared units, the variance is rarely interpreted on its own. It is typically used in order to calculate other statistics, such as the standard deviation. The higher the variance, the more spread out your data are.

There are four steps to calculate the variance:

- Calculate the mean.
- Subtract the mean from each data value. This tells you how far each value lies from the mean.
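- Square each of these differences so that you now have all positive values, then find the sum of the squares.
- Divide the sum of the squares by the total number of data points in the set. (This treats the data as the entire population of interest; when the data are a sample from a larger population, the sum is usually divided by one less than the number of data points instead.)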
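The four steps can be sketched as a short Python function. The name `population_variance` is an illustrative choice; it divides by the total number of data points, exactly as the last step describes, and is applied here to the anxiety ratings from this lesson.

```python
def population_variance(values):
    """Variance following the four steps: mean, differences, squares, divide by n."""
    n = len(values)
    mean = sum(values) / n                              # Step 1: calculate the mean
    differences = [x - mean for x in values]            # Step 2: subtract the mean from each value
    sum_of_squares = sum(d ** 2 for d in differences)   # Step 3: square and sum
    return sum_of_squares / n                           # Step 4: divide by the number of data points


ratings = [8, 4, 9, 3, 5, 8, 6, 6, 7, 8, 10]
variance = population_variance(ratings)
print(f"variance = {variance:.2f}")
```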

The **standard deviation** is the most popular measure of dispersion. It provides an average distance of the data set from the mean. Like the variance, the higher the standard deviation, the more spread out your data are. Unlike the variance, the standard deviation is measured in the same unit as the original data, which makes it easier to interpret. It is calculated by finding the square root of the variance.

Let's calculate the measures of dispersion using the examples from before. Remember, the anxiety ratings of your classmates are 8, 4, 9, 3, 5, 8, 6, 6, 7, 8, and 10.

Range: 10 (the largest number) - 3 (the smallest number). The range is 7.

Variance:

This is a multi-step process:

- Calculate the mean. As found above, it is 6.73.
- Subtract the mean from each data value.
- Square each of these differences so that you now have all positive values, then find the sum of the squares, which is 46.18.
- Divide the sum of the squares by the total number of data points in the set to get our variance: 46.18 / 11 ≈ 4.20.

Standard deviation: The square root of 4.20 is 2.05, so the standard deviation is 2.05.
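These figures can be reproduced with Python's standard `statistics` module. In this sketch, `pvariance` and `pstdev` divide by the number of data points, which matches the calculation above; `variance` and `stdev` divide by one less and are shown only for comparison.

```python
import statistics

ratings = [8, 4, 9, 3, 5, 8, 6, 6, 7, 8, 10]

data_range = max(ratings) - min(ratings)

# Population versions: divide the sum of squares by n (as in the steps above).
pop_variance = statistics.pvariance(ratings)
pop_std = statistics.pstdev(ratings)

# Sample versions: divide by n - 1, so they come out slightly larger.
sample_variance = statistics.variance(ratings)
sample_std = statistics.stdev(ratings)

print(f"range = {data_range}")
print(f"population: variance = {pop_variance:.2f}, standard deviation = {pop_std:.2f}")
print(f"sample:     variance = {sample_variance:.2f}, standard deviation = {sample_std:.2f}")
```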

## Inferential Statistics

We know a lot about our initial data sampling now. But imagine that you wanted to take the anxiety ratings of the study participants and use them to draw the conclusion that the anxiety level of all college students during finals week is pretty high. You could not do this with descriptive statistics, since they simply describe the data in your study.

However, you could use inferential statistics. **Inferential statistics** allow you to take the data in your study sample and use it to draw conclusions, or inferences, that extend beyond the study participants. In other words, descriptive statistics provide a description of the study data; inferential statistics allow us to make inferences from our study data to the general population.

## Lesson Summary

**Descriptive statistics** describe or summarize a set of data. **Measures of central tendency** and **measures of dispersion** are the two types of descriptive statistics. The mean, median, and mode are three types of measures of central tendency. The range, variance, and standard deviation are three types of measures of dispersion. Inferential statistics allow us to draw conclusions from our data set to the general population.

## Descriptive Statistics - Vocabulary & Definitions

- **Descriptive Statistics:** A method used for describing or summarizing data in a meaningful manner.
- **Measures of Central Tendency:** A type of descriptive statistics that uses a single value to describe the center of a data set. This includes the mean, median, and mode.
- **Measures of Dispersion:** Another type of descriptive statistics that uses a collection of values to determine how spread out the data is from the central number. This includes the range, variance, and standard deviation.
- **Inferential Statistics:** A method used for drawing conclusions that extend beyond the data examined in a study.

## Learning Outcomes

Knowledge of descriptive statistics as presented in this lesson could enhance your ability to:

- Give the definition of descriptive statistics
- Understand how to use descriptive statistics in measuring central tendency and dispersion
- Explain what inferential statistics are and the way in which they're used to construct conclusions

## Descriptive Statistics Thought Questions

#### Calculation Exercise

Suppose there is a group of 10 students taking a class at NoGo University. The students took a test and got the following grades: 65, 72, 78, 80, 81, 86, 89, 91, 92, 99. Calculate the descriptive statistic values of mean, median, and mode for this data.

#### Research Exercise

The lesson mentioned the measures of central tendency to be the mean, median, and mode. Search on the Internet to find other measures of central tendency. The lesson mentioned the measures of dispersion to be the range, variance, and standard deviation. Search again on the Internet to find other measures of dispersion. Which measures of central tendency are the most commonly used? Which measures of dispersion are most commonly used? Why are the other measurements of central tendency and of dispersion not commonly used? Write a report explaining your findings.

#### Calculation Exercise

The Mega Calorie Diet has been designed to help people gain weight. A group of five people on this diet gain 10 pounds, 15 pounds, 14 pounds, 8 pounds, and 13 pounds. The inventors of the diet want to know the measures of dispersion for this data, namely, range, variance, and standard deviation. Please calculate these values for them.

#### Discussion Exercise

Descriptive statistics are used frequently in quality assurance to describe a sample from a manufacturing process. Both the measures of central tendency and dispersion are monitored. Discuss what is happening with the items produced if the measures of central tendency are too high or too low. Then discuss what is happening with the items produced if the measures of dispersion are high or low. Is it possible for the measures of central tendency to be extreme, but the measures of dispersion to be fine? Is it possible for the measures of dispersion to be extreme, but the measures of central tendency to be fine?

#### What are the four types of descriptive statistics?

The four types of descriptive statistics are measures of frequency, measures of central tendency, measures of dispersion, and measures of position.

Measures of frequency include the count, frequency, and relative frequency. Measures of central tendency include the mean, median, and mode. Measures of dispersion include the range, standard deviation, and variance. Measures of position include percentile and quartile ranks.

#### What is an example of descriptive statistics in a research study?

Descriptive statistics examples in a research study include the mean, median, and mode. Studies also frequently cite measures of dispersion including the standard deviation, variance, and range. These values describe a data set just as it is, so it is called descriptive statistics.

#### What do you mean by descriptive statistics?

Descriptive statistics describe a data set as it is. In other words, descriptive statistics does not attempt to draw any conclusions for broader data sets or entire populations.

Measures of central tendency describe the value(s) around which a data set seems clustered, and measures of dispersion show how widespread the data is, based only on the information in the sample.
