*Tracy Payne, Ph.D.*Show bio

Tracy earned her doctorate from Vanderbilt University and has taught mathematics from preschool through graduate level statistics.

Lesson Transcript

Instructor:
*Tracy Payne, Ph.D.*
Show bio

Tracy earned her doctorate from Vanderbilt University and has taught mathematics from preschool through graduate level statistics.

How does the variable vary? This calls for a univariate analysis. There is a lot of information that can be garnered using univariate data. This lesson describes this type of data and the analyses conducted with it.

How many months does it take for avocado plants to produce their fruit? Which illnesses cause the greatest number of deaths? What is the maximum number of children who can ride safely on a schoolbus? What is the typical net worth of an American family? Each of these questions can be answered using univariate data. **Univariate data** is a collection of information characterized by or depending on only one random variable.

Take for example the last question: what is the typical net worth of an American family? We are interested in how responses vary from person to person when asked about their family's net worth. Only, no one would answer this question by providing every response received to the question. Instead, we would want to summarize the data using statistics that represent the majority of people in the population for whom the question is being asked.

Data is gathered for the purpose of answering a question, or more specifically, a research question. Univariate data does not answer research questions about relationships between variables, but rather it is used to describe one characteristic or attribute that varies from observation to observation. To describe how net worth varies, we would use univariate data to find the statistics that represent the center value for all American households along with how the other values spread from that center value.

A researcher would want to conduct a univariate analysis for two purposes. The first purpose would be to answer a research question that calls for a descriptive study on how one characteristic or attribute varies, such as describing how net worth varies from American family to American family.

A second purpose would be to examine how each characteristic or attribute varies before including two variables in a study using bivariate data or more than two variables in a study using multivariate data (**bivariate data** being for a 2-variable relationship and **multivariate data** being for a more than 2-variable relationship). For example, it would be beneficial to examine how net worth per family varies before including it in an analysis that correlates it with a second variable, say, educational attainment.

The statistics used to summarize univariate data describe the data's center and spread. There are many options for displaying such summaries. The most frequently used illustrations of univariate data are:

- Frequency distributions
- Histograms
- Stem and leaf plots
- Box and whisker plots
- Pie charts

The measures of central tendencies each tell us something different about the data, and each measure has advantages and disadvantages to its use.

The mean, which is calculated as the sum of all data points divided by the total number of data points, is the only measure that considers all the data in the set in determining the center point; it's also known as the average. The disadvantage of using the mean is that, by using all the data, very small or very large values heavily influence the calculation. In 2010, the mean net worth per American family was estimated at $463,800 by the Federal Reserve.

The median, which is found by putting all data points in order and locating the value that is in the center of all other values, is not influenced by extreme values. The median is a good indicator of center because half of the values fall above the median and half fall below. The disadvantage of using the median is in finding the median, which is time consuming for large datasets unless using a calculator or computer program. In 2010, the Federal Reserve estimated the median net worth per American family was $57,000.

In a normally distributed dataset, the mean and median would be equal, or at least very close together. Which of these two measures of center would be representative of your family's net worth: $463,800 or $57,000?

The mode, which is the value or values that appear in the set of data most frequently, is not used often with numerical data. With numerical data there may not be a mode or there may be too many modes. Take the family net worth example, You might have as many data points for the value $100,000 as you have for $101,000 and $102,000, etc.

**Univariate data** is a collection of information characterized by or depending on only one random variable. This type of data does not answer research questions about relationships between variables, but rather it is used to describe one characteristic or attribute that varies from observation to observation. This is opposed to **bivariate data**, which is for a 2-variable relationship and **multivariate data**, which is for a more than 2-variable relationship.

To describe univariate data requires univariate analyses. A researcher would use univariate data for a descriptive study on how one characteristic or attribute varies or to examine how each characteristic or attribute varies before including that variable in a study with two or more variables. A univariate analysis describes the mean, median, mode, and range of the data.

Univariate data can be illustrated using:

- Frequency distributions
- Histograms
- Stem and leaf plots
- Box and whisker plots
- Pie charts

To unlock this lesson you must be a Study.com Member.

Create your account

Are you a student or a teacher?

Already a member? Log In

BackRelated Study Materials

Browse by subject