# Univariate Statistics Analysis & Examples

## What is Univariate Data?

**Univariate data** is a term used in statistics to describe data that consists of observations on only one characteristic or attribute. There is only one variable in univariate data. The analysis of univariate data is thus the most basic type of analysis because it deals with only one variable. It does not examine causes or relationships; its primary objective is to describe the data and detect patterns within it. The salaries of workers in an industry are a simple example of univariate data.

The main characteristics of univariate data are as follows:

- Univariate data gathers data around a single, random variable. It describes each variable separately.
- Univariate data describes the variable's response pattern.

### Univariate Data Examples

- The salaries of workers in a specific industry; the variable in this example is workers' salaries.
- The heights of ten students in a class are measured; the variable here is the students' heights.
- A veterinarian wants to weigh 20 puppies; the variable, in this case, is the weight of the puppies.

## Univariate Statistics

**Univariate statistics** focuses on one variable at a time and does not involve testing variables against one another. Rather, it gives the researcher the opportunity to describe individual variables. As a result, this type of statistics is also known as **descriptive statistics**. The patterns found in this type of data can be described using the following:

- Central tendency measures (mean, mode, and median)
- Data dispersion (standard deviation, variance, range, minimum, maximum, and quartiles)
- Frequency distribution tables
- Pie charts
- Frequency polygons
- Histograms
- Bar charts

## Univariate Analysis

**Univariate analysis** is the most fundamental type of statistical data analysis technique. The data in this case has only one variable, and the analysis does not deal with cause-and-effect relationships. Consider conducting a classroom survey. The analysts would like to count how many boys and girls are in the room. The data here describe a single variable, the students' gender, and the count for each of its values. The primary goal of univariate analysis is to describe the data in order to discover patterns: it takes the data, summarizes it, and looks for patterns.

However, the univariate analysis does not look at more than one variable at a time or their relationship. Bivariate analysis is the study of two variables and their relationships. A multivariate analysis is one in which three or more variables are considered at the same time.

There are three common methods for performing univariate analysis:

- Summary Statistics
- Frequency Distributions
- Charts

#### Summary Statistics

The most common way to perform the univariate analysis is to use summary statistics to describe a variable. There are two kinds of summary statistics:

- *Measures of central tendency:* These values describe where the center or middle of the dataset is located. The mean, mode, and median are examples.
- *Measures of dispersion:* These values describe how spread out the values in the dataset are. The range, standard deviation, and variance are examples.

#### Frequency Distributions

A frequency distribution describes how frequently different values occur in a dataset. This acts as another way to perform univariate analysis.
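As a rough illustration, a frequency distribution can be tabulated by counting how often each distinct value occurs. The sketch below is one possible approach, not a prescribed method; the function name `frequency_distribution` is hypothetical, and the data reuses the sibling counts from the mode example later in this lesson.

```python
from collections import Counter


def frequency_distribution(values):
    """Return (value, count) pairs, sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())


# Sibling counts taken from the mode example in this lesson.
siblings = [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 4]

for value, count in frequency_distribution(siblings):
    print(f"{value} sibling(s): {count} student(s)")
```

Each printed row pairs one value of the variable with the number of observations that take that value, which is what a frequency distribution table reports.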

#### Charts

Another method for performing univariate analysis is to create charts that show the distribution of values for a specific variable.

Some common examples are:

- Boxplots
- Histograms
- Density Curves
- Pie Charts

### Mean, Median, and Mode

In a numerical data set, the mean, median, and mode are three different measures of center. They are all attempting to summarize a dataset with a single number representing a typical data point from the dataset.

#### Mean

There are many different types of means, but most people refer to the arithmetic mean when they say mean. The **arithmetic mean**, also known as the mathematical mean, is determined by adding all the given data points and then dividing by the total number of data points.

*Mean = sum of all given data points / total number of data points*

Here is a more formalized version of the same formula:

{eq}\overline{x}=\frac{\sum x_{i}}{n} {/eq}

- Example 1
- Find the mean of the data: 1, 2, 3, 4, 5.

*Mean = sum of all given data points / total number of data points*

Start by adding the data:

1 + 2 + 3 + 4 + 5 = 15

There are a total of 5 numbers.

15/5 = 3

The mean is 3.

- Example 2
- What is the mathematical mean of the following numbers? 10, 6, 4, 4, 6, 4.

*Mean = sum of all given data points / total number of data points*

Start by adding the data:

10 + 6 + 4 + 4 + 6 + 4 = 34

There are a total of 6 numbers.

34/6 ≈ 5.67

The mean is approximately 5.67.
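The two examples above can be checked with a few lines of Python. This is a minimal sketch of the arithmetic mean formula; the function name `arithmetic_mean` is an illustrative choice, not part of the lesson.

```python
def arithmetic_mean(values):
    """Sum of all data points divided by the number of data points."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


print(arithmetic_mean([1, 2, 3, 4, 5]))                # 3.0
print(round(arithmetic_mean([10, 6, 4, 4, 6, 4]), 2))  # 5.67
```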

#### Median

The **median** of the data is the value of the middlemost observation that is acquired after organizing the data in ascending or descending order.

Step 1: Sort the data in ascending or descending order.

Step 2: Determine whether n (number of observations) is even or odd. If n is an odd number, use the following formula:

{eq}Median=\left ( \frac{n+1}{2} \right )^{th}\text{observation} {/eq}.

If n is an even number, use the following formula:

{eq}Median=\frac{\left ( \frac{n}{2} \right )^{th}\text{observation}+\left ( \frac{n}{2}+1 \right )^{th}\text{observation}}{2} {/eq}.

- Example 1
- Find the median of 56, 67, 54, 34, 78, 43, 23.

Arranging in ascending order: 23, 34, 43, 54, 56, 67, 78.

Here, n = 7.

Using the median formula for odd data points:

{eq}Median=\left ( \frac{n+1}{2} \right )^{th} \text{observation} {/eq}

{eq}Median=\left ( \frac{7+1}{2} \right )^{th} \text{observation} {/eq}

{eq}Median= 4^{th}\;\text{observation} {/eq}

The median is 54.

- Example 2
- Find the median of this data: 50, 67, 24, 34, 78, 43.

Arranging in ascending order: 24, 34, 43, 50, 67, 78.

Here, n = 6.

Using the median formula for even data points,

{eq}Median=\frac{\left ( \frac{n}{2} \right )^{th}\text{observation}+\left ( \frac{n}{2}+1 \right )^{th}\text{observation}}{2} {/eq}

{eq}Median=\frac{\left ( \frac{6}{2} \right )^{th}\text{observation}+\left ( \frac{6}{2}+1 \right )^{th}\text{observation}}{2} {/eq}

{eq}Median=\frac{\left ( 3 \right )^{th}\text{observation}+\left ( 4 \right )^{th}\text{observation}}{2} {/eq}

{eq}Median=\frac{43+50}{2} {/eq}

The median is 46.5.
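The two-step procedure (sort, then pick the middle observation or average the two middle ones) can be sketched in Python as follows. The function name `median` is an illustrative choice. Because Python lists are indexed from 0, the {eq}\left ( \frac{n+1}{2} \right )^{th} {/eq} observation sits at index `n // 2`.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)          # Step 1: sort in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # Step 2a: n is odd
        return ordered[mid]
    # Step 2b: n is even, so average the (n/2)th and (n/2 + 1)th observations
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([56, 67, 54, 34, 78, 43, 23]))  # 54
print(median([50, 67, 24, 34, 78, 43]))      # 46.5
```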

#### Mode

A **mode** of data is defined as the value that appears the most frequently in the given data.

- Example 1
- A teacher asked her students how many siblings they each had. Two students report having no siblings, six report having one, three report having two, one reports having three, and one reports having four. Determine the data's mode.

Look for the most frequent value.

0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 4.

The mode is 1 sibling.
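A short Python sketch of the same idea: count each value and keep the most frequent one. The lesson does not say what to do when several values tie for the highest frequency, so returning all tied values is an assumption made here, and the function name `modes` is hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every value that appears with the highest frequency, sorted."""
    if not values:
        raise ValueError("the mode of an empty data set is undefined")
    counts = Counter(values)
    highest = max(counts.values())
    # Assumption: on a tie, report all tied values rather than just one.
    return sorted(v for v, c in counts.items() if c == highest)


siblings = [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 4]
print(modes(siblings))  # [1]
```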

### Variance and Standard Deviation

Variance and standard deviation are two important measures of dispersion in statistics. The **variance** is the average of the squared deviations of the data points from the mean, whilst the **standard deviation** is the square root of the variance. The main practical difference is that the standard deviation is expressed in the same units as the data, while the variance is expressed in those units squared.

#### Variance and Standard Deviation Formulas

The population variance formula is {eq}\sigma^{2}=\frac{1}{N}\sum_{i=1}^{N}\left ( X_{i}-\mu \right )^{2} {/eq}, where:

- {eq}\sigma^{2} {/eq} = Population variance
- N = Number of observations in the population
- {eq}X_{i} {/eq} = ith observation in the population
- {eq}\mu {/eq} = Population mean

The sample variance formula is {eq}s^{2}=\frac{1}{n-1}\sum_{i=1}^{n}\left ( x_{i}-\overline{x} \right )^{2} {/eq}, where:

- {eq}s^{2} {/eq} = Sample variance
- n = Number of observations in the sample
- {eq}x_{i} {/eq} = ith observation in the sample
- {eq}\overline{x} {/eq} = Sample mean

The population standard deviation formula is {eq}\sigma=\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left ( X_{i}-\mu \right )^{2} } {/eq}, where:

- {eq}\sigma {/eq} = Population standard deviation

The sample standard deviation formula is {eq}s=\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left ( x_{i}-\overline{x} \right )^{2}} {/eq}, where:

- s = Sample standard deviation

#### Sample vs. Population

It is critical to understand the distinction between a population and a sample when performing statistical tests. To calculate a population's standard deviation (or variance), measurements must be collected from everyone in the group being studied. A sample, by contrast, consists of measurements taken from only a subset of the population, and its variance and standard deviation are calculated with the sample formulas, which divide by n - 1 instead of N.

- Example 1
- Assume one wants to find the age variance and standard deviation in a group of five close friends. The friends are 25, 26, 27, 30, and 31 years old.

{eq}\sigma^{2}=\frac{1}{N}\sum_{i=1}^{N}\left ( X_{i}-\mu \right )^{2} {/eq}

Firstly, find the mean age:

(25 + 26 + 27 + 30 + 31) / 5 = 27.8

Then, for each of the five friends, compute the deviation from the mean.

25 - 27.8 = -2.8

26 - 27.8 = -1.8

27 - 27.8 = -0.8

30 - 27.8 = 2.2

31 - 27.8 = 3.2

Now, square each deviation and average the results.

{eq}\sigma^{2}=\frac{\left ( -2.8 \right )^{2}+\left ( -1.8 \right )^{2}+\left ( -0.8 \right )^{2}+\left ( 2.2 \right )^{2}+\left ( 3.2 \right )^{2}}{5} {/eq}

= (7.84 + 3.24 + 0.64 + 4.84 + 10.24) / 5 = 26.8 / 5 = 5.36

{eq}\sigma =\sqrt{5.36} \approx 2.32 {/eq}

The standard deviation is about 2.32, which is the square root of the variance. This value indicates that the friends' ages typically differ from the mean age by about 2.32 years.

In the preceding example, the group of five friends was treated as a population. If it had been treated as a sample instead, the sample variance and sample standard deviation formulas would apply, dividing the sum of squared deviations by n - 1 = 4 rather than by N = 5.
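To compare the population and sample formulas on the same data, the calculation can be sketched in Python. The function names and the `sample` flag are illustrative choices, not part of the lesson. The code works from the unrounded mean of 27.8, which gives a population variance of 5.36.

```python
import math


def variance(values, sample=False):
    """Population variance by default; sample variance (n - 1) if sample=True."""
    n = len(values)
    if n < (2 if sample else 1):
        raise ValueError("not enough data points")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - 1 if sample else n)


def standard_deviation(values, sample=False):
    """Square root of the corresponding variance."""
    return math.sqrt(variance(values, sample))


ages = [25, 26, 27, 30, 31]
print(round(variance(ages), 2))                         # 5.36
print(round(standard_deviation(ages), 2))               # 2.32
print(round(variance(ages, sample=True), 2))            # 6.7
print(round(standard_deviation(ages, sample=True), 2))  # 2.59
```

The sample figures are larger because dividing by n - 1 instead of N inflates the average squared deviation slightly.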

- Example 2
- What is the standard deviation for this data set {5, 5, 5, 5, 5}?

The standard deviation is the square root of the variance and can be a measure of how spread out numbers in a data set are from the mean. Think about the data set {5, 5, 5, 5, 5}. The mean is 5, but every number in the set is also 5. There is no difference between any number in this set and the mean, so the standard deviation (and variance) is simply 0.

### Range

The **range** of data in statistics is the spread from the lowest to the highest value in the distribution. It is a widely used measure of variation. Like measures of central tendency, measures of variability provide descriptive statistics for summarizing data sets.

The range is determined by subtracting the lowest value from the highest value. A large range indicates high variability in the distribution, whereas a small range indicates low variability. The range is calculated using the following formula:

{eq}R= H - L {/eq}, where:

- R = Range
- H = The highest value
- L = The lowest value

The range is the simplest measure of variability to calculate. Follow these steps to determine the range:

- Sort the data set values from low to high.
- Subtract the lowest value from the highest value.

This procedure applies whether the values are positive or negative, whole numbers or fractions.

- Example 1
- Find the range from the data set: 4, 6, 9, 3, 7.

{eq}R= H - L {/eq}

{eq}R= 9 - 3 {/eq}

{eq}R= 6 {/eq}

The range is 6.
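The same calculation as a minimal Python sketch. The function name `data_range` is an illustrative choice, and the second data set is hypothetical, included only to show that the procedure also works with negative and fractional values.

```python
def data_range(values):
    """Highest value minus lowest value (R = H - L)."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


print(data_range([4, 6, 9, 3, 7]))   # 6
print(data_range([-2.5, 0, 1.5]))    # 4.0 (hypothetical data)
```

Using `max` and `min` directly avoids the explicit sorting step, since only the two extreme values are needed.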

## Lesson Summary

**Univariate data** assembles information around a single, random variable. Each variable is described separately, and the data describes the response pattern of the variable. The salaries of industry workers are a simple example of univariate data. **Univariate statistics** examines only one variable at a time and does not compare variables to one another. **Descriptive statistics** is another name for univariate statistics. **Univariate analysis** takes data and summarizes it before looking for patterns. The patterns found in this type of data can be described using central tendency measures, data dispersion, frequency distribution tables, pie charts, frequency polygons, histograms, and bar charts.

The **arithmetic mean**, also known as the mathematical mean, is determined by adding all the given data points and then dividing by the total number of data points. The **median** of the data is the value of the middlemost observation that is acquired after organizing the data in ascending or descending order. A **mode** of data is defined as the value that appears the most frequently in the given data. The **variance**, in layman's terms, is a measure of how far a set of data is dispersed from its mean: it is the average of the squared deviations from the mean. The **standard deviation**, the square root of the variance, measures the spread of the data in the same units as the data itself. The **range** of data in statistics is the spread from the lowest to the highest value in the distribution. It is a widely used measure of variation.


#### What are examples of univariate analysis?

Univariate analysis is the most fundamental type of statistical data analysis technique. The data in this case only has one variable and does not have to deal with a cause-and-effect relationship. For example, the analysis could look at a variable such as "age," "height," or "weight."

#### What does univariate mean in statistics?

Univariate data is a term used in statistics to describe data that consists of observations on only one characteristic or attribute. There is only one variable in univariate data. Univariate data describes the variable's response pattern.

#### What is the difference between univariate and bivariate data?

Univariate analysis does not look at more than one variable at a time or their relationship. Bivariate analysis is the study of two variables and their relationships.
