Covariance & Correlation: Equations & Examples

Instructor: Maria Airth

Maria has a Doctorate of Education and over 15 years of experience teaching psychology and math related courses at the university level.

Covariance and correlation are not the same, but they are closely related to each other. This lesson reviews these two statistical measures with equations, explanations, and real-life examples.

Why Do Covariance and Correlation Matter?

You are the new owner of a small ice-cream shop in a little village near the beach. You noticed that there was more business in the warmer months than the cooler months. Before you alter your purchasing pattern to match this trend, you want to be sure that the relationship is real.

How can you be sure that the trend you noticed is real? Covariance and correlation are two measures that can tell you, statistically, whether or not a real relationship exists between the outside temperature and the number of customers you have. In this way, you can make an informed choice about your purchasing pattern.

Covariance is a statistical measure that shows whether two variables are related by measuring how the variables change in relation to each other. This is clear when you break down the word. Co- as a prefix often indicates some sort of joint action (like co-workers, co-owners, coordinate) and variance refers to variation or change. So, covariance measures how two things change together. It tells you if there is a relationship between two things and which direction that relationship is in.

Correlation, like covariance, is a measure of how two variables change in relation to each other, but it goes one step further than covariance in that correlation tells how strong the relationship is.

Let's work through these two statistical measures one at a time to get a good understanding of them.

To get started, we'll assume that you gathered data on six different days and created this chart:

Temperature    Number of Customers
98             15
87             12
90             10
85             10
95             16
75             7

Covariance

So, we know that covariance is the measure of whether or not two variables vary (or change) in a predictable way together. This could be positive covariance, meaning as one increases the other increases, or negative covariance, meaning that as one increases the other decreases.

The formula for covariance is:


$$\operatorname{Cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$


Wow, it looks a bit scary! Don't worry. It isn't as scary as it looks.

Walking through this formula, we see that the covariance of the two variables (x, y) is found in three steps: subtract each variable's mean from each of its values, multiply the paired differences together and add up those products, and then divide that sum by one less than the number of pairs in the data set. The x and y with an overbar (a line on top) represent the means of each variable.

Okay, that was a bit of a mouthful as well. Again, not as hard as it sounds.

First you need to find the mean of each variable. Typically, we call the first mentioned variable x, so that would be temperature, and the second variable y, that would be the number of customers in our example.

So, the mean of x is (98 + 87 + 90 + 85 + 95 + 75) / 6 ≈ 88.33.

The mean of y is (15 + 12 + 10 + 10 + 16 + 7) / 6 ≈ 11.67.
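
As a quick check on the arithmetic, the two means can be computed in a few lines of Python. This is only a sketch: the variable names temperature and customers are chosen here for illustration and are not part of the lesson.

```python
from statistics import mean

# Data from the chart: one entry per observed day.
temperature = [98, 87, 90, 85, 95, 75]  # x
customers = [15, 12, 10, 10, 16, 7]     # y

mean_x = mean(temperature)
mean_y = mean(customers)

# Round to two decimal places, as in the hand calculation.
print(round(mean_x, 2), round(mean_y, 2))  # 88.33 11.67
```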

Now you subtract the mean of each variable from each of its values, and then multiply the two differences for each day together.


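Temperature (x)    Customers (y)    x - mean    y - mean    Product
98                 15               9.67        3.33        32.20
87                 12               -1.33       0.33        -0.44
90                 10               1.67        -1.67       -2.79
85                 10               -3.33       -1.67       5.56
95                 16               6.67        4.33        28.88
75                 7                -13.33      -4.67       62.25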


The next step is to add all the products together, which yields the value 125.66.

The final step is to divide by (n-1) = 6 - 1 = 5.

125.66/5 = 25.132

The covariance of this set of data is 25.132. The number is positive, so we can state that the two variables do have a positive relationship; as temperature rises, the number of customers in the store also rises.
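
The steps above can be sketched in Python. The function name sample_covariance is an illustrative choice, not something defined in the lesson. Because the code carries full precision instead of rounding the means to two decimal places, its result (about 25.13) differs very slightly from the hand-calculated 25.132.

```python
def sample_covariance(xs, ys):
    """Sum of the products of paired deviations, divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # One product per day: (x - mean of x) * (y - mean of y)
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


temperature = [98, 87, 90, 85, 95, 75]  # x
customers = [15, 12, 10, 10, 16, 7]     # y

cov_xy = sample_covariance(temperature, customers)
print(round(cov_xy, 2))  # 25.13
```

A positive result means the two lists tend to rise together; swapping the order of the two arguments gives the same value.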

What this doesn't tell us is how strong this relationship is. To find the strength, we need to continue on to correlation.

Correlation

To determine the strength of a relationship, you must use the formula for the correlation coefficient. This formula will result in a number between -1 and 1, with -1 being a perfect inverse correlation (the variables move in opposite directions reliably and consistently), 0 indicating no linear relationship between the two variables, and 1 being a perfect positive correlation (the variables reliably and consistently move in the same direction as each other).

The formula is:


$$r = \frac{\operatorname{Cov}(x, y)}{s_x \, s_y}$$


The correlation coefficient is represented with an r, so this formula states that the correlation coefficient equals the covariance between the variables divided by the product of the standard deviations of each variable.
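
A minimal Python sketch of this formula, applied to the ice-cream shop data, might look like the following. It assumes the sample standard deviation (dividing by n - 1) so that it matches the covariance formula; the function name correlation_coefficient is an illustrative choice.

```python
from math import sqrt


def correlation_coefficient(xs, ys):
    """r = covariance / (std dev of x * std dev of y), all using n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when a variable never changes")
    return cov / (s_x * s_y)


temperature = [98, 87, 90, 85, 95, 75]  # x
customers = [15, 12, 10, 10, 16, 7]     # y

r = correlation_coefficient(temperature, customers)
print(round(r, 2))  # about 0.91 for this data
```

A value this close to 1 would suggest a strong positive relationship between temperature and the number of customers, although six days is a small sample.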
