Chi-Square Test of Independence: Example & Formula

Instructor: Bob Bruner

Bob is a software professional with 24 years in the industry. He has a bachelor's degree in Geology, and also has extensive experience in the Oil and Gas industry.

A chi-square test of independence compares the observed and expected counts of categorical data to assess whether two variables are related. This lesson provides the formula and a worked example of a chi-square test.

Evaluating Associations with Chi-Square

If you were comparing grades with your friends after class one day, and it suddenly occurred to you that everyone who wore glasses seemed to get the best grades, could you determine whether this was actually true? You can check this impression statistically with a chi-square test, which measures the differences between the expected and observed frequencies of data.

Categorical Variables

A pre-condition of any chi-square test is that it uses categorical variables, meaning the data are counts of observations that fall into discrete, mutually exclusive categories. In our case, we have two categories of eyesight: Wears Glasses or No Glasses, and every student sampled needs to fit into exactly one of those two categories. Our grades also need to fall into specific categories, such as A, B, C, D, F letter grades, rather than percentages or some other numeric measurement.
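As a rough illustration of what counts taken from discrete, mutually exclusive categories can look like in practice, the sketch below tallies a handful of student records into category counts. The records and the variable names are hypothetical and invented for illustration only; they are not the class data used later in this lesson.

```python
from collections import Counter

# Hypothetical records: each student falls into exactly one eyesight
# category and exactly one letter-grade category.
students = [
    ("Wears Glasses", "A"),
    ("No Glasses", "C"),
    ("Wears Glasses", "B"),
    ("No Glasses", "C"),
    ("Wears Glasses", "B"),
    ("No Glasses", "D"),
]

# Count how many students land in each (eyesight, grade) cell.
cell_counts = Counter(students)

for (eyesight, grade), count in sorted(cell_counts.items()):
    print(f"{eyesight:14s} {grade}: {count}")
```

Because every record carries exactly one eyesight label and one grade label, the cell counts always add up to the number of students sampled.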

There are various types of tests that use the chi-square statistic. In this lesson, we will be deriving a chi-square test of independence, which determines whether two categorical variables are related to one another. We use a chi-square computation and chi-square probabilities to make that assessment.

Calculation Essentials

The chi-square statistic is based on how much each observed count varies from the count that would be expected if there were no relationship between the variables. That assumption of no relationship is referred to as the null hypothesis. Stated another way, the null hypothesis simply declares that the two variables are independent. A small total variance is consistent with the null hypothesis, suggesting the variables are not related, while a large total variance suggests that the variables are related.

The chi-square formula squares each difference between an observed and an expected count, divides it by the expected count, and sums the results to compute the variance. Note that we must provide an expected value for each category. An average can be used if we expect the data to be spread evenly across the categories, or the expected values can be taken from any assumed distribution curve, such as a normal, skewed, or logarithmic curve.


$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$


Here Oi are the observed counts and Ei are the expected counts.
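The formula, the sum over all categories of (Oi - Ei)^2 / Ei, can be sketched in a few lines of Python. This is a minimal sketch rather than a definitive implementation: the function name `chi_square` and the observed counts are hypothetical, and the expected counts are assumed to be the simple average described above.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed counts for five categories.
observed = [8, 12, 10, 6, 4]

# Assumption: counts are expected to be spread evenly, so every
# category gets the average count.
average = sum(observed) / len(observed)
expected = [average] * len(observed)

statistic = chi_square(observed, expected)
print(statistic)
```

With these made-up counts the average is 8, the squared differences are 0, 16, 4, 4 and 16, and dividing each by 8 and summing gives a statistic of 5.0.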

In our case, let's assume there are 50 students in the class, and half of them wear glasses. The table below shows the expected grade counts, if the grades were normally distributed, alongside the actual grades. We compute a variance value for each pair of numbers using the equation, and sum them to get our chi-square statistic, which in this case is 6.7444 + 9.1111 = 15.8555.

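| Grade | Expected (Glasses) | Observed (Glasses) | Variance (Glasses) | Expected (No Glasses) | Observed (No Glasses) | Variance (No Glasses) |
|---|---|---|---|---|---|---|
| A | 3 | 5 | 1.3333 | 3 | 1 | 1.3333 |
| B | 6 | 10 | 2.6667 | 6 | 2 | 2.6667 |
| C | 9 | 7 | 0.4444 | 9 | 10 | 0.1111 |
| D | 5 | 2 | 1.8000 | 5 | 10 | 5.0000 |
| F | 2 | 1 | 0.5000 | 2 | 2 | 0.0000 |
| Total Variance | | | 6.7444 | | | 9.1111 |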
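The table's arithmetic can be reproduced with a short script. The counts below are copied from the table; the variable and function names are illustrative choices, not part of the lesson.

```python
grades = ["A", "B", "C", "D", "F"]
expected = [3, 6, 9, 5, 2]             # same expected counts for both groups
observed_glasses = [5, 10, 7, 2, 1]
observed_no_glasses = [1, 2, 10, 10, 2]


def contributions(observed, expected):
    """Return the per-grade terms (O - E)^2 / E."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


glasses_terms = contributions(observed_glasses, expected)
no_glasses_terms = contributions(observed_no_glasses, expected)

# Print one row per grade, matching the two variance columns of the table.
for grade, g, n in zip(grades, glasses_terms, no_glasses_terms):
    print(f"{grade}: glasses {g:.4f}, no glasses {n:.4f}")

total_glasses = sum(glasses_terms)
total_no_glasses = sum(no_glasses_terms)
chi_square_total = total_glasses + total_no_glasses
print(f"{total_glasses:.4f} + {total_no_glasses:.4f} = {chi_square_total:.4f}")
```

Computed without intermediate rounding, the total rounds to 15.8556 at four decimal places. The lesson's 15.8555 comes from adding the two column totals after each has already been rounded.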

Analyzing the Result

In this example, our null hypothesis is that there is no association between wearing glasses and the test grades a student receives. First, consider the unlikely case that the observed values are exactly the same as the expected values, which would give a value of 0 in all the calculations. A total of zero means the observed data agree perfectly with the null hypothesis that the two variables are independent.

In this case, what does a number such as 15.8555 mean? To analyze the chi-square value, we first need to find the number of degrees of freedom in the data. The degrees of freedom is a measure of how many values are free to vary, and is calculated by taking one less than the number of levels in each variable and multiplying the results together. In our case we have 2 categories of eyesight and 5 categories of grades, so the degrees of freedom is computed as (2 - 1) × (5 - 1) = 4.
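The degrees-of-freedom rule described above (one less than the number of levels in each variable, multiplied together) can be sketched as follows. The second helper is an assumption about the next step, not something the lesson text shown here spells out: it turns the statistic into an upper-tail chi-square probability using a closed form that is valid only for an even number of degrees of freedom. Both function names are hypothetical.

```python
import math


def degrees_of_freedom(levels_a, levels_b):
    """(levels_a - 1) * (levels_b - 1) for two categorical variables."""
    return (levels_a - 1) * (levels_b - 1)


def chi_square_tail_even_df(statistic, df):
    """Upper-tail probability P(X >= statistic) for even df only.

    Uses the closed form exp(-x/2) * sum_{j < df/2} (x/2)^j / j!,
    which holds when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut only handles positive even df")
    half = statistic / 2
    terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * terms


df = degrees_of_freedom(2, 5)      # 2 eyesight levels, 5 grade levels
p_value = chi_square_tail_even_df(15.8555, df)
print(df, p_value)
```

Under these assumptions the tail probability for 15.8555 with 4 degrees of freedom comes out at roughly 0.003, meaning a total variance this large would be rare if the two variables were truly independent.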
