Two-Tailed Test Uses, Formula and Examples
Two-Tailed Test in Statistics
In statistics, significance tests are used to determine if there is a statistically significant difference between observed values and the expected values of a statistical experiment. In any experiment, it is possible to have differences between observed and expected values. Significance tests provide the evidence of whether the difference occurs because of random factors or if it is unlikely that the difference is purely because of random factors.
For example, a breakfast cereal company produces boxes of cereal. On average, each box is 200 grams. Sixty boxes of cereal were chosen randomly and weighed. It is found that on average each box contains 185 grams of cereal. There is a 15 gram difference between observed and expected values. Is this an acceptable difference? Is the difference too high, and is there something in the production line that needs to be corrected? As neither extreme (extremely higher or lower than the expected average weight) is desired, a two-tailed test will need to be conducted. The null hypothesis for the weight of each cereal box is
{eq}H_{0} {/eq}: The mean weight of a cereal box is {eq}\mu {/eq}=200g
A two-tailed test compares the sample mean with the population mean to determine whether the difference between them is statistically significant. The sampling distribution of the mean is assumed to be normal, so the observed sample mean is converted to a z-score and located on the standard normal distribution. In Figure 1, the standard normal distribution curve represents the two-tailed test. If the result of the two-tailed test falls into the unshaded region, there is not enough evidence to reject the null hypothesis. In the case of the cereal boxes, that would mean there is not enough evidence to say the mean weight is different from 200 grams, or in other words, the difference between the sample and the population means is not statistically significant. If the result falls into the shaded regions, which are also called the rejection regions, then there is enough evidence to reject the null hypothesis, and the factory manager needs to check the production process and make changes.
Figure 1: Standard normal curve for a two-tailed test, with the rejection regions shaded in both tails.
Another important aspect of significance testing and two-tailed testing is the significance level, denoted here by p (many textbooks write it as {eq}\alpha {/eq}). It is the probability of rejecting the null hypothesis when the null hypothesis is actually true, and it is chosen before the test is run. It should not be confused with the p-value, which is the probability of obtaining test results at least as extreme as the results observed, under the assumption that the null hypothesis is correct. In Figure 1, the shaded regions together represent the significance level. Because the test is two-tailed, each shaded region has a probability equal to {eq}\frac{p}{2} {/eq}.
Also in Figure 1, the z-scores that cut off a probability of {eq}\frac{p}{2} {/eq} in each tail, {eq}z_{a} {/eq} and {eq}z_{b} {/eq} (or the corresponding x values {eq}x_{a} {/eq} and {eq}x_{b} {/eq}), are called critical values, and the shaded regions beyond them are called critical regions.
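As a rough sketch of how the critical values follow from the significance level, the snippet below uses Python's standard-library `statistics.NormalDist` in place of a printed z-table. The function name `two_tailed_critical_values` is invented for this illustration.

```python
from statistics import NormalDist

def two_tailed_critical_values(p):
    """Return (z_a, z_b) so that each tail of the standard normal holds p/2."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    z_b = standard_normal.inv_cdf(1 - p / 2)  # upper critical value
    z_a = -z_b  # lower critical value, by symmetry of the curve
    return z_a, z_b

z_a, z_b = two_tailed_critical_values(0.05)
print(round(z_a, 2), round(z_b, 2))  # -1.96 1.96
```

A smaller significance level pushes both critical values further from zero, which shrinks the shaded regions.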
In summary, a two-tailed test is used to determine whether there is a statistically significant difference, in either direction, between the sample mean and the population mean.
Two-Tailed vs One-Tailed Tests
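A two-tailed test looks for a difference in either direction, whereas a one-tailed test looks for a difference in one direction only. For example, a factory produces smartphone batteries with a mean life of 10 hours. If a sample of 35 batteries has a mean life of 8 hours, is it significantly less than the expected mean?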
{eq}H_{0} {/eq}: The mean life of the batteries is {eq}\mu {/eq}=10 hrs
{eq}H_{a} {/eq}: The mean life of the batteries is less than 10 hours, {eq}\mu {/eq}<10 hrs
In this case, a one-tailed significance test is suitable, as the question is asking, "Is it significantly less than the expected mean?" In Figure 2, this case is represented with a left-tailed rejection region under the curve of standardized normal distribution.
Figure 2: Standard normal curve for a left-tailed test, with the rejection region shaded in the left tail.
Another example of a one-tailed test is a course for four-year-old children that claims to improve their memory. On average, the children can correctly remember seven of the ten pictures shown to them. After they have taken the course, they remember eight pictures out of ten on average. Is there enough evidence to say the course has improved their memory?
In this case, a right-tailed test would be more suitable, as the question of the research is to find if the course has increased memory, as seen in Figure 3.
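Figure 3: Standard normal curve for a right-tailed test, with the rejection region shaded in the right tail.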
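To make the contrast between the three kinds of rejection region concrete, here is a small sketch that computes the critical z-value(s) for each one. It assumes a significance level of 0.05, and the function name and the labels "left", "right", and "two" are chosen only for this illustration.

```python
from statistics import NormalDist

def critical_values(p, tail):
    """Critical z-value(s) for significance level p.

    tail is "left", "right", or "two" (labels invented for this sketch).
    """
    z = NormalDist()  # standard normal distribution
    if tail == "left":
        return (z.inv_cdf(p),)      # reject when the z-score is below this
    if tail == "right":
        return (z.inv_cdf(1 - p),)  # reject when the z-score is above this
    if tail == "two":
        upper = z.inv_cdf(1 - p / 2)
        return (-upper, upper)      # reject when the z-score is outside this pair
    raise ValueError("tail must be 'left', 'right', or 'two'")

for tail in ("left", "right", "two"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)])
```

At the same significance level, a one-tailed test puts the whole rejection probability in one tail, so its cutoff (about 1.645 in absolute value) is closer to zero than the two-tailed cutoffs (about 1.96 in absolute value).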
Two-Tailed Test Formula
The steps of a two-tailed test are as follows:
- The sample size needs to be large enough to perform significance testing (a common rule of thumb is {eq}n\geq 30 {/eq}).
- State the null and alternative hypotheses, where the null hypothesis is that the population mean is equal to the expected value, and the alternative hypothesis is that they are not equal.
- Choose the significance level p.
- Find the critical values using the standard normal distribution table (area beyond z for the right tail of the graph).
- Perform the test and find z-scores for observed values.
- If the z-score falls into the critical region, reject the null hypothesis.
If
{eq}n {/eq}: sample size
{eq}\sigma {/eq}: standard deviation
{eq}\bar{x} {/eq}: sample mean
{eq}\mu {/eq}: the population mean
The two-tailed test formula to find the z-score is
{eq}z=\frac{\bar{x}-\mu }{\sigma/\sqrt{n}} {/eq}
In this formula, {eq}\frac{\sigma}{\sqrt{n}} {/eq} represents the standard error of the sampling distribution.
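The formula translates almost directly into code. The sketch below is only an illustration: the function names are made up, and the numbers passed in at the end are invented inputs rather than data from this lesson.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the sampling distribution: sigma / sqrt(n)."""
    return sigma / sqrt(n)

def z_score(sample_mean, population_mean, sigma, n):
    """z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (sample_mean - population_mean) / standard_error(sigma, n)

# Invented inputs: x_bar = 52, mu = 50, sigma = 6, n = 36
print(round(z_score(52.0, 50.0, 6.0, 36), 2))  # 2.0
```

Because the standard error shrinks as n grows, the same gap between the sample mean and the population mean produces a larger z-score in a larger sample.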
As mentioned above, the z-score is compared with the critical values that the significance level determines. Five percent (0.05) and one percent (0.01) are commonly used significance levels. For example, if 0.01 is used, then when the null hypothesis is true it will be wrongly rejected only about 1 time in 100. Therefore, a smaller significance level means a smaller chance of rejecting the null hypothesis when it should not be rejected. The significance level needs to be decided by the researcher before the test is performed.
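An equivalent way to reach the same decision is to convert the z-score into a two-tailed p-value and compare that with the chosen significance level. The sketch below assumes a significance level of 0.05, and the two z-scores in the loop are arbitrary illustrative values, not results from this lesson.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """Probability of a z-score at least as extreme as z in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

significance_level = 0.05  # assumed for this illustration
for z in (1.5, 2.5):       # arbitrary example z-scores
    p_value = two_tailed_p_value(z)
    decision = "reject" if p_value < significance_level else "fail to reject"
    print(z, round(p_value, 4), decision)
```

Rejecting when the p-value is below the significance level gives the same outcome as rejecting when the z-score lies beyond the critical values.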
In the example of the weight of cereal boxes, the sample mean was 185 grams, and the population mean was 200 grams.
The null hypothesis for the weight of each cereal box is
{eq}H_{0} {/eq}: The mean weight of a cereal box is {eq}\mu {/eq}=200 g
The alternative hypothesis is the negation of the null hypothesis which is
{eq}H_{a} {/eq}: {eq}\mu \neq 200 g {/eq}
For a significance level of 0.05, the rejection regions have a probability of 0.025 in each tail, which corresponds to the z-scores -1.96 and 1.96. If the calculated z-value is between -1.96 and 1.96, as seen in Figure 4, then there is not sufficient evidence to reject the null hypothesis, which means there is no need to change the production line. If the z-score falls into the region z>1.96 or z<-1.96, then there is sufficient evidence to reject the null hypothesis, and the production line needs to be reset.
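The link between the 0.05 level and the cutoffs of -1.96 and 1.96 can be checked directly. Writing {eq}\Phi {/eq} for the standard normal cumulative distribution function (a symbol introduced only for this check) and taking the table value {eq}\Phi (1.96)\approx 0.975 {/eq}:
{eq}P(|Z|>1.96)=2\left ( 1-\Phi (1.96) \right )\approx 2(1-0.975)=0.05 {/eq}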
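Figure 4: Standard normal curve with the rejection regions z<-1.96 and z>1.96 for a significance level of 0.05.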
Two-Tailed Test Example
A two-tailed hypothesis test example:
A machine is used to fill bags with coffee, and each bag should weigh 1 kg. A randomly selected sample of 30 bags has a mean weight of 1.01 kg with a standard deviation of 0.02 kg. Perform a two-tailed test for the significance level of 0.05 and decide if the machine needs to be adjusted.
- Given
{eq}n=30 {/eq}
{eq}\sigma=0.02 {/eq}
{eq}\bar{x}=1.01 kg {/eq}
{eq}\mu=1 {/eq}
p=0.05 and {eq}\frac{p}{2}=0.025 {/eq}
- The null and alternative hypotheses are
{eq}H_{0} {/eq}: {eq}\mu {/eq}=1 kg
{eq}H_{a} {/eq}: {eq}\mu \neq 1 kg {/eq}
- Critical z-values: using the probability {eq}\frac{p}{2}=0.025 {/eq} and the standard normal table, the critical regions are z<-1.96 and z>1.96
- Calculate the z-value using the formula {eq}z=\frac{\bar{x}-\mu }{\sigma/\sqrt{n}} {/eq}
{eq}z=\frac{1.01-1 }{0.02/\sqrt{30}}=2.739 {/eq} which falls into the critical region as seen in Figure 5. Therefore, there is sufficient evidence to reject the null hypothesis and the manufacturer will need to reset the machine.
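Figure 5: Standard normal curve showing the calculated z-value of 2.739 in the critical region beyond 1.96.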
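The coffee-bag calculation can be reproduced in a few lines. This is a minimal sketch using the values listed under "Given" (including p=0.05). The function name `two_tailed_z_test` is invented here, and `statistics.NormalDist` stands in for the z-table.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(sample_mean, population_mean, sigma, n, p):
    """Return (z, critical value, reject?) for a two-tailed z-test."""
    z = (sample_mean - population_mean) / (sigma / sqrt(n))
    critical = NormalDist().inv_cdf(1 - p / 2)  # upper critical value
    return z, critical, abs(z) > critical

z, critical, reject = two_tailed_z_test(1.01, 1.0, 0.02, 30, 0.05)
print(round(z, 3), round(critical, 2), reject)  # 2.739 1.96 True
```

Calling the same function with p set to 0.01 raises the critical value to about 2.576, and the z-score of 2.739 still falls in the critical region.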
Lesson Summary
In statistics, the test of significance is used to determine if there is a statistically significant difference between observed values and the expected values of a statistical experiment. It is a statistical analysis designed to evaluate the probability of finding a particular value of a random variable as compared to the mean of all possible values. A two-tailed test compares the sample and population means to identify if the difference between them is statistically significant. In a two-tailed test, significantly different values can be found in both the upper and lower tail regions, whereas in a one-tailed test, they can be found in only one tail, either the upper or the lower.
Another important aspect of significance testing and two-tailed testing is the significance level, which is the probability of rejecting the null hypothesis when it is actually true and is chosen before the test is run. It is distinct from the p-value, which is the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct. A critical value is the point on the normal curve that separates the region where the null hypothesis is rejected from the region where it is not rejected.
The steps of a two-tailed test are as follows:
- The sample size needs to be large enough to perform significance testing (a common rule of thumb is {eq}n\geq 30 {/eq}).
- State the null and alternative hypotheses, where the null hypothesis is that the population mean is equal to the expected value, and the alternative hypothesis is that they are not equal.
- Choose the significance level p.
- Find the critical values using the standard normal distribution table (area beyond z for the right tail of the graph).
- Perform the test and find z-scores for observed values.
- If the z-score falls into the critical region, reject the null hypothesis.
If
{eq}n {/eq}: sample size
{eq}\sigma {/eq}: standard deviation
{eq}\bar{x} {/eq}: sample mean
{eq}\mu {/eq}: the population mean
The two-tailed test formula to find the z-score is
{eq}z=\frac{\bar{x}-\mu }{\sigma/\sqrt{n}} {/eq}
In this formula, {eq}\frac{\sigma}{\sqrt{n}} {/eq} represents the standard error of the sampling distribution.
When would you use a two-tailed test?
A two-tailed test is used to identify if the mean of the expected values is significantly different than the mean of the observed values. This means that it is testing whether the means are equal, more, or less; and both tail ends of the normal distribution are considered.
What is the difference between a one-tailed test and a two-tailed test?
For both two-tailed and one-tailed tests, the null hypothesis is that the mean of the population is equal to the expected value.
In a one-tailed test, the alternative hypothesis is that the mean is smaller (or greater) than the expected mean.
In the two-tailed test, the alternative hypothesis is that the mean of the population is different from the expected mean, in either direction.
What is an example of a two-tailed test?
The two-tailed test identifies if the difference between the sample mean and the population mean is statistically significant. For example, if a new teaching method of statistics is to be used in schools, it is important to use a two-tailed test to identify if there is a statistically significant difference in both ends of the distribution to see if the new method produced similar, higher, or lower scores than the previous method.