
Normal Distribution: Definition, Properties, Characteristics & Example


Lesson Transcript
Instructor: Rudranath Beharrysingh
In this lesson, we will look at the Normal Distribution, more commonly known as the Bell Curve. We'll look at some of its fascinating properties and learn why it is one of the most important distributions in the study of data.

The Normal Distribution

Jane is about to take the SAT. The school she is applying to gives priority to candidates whose SAT scores are in the 84th percentile or above. Jane wonders what she needs to score on the test to achieve this.

Sam is designing an electric car. To design it properly, he needs to know how long 95% of the lithium ion batteries will last.

What do these questions have in common? Both can be answered with a better understanding of the normal distribution. The normal distribution is a continuous distribution of data that has the shape of a symmetrical bell curve, which is why it is also known as the Bell Curve. It is also called the Gaussian Distribution, after Carl Friedrich Gauss, who developed a mathematical formula for the curve.

So, what's so special about this curve? A lot of data in nature take roughly this shape when compiled and graphed. For example, the heights and weights of men and women approximately follow this distribution, and so do standardized test scores. Sometimes the lifespans of manufactured parts or equipment form a normal distribution as well.

By compiling the data into a frequency table and graphing it in a histogram, we can often see this phenomenon. Notice that the normal distribution, or curve, has a bell shape and is symmetrical:

[Figure: normal distribution curve]

This symmetry is one property of the normal distribution. Another property is that mean = median = mode. This is because the shape of the data is symmetrical with one peak.

And, since the curve is symmetrical, the mean or median or mode (which are all the same number for this distribution) divides the data in half. From now on, we will just refer to this value in the middle as the mean:

[Figure: mean shown on the normal distribution]

Note, however, that the symbol μ (mu) represents a population mean, while x̄ (x bar) represents a sample mean.

The Empirical Rule

The spots on the bell curve where the slope is steepest going up and going down (called inflection points) are very significant. The corresponding points on the horizontal axis are one standard deviation below and above the mean, and about 68% of the data lie between them!

So what does that mean? (No pun intended.) Well, suppose heights of men are normally distributed with an average or mean height of 68.5 inches and a standard deviation of three inches. We can generalize that about 68% of men are between 68.5 - 3 = 65.5 inches and 68.5 + 3 = 71.5 inches tall! That's quite a generalization, but it holds closely whenever the data is normally distributed!
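The arithmetic in the heights example can be sketched in a few lines of Python. The function name `interval_within` and its parameter `k` (the number of standard deviations) are illustrative choices, not something from the lesson.

```python
def interval_within(mean, sd, k=1):
    """Return the (low, high) range lying within k standard deviations of the mean."""
    return (mean - k * sd, mean + k * sd)


# Heights example from the lesson: mean 68.5 inches, standard deviation 3 inches.
low, high = interval_within(68.5, 3, k=1)
print(low, high)  # 65.5 71.5
```

Under the assumption that heights are normally distributed, roughly 68% of men would fall between these two values.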

We mentioned standard deviation. The standard deviation is a measure of spread or variability of the data. The larger it is, the more spread out the data is. The standard deviation is calculated slightly differently for a population as opposed to a sample. The formulas and symbols for both types are given below:

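Sample standard deviation: s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
Population standard deviation: \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}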

Let's look at the sample standard deviation (called s). It says that s is equal to the square root of a fraction. The numerator is the sum of the squared differences between each value and the sample mean (called x bar), and the denominator is n minus 1, which is the number of values minus 1.

For the population, the standard deviation symbol is called sigma (σ). The only differences in the calculation are that you subtract the population mean mu (μ) from each value, and that you divide by the population size, called big N, instead of n minus 1.

This calculation can be tedious, but many statistical programs can easily calculate the standard deviation. For this video, we will refer to the standard deviation as std. dev., regardless of whether we are talking about a sample or a population.
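As a rough illustration of how a program handles this, here is a minimal Python sketch that computes both versions by hand and compares them with the standard library's `statistics.stdev` (sample) and `statistics.pstdev` (population). The data values and the helper names `sample_sd` and `population_sd` are made up for the example.

```python
import math
import statistics


def sample_sd(values):
    """Sample standard deviation: divide the sum of squares by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


def population_sd(values):
    """Population standard deviation: divide the sum of squares by N."""
    big_n = len(values)
    mu = sum(values) / big_n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / big_n)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, mean = 5

print(population_sd(data))        # 2.0
print(round(sample_sd(data), 3))  # 2.138

# The standard library gives the same results.
print(statistics.pstdev(data), statistics.stdev(data))
```

The sample version comes out slightly larger because it divides by n - 1 rather than by the full count.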

More importantly, the standard deviation is a measure of spread. We can think of data in terms of distance from the mean, or in terms of standard deviations or tick marks! And, the normal curve has the property that about 68% of the data lie within one standard deviation of the mean.

Is that it? No. There's more! 95% of the data lie within two standard deviations of the mean.

For example, suppose the lifespans of lithium ion batteries are normally distributed with a mean lifespan of 20,000 hours and a standard deviation of 1,000 hours. We can conclude that 95% of these batteries will last between 20,000 - (2 * 1,000) = 18,000 hours and 20,000 + (2 * 1,000) = 22,000 hours.
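One way to get a feel for this claim is to simulate it. The sketch below draws 100,000 made-up battery lifespans from a normal distribution with the lesson's mean and standard deviation, then counts how many land within two standard deviations. The sample size and the fixed seed are arbitrary choices for the illustration, and the simulated values are not real battery data.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

mean, sd = 20_000, 1_000  # hours, from the battery example
lifespans = [random.gauss(mean, sd) for _ in range(100_000)]

# Range within two standard deviations of the mean.
low, high = mean - 2 * sd, mean + 2 * sd

# Fraction of simulated batteries whose lifespan falls inside that range.
share = sum(low <= x <= high for x in lifespans) / len(lifespans)

print(low, high)  # 18000 22000
print(share)      # should come out close to 0.95
```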

Is there another part to this rule? You betcha, and it says that 99.7% of the data is within three standard deviations of the mean, which pretty much captures all of the data except for 0.3%! And, you can see, this means there is not much data left over in the tails of the curve:

[Figure: bell curve showing that 99.7% of the data is within 3 standard deviations]

For example, if SAT scores are normally distributed with a mean score of 550 and a standard deviation of 80 points, we could generalize that 99.7% of SAT scores are between 550 - (3 * 80) = 310 and 550 + (3 * 80) = 790.

These generalizations about the percentage of data within a certain number of standard deviations from the mean are called the empirical rule, or the 68-95-99.7 rule. It says that for normally distributed data, 68% of the data is within one standard deviation of the mean, 95% of the data is within two standard deviations of the mean and 99.7% of the data is within three standard deviations of the mean.
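The 68, 95 and 99.7 figures are rounded. As a sketch of where they come from, the share of a normal distribution within k standard deviations of the mean can be computed with the error function, `math.erf`, from Python's standard library. The helper name `share_within` is an illustrative choice.

```python
import math


def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```

The result does not depend on the particular mean or standard deviation, which is why the same rule applies to heights, battery lifespans and test scores alike.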

Putting It All Together

These percentages can be broken down further. Since the curve is symmetrical and 68% of the data is within one standard deviation of the mean, half of that, or 34%, must lie on each side of the mean within one standard deviation. Similarly, the area between one and two standard deviations from the mean is 95% - 68% = 27% in total. Because the curve is symmetrical, this can be halved to give 13.5% of the data between one and two standard deviations on each side. A similar calculation works for the area between two and three standard deviations from the mean: 99.7% - 95% = 4.7%, and 4.7% / 2 = 2.35% on each side of the curve. A summary of these percentages is shown in the graph below:

[Figure: percentages of data within each standard deviation on the bell curve]
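The breakdown can be reproduced with a short Python sketch that starts from the rounded 68-95-99.7 figures. The dictionary names are illustrative, and the "beyond 3 SD" entry is an addition here: it simply halves the 0.3% left over in the tails, using the same symmetry argument.

```python
# Rounded percentages within 1, 2 and 3 standard deviations of the mean.
within = {1: 68.0, 2: 95.0, 3: 99.7}

# Percentage in each band on ONE side of the mean (the curve is symmetrical).
per_side = {
    "mean to 1 SD": round(within[1] / 2, 2),
    "1 to 2 SD": round((within[2] - within[1]) / 2, 2),
    "2 to 3 SD": round((within[3] - within[2]) / 2, 2),
    "beyond 3 SD": round((100 - within[3]) / 2, 2),
}

for band, pct in per_side.items():
    print(band, pct)
# mean to 1 SD 34.0
# 1 to 2 SD 13.5
# 2 to 3 SD 2.35
# beyond 3 SD 0.15

# Half the data lies below the mean, so the share below one standard
# deviation ABOVE the mean is 50% plus the first band.
below_plus_one_sd = 50 + per_side["mean to 1 SD"]
print(below_plus_one_sd)  # 84.0
```

Adding the bands this way gives about 84% of the data below one standard deviation above the mean, which is the percentile mentioned in Jane's question.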
