Karin has taught middle and high school Health and has a master's degree in social work.
In this lesson, you'll learn about skewness in statistics, including what data distribution and bell curves look like with and without skew. After that, you'll learn a formula to calculate skew, and then you can test your knowledge with a brief quiz.
Definition of Skewness
Skewness in statistics represents an imbalance and asymmetry from the mean of a data distribution. If you look at a normal data distribution using a bell curve, the curve will be perfectly symmetrical. Now, this doesn't happen all that often! In order to fully understand when a data distribution is imperfect and skewed, let's look at a normal data distribution and symmetrical bell curve.
First, let me remind you of a few basic terms:
Mean is the average of the numbers in the data distribution
Median is the number that falls directly in the middle of the data distribution
Mode is the number that appears most frequently in the data distribution
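As a quick illustration of these three terms, here is a minimal sketch using Python's built-in statistics module. The five values are hypothetical and chosen only to make the arithmetic easy to follow; they are not from the lesson.

```python
from statistics import mean, median, mode

# Hypothetical values chosen only for illustration (not from the lesson).
values = [1, 2, 2, 3, 7]

average = mean(values)       # sum of the values (15) divided by how many there are (5)
middle = median(values)      # middle value once the data is sorted
most_common = mode(values)   # value that appears most often

print(average, middle, most_common)
```

Here the mean (3) is pulled above the median and mode (both 2) by the single large value, 7.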
In a normal data distribution, the mean is directly in the middle (and top point) of the bell curve. Imagine that Mrs. Thomas wanted to teach her high school statistics class on the first day about data distributions, standard deviations, and bell curves. She asks her 16-student class to secretly divulge their summer job incomes. Each student gives Mrs. Thomas a piece of paper with their income written on it. She rounds each income to the nearest $500 and makes a chart.
Now that we see the data on a chart, we can see that four of the students made about $2,000 in total over the summer. If we find the mean, we see that it is $2,000. The mode and median in this data distribution also happen to be $2,000. In a normal data distribution and perfectly symmetrical bell curve, the median and mean are always the same value. Take a look at the graph of the data which represents a normal bell curve (no skewness at all!).
Properties of Skewed Bell Curves
In a symmetric bell curve, the mean, median, and mode are all the same value. How easy is that? But in a skewed distribution, the mean, median, and mode are typically all different values. You can see this represented in this image:
A skewed data distribution or bell curve can be either positive or negative. A positive skew means that the extreme data results are larger. This skews the data in that it brings the mean (average) up. The mean will be larger than the median in a positively skewed data set. A negative skew means the opposite: that the extreme data results are smaller. This means that the mean is brought down, and the median is larger than the mean.
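The mean-versus-median comparison described above can be sketched in a few lines of Python. The function name skew_direction and the three data sets are hypothetical, made up for this illustration, and the comparison is the lesson's rule of thumb rather than a formal test.

```python
from statistics import mean, median

def skew_direction(values):
    """Label a data set by comparing its mean with its median."""
    m, med = mean(values), median(values)
    if m > med:
        return "positive"   # large extreme values pull the mean up
    if m < med:
        return "negative"   # small extreme values pull the mean down
    return "none"           # mean and median agree

# Hypothetical data sets, not taken from the lesson.
high_outlier = [1, 2, 3, 4, 20]     # one unusually large value
low_outlier = [1, 17, 18, 19, 20]   # one unusually small value
balanced = [1, 2, 3, 4, 5]          # symmetric around 3

for data in (high_outlier, low_outlier, balanced):
    print(data, skew_direction(data))
```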
Formula for Skewness
The formula to find skewness manually is this:
skewness = (3 * (mean - median)) / standard deviation
In order to use this formula, we need to know the mean and median, of course. As we saw earlier, the mean is the average. It's the sum of the values in the data distribution divided by the number of values in the distribution. And if the data distribution was arranged in numerical order, the median would be the value directly in the middle.
Now, you may be asking: What is standard deviation? Standard deviation tells you how different and varied your data set really is. It shows you how far your numbers spread out from the mean. Here is the formula to find standard deviation:
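The formula itself is not reproduced in this text. As a sketch, the sample standard deviation (the version that divides by n - 1, which is the one that reproduces the 816.5 figure used in Example 1) can be written as follows, where x_i are the individual values, x-bar is the mean, and n is the number of values:

```latex
s = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}
```

In words: subtract the mean from each value, square each difference, add the squares up, divide by one less than the number of values, and take the square root.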
Examples of Skewness
Example 1: Zero Skewness
Taking the example from earlier (student summer income), we have the following 16 values in our data set (all are in dollars):
The mean would be the sum of these values divided by the number of values. The sum of all the values is 32,000. Take that and divide it by 16 (number of values) and it equals 2,000. The mean equals 2,000. The median is the number directly in the middle, which is 2,000. Use the standard deviation formula (or find a standard deviation calculator on the internet) and you would get 816.5. If you plug this into the skewness formula, you would get:
(3 * (2000-2000)) / 816.5 = 0
Amazing! Like we said, this is an example of a perfect data distribution with a symmetrical bell curve and zero skewness.
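To check this arithmetic, here is a minimal Python sketch. The 16 values are not reproduced in this text, so the list below is an assumed reconstruction: values rounded to the nearest 500, four students at 2,000, a top value of 3,500, a sum of 32,000, and a sample standard deviation of about 816.5. The function name skewness is likewise just a label for the lesson's formula.

```python
from statistics import mean, median, stdev

# ASSUMPTION: reconstructed data set consistent with the figures stated in
# the lesson (sum 32,000; mean and median 2,000; standard deviation 816.5).
incomes = [500, 1000, 1000, 1500, 1500, 1500,
           2000, 2000, 2000, 2000,
           2500, 2500, 2500, 3000, 3000, 3500]

def skewness(values):
    """Lesson's formula: 3 * (mean - median) / standard deviation."""
    # stdev() is the sample standard deviation (divides by n - 1).
    return 3 * (mean(values) - median(values)) / stdev(values)

print("sum:", sum(incomes))
print("mean:", mean(incomes))
print("median:", median(incomes))
print("standard deviation:", round(stdev(incomes), 1))
print("skewness:", skewness(incomes))
```

Because the mean and median are equal, the numerator is zero, so the skewness is zero no matter what the standard deviation is.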
Example 2: Positive Skewness
Let's still take the same data distribution, but instead of $3,500 for the last value, let's say that this student made $8,000. How does this change the bell curve? The median would still be 2,000, but the new mean would be about 2,281. The new standard deviation would be 1,682.94. This would mean the skewness was:
skewness = (3 * (2281 - 2000)) / 1682.94 = 0.5
The skewness equals 0.5 (positive skew). If we look back at the image of the positively skewed bell curve earlier in the lesson, it makes sense that the tail extends to the right. It is reaching out for that extreme $8,000 value (which some call an outlier).
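The effect of that one change can be sketched in Python. As before, the underlying 16 values are an assumed reconstruction that matches the figures quoted in the lesson, and the names skewness and skewed_incomes are only illustrative.

```python
from statistics import mean, median, stdev

# ASSUMPTION: reconstructed data set consistent with the lesson's figures
# for the original (unskewed) example.
incomes = [500, 1000, 1000, 1500, 1500, 1500,
           2000, 2000, 2000, 2000,
           2500, 2500, 2500, 3000, 3000, 3500]

# Replace the last value (3500) with the extreme value 8000.
skewed_incomes = incomes[:-1] + [8000]

def skewness(values):
    """Lesson's formula: 3 * (mean - median) / standard deviation."""
    # stdev() is the sample standard deviation (divides by n - 1).
    return 3 * (mean(values) - median(values)) / stdev(values)

print("median:", median(skewed_incomes))                  # unchanged by the outlier
print("mean:", mean(skewed_incomes))                      # pulled upward
print("standard deviation:", round(stdev(skewed_incomes), 2))
print("skewness:", round(skewness(skewed_incomes), 1))
```

Only one value changed, yet the mean and standard deviation both moved while the median stayed put. That gap between the mean and the median is what the formula measures.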
Skewness in statistics represents an imbalance and an asymmetry from the mean of a data distribution. In a normal data distribution with a symmetrical bell curve, the mean and median are the same. In a skewed data distribution, the median and the mean are different values. A positive skew means that the extreme data results are larger, bringing the average up and making it larger than the median. A negative skew means the opposite: that the extreme data results are smaller, bringing the average down and making the median larger than the mean.
The formula to find skewness manually is:
skewness = (3 * (mean - median)) / standard deviation
Standard deviation tells you how different and varied your data set really is. Standard deviation can be found with the following equation:
Now with this information, you can discover which data sets you're presented with have a skew!
Skewness - concept in statistics which represents an imbalance and asymmetry from the mean of a data distribution
Positive skew - term used to state that the extreme data results are larger
Negative skew - term used to state that the extreme data results are smaller
Standard deviation - measure of how different and varied a data set is
After this video, check to see if you can:
Remember the definitions of skewness in statistics, as well as mean, median, and mode
Determine whether a graphed data set has a positive, negative, or no skew
Find the skew of a data set using the skew formula