Mia has taught math and science and has a Master's Degree in Secondary Teaching.
In this lesson, you will learn the definition of a data distribution. We will look at examples and features of various types of data distributions.
What Is a Data Distribution?
Meet Mia. As a part of a college research course, she collected and organized information about students on campus. She was so proud of the amount of information she collected that Mia couldn't wait to share it with her professor! But first, she had to organize the data in a way that was useful and concise. To do this, Mia created a data distribution.
Data distributions are used often in statistics. They are graphical methods of organizing and displaying useful information. There are several types of data distributions. In this lesson, we will focus on dot plots, histograms, box plots, and tally charts.
Dot plots show numerical values plotted on a scale. Each dot represents one value in the set of data. In the example below, the customer service ratings range from 0 to 9. The dots tell us the frequency, or number of occurrences, of each rating. If you look at the rating of 5, you can see that three customers gave that rating, and if you look at the rating of 9, you can see that eight customers gave it. We can also see that ratings were provided by fifty customers, one dot for each customer.
Example of a dot plot
Now imagine that ratings were provided by five hundred customers. It would not be practical or useful to have a distribution of five hundred dots. For this reason, dot plots are used for data that have a relatively small number of values.
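To make the idea of "one dot per value" concrete, here is a minimal Python sketch that builds a text version of a dot plot. The ratings list is invented for illustration and is not the data from the example figure; the function name `dot_plot_rows` is also just a placeholder.

```python
from collections import Counter

# Hypothetical customer service ratings (invented for illustration).
ratings = [9, 7, 5, 9, 8, 5, 9, 3, 5, 8]

def dot_plot_rows(values):
    """Return one text row per distinct value, with one 'o' per occurrence."""
    counts = Counter(values)
    return [f"{value}: {'o ' * counts[value]}".rstrip() for value in sorted(counts)]

for row in dot_plot_rows(ratings):
    print(row)
```

Each row grows by one mark for every matching value, which is why this style of display becomes unwieldy once the data set reaches hundreds of values.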
Histograms display data in ranges, with each bar representing a range of numeric values. The height of the bar tells you the frequency of values that fall within that range. In the example below, the first bar represents black cherry trees that are between 60 and 65 feet in height. The bar goes up to three, so there are three trees that are between 60 and 65 feet.
Example of a histogram
Histograms are an excellent way to display large amounts of data. If you have a set of data that includes thousands of values, you can simply rescale the frequency axis to accommodate the larger counts, rather than limiting it to 0-10.
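The grouping step behind a histogram can be sketched in a few lines of Python. The tree heights below are invented for illustration, and the choice to place a value that sits exactly on a boundary (such as 65) into the higher range is an assumption; other conventions exist. Ranges with no values are simply omitted in this sketch.

```python
from collections import Counter

# Hypothetical tree heights in feet (invented for illustration).
heights = [61, 63, 64, 66, 69, 70, 71, 72, 74, 75, 76, 78, 80, 81, 86]

def bin_counts(values, start, width):
    """Count how many values fall in each range [low, low + width)."""
    counts = Counter(start + ((v - start) // width) * width for v in values)
    return {low: counts[low] for low in sorted(counts)}

WIDTH = 5
for low, count in bin_counts(heights, start=60, width=WIDTH).items():
    # The length of each bar of '#' characters is the frequency for that range.
    print(f"{low}-{low + WIDTH} ft: {'#' * count}")
```

Because each bar stores only a count, the same display works whether a range holds three values or three thousand.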
Box plots are rectangular representations of data. They do not give the frequency of values, but they provide other useful information about a set of data. Because of the markings used, box plots are often called box-and-whisker plots. Reading this type of data representation is easy. The box portion represents the middle 50% of the data. The vertical line in the box is the median, or the middle value of the data set. The left and right whiskers that extend from the box represent the lower and upper 25% of the data, respectively.
In the example below, the green line tells us that 25% of US states are between 0 and 100,000 square miles in area. The red line represents the 25% of US states that have an area of approximately 220,000 to 1,550,000 square miles. The box shows that the middle 50% are between 100,000 and 220,000 square miles, with a median value of approximately 150,000 square miles.
Example of a box plot
While box plots provide useful statistical information about a data set, they do not provide the number or frequency of values like histograms or dot plots do.
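The five values a box plot draws (minimum, lower quartile, median, upper quartile, and maximum) can be computed with Python's standard library, as in the sketch below. The areas are invented and are not the US state data from the example. `statistics.quantiles` requires Python 3.8 or later, and the "inclusive" method used here is only one of several quartile conventions.

```python
import statistics

# Hypothetical areas in thousands of square miles (invented, not the US state data).
areas = [10, 40, 60, 80, 100, 150, 200, 260, 660]

def five_number_summary(values):
    """Return the minimum, quartiles, and maximum that a box plot draws."""
    ordered = sorted(values)
    # The "inclusive" method is one of several quartile conventions;
    # other conventions can give slightly different Q1 and Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {"min": ordered[0], "q1": q1, "median": median, "q3": q3, "max": ordered[-1]}

summary = five_number_summary(areas)
print(summary)
```

The box spans `q1` to `q3`, the line inside the box sits at `median`, and the whiskers reach out to `min` and `max`. Note that none of these five numbers records how many values the data set contains.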
A tally chart consists of a table with tally marks that show frequency of occurrences in each category. A line is marked for each occurrence. Every fifth line is marked through the previous four to designate a group of five marks. This becomes useful when counting the markings in each category. In the tally chart below, the data shows the frequency of painting problems. By counting the number of tally marks, we can see that there were thirteen occurrences of paint chipping, three occurrences of bubbles, etc.
Example of a tally chart
Tally charts are a convenient way to organize data as it is being collected and can be used for any type of data. However, they are not practical for collecting and organizing large amounts of data.
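As a rough illustration, the following Python sketch turns a running list of observations into tally marks grouped in fives. It uses the two counts mentioned above (thirteen for chipping, three for bubbles); the remaining categories from the chart are not reproduced. The string `||||/` is only a text stand-in for four strokes crossed by a fifth.

```python
from collections import Counter

# Observations as they might be logged one at a time; the counts match
# the two categories quoted in the text (13 chipping, 3 bubbles).
observations = ["chipping"] * 13 + ["bubbles"] * 3

def tally_marks(count):
    """Render a count as tally marks: groups of five, then leftover strokes."""
    groups, rest = divmod(count, 5)
    return " ".join(["||||/"] * groups + (["|" * rest] if rest else []))

for category, count in Counter(observations).items():
    print(f"{category}: {tally_marks(count)}")
```

Grouping by fives makes the marks quick to count by eye, but a category with hundreds of occurrences would still produce a long row, which is the practical limit on this kind of chart.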
Data distributions are used to organize and display information about a set of collected data. Common distributions include tally charts, dot plots, box plots, and histograms. Selecting an appropriate distribution will depend on the type and amount of data that will be displayed since each distribution has different strengths and weaknesses.