Patterns in Bivariate Data

Instructor: Bob Bruner

Bob is a software professional with 24 years in the industry. He has a bachelor's degree in Geology, and also has extensive experience in the Oil and Gas industry.

In this lesson we will use bivariate analysis to discover patterns between two attributes. After you read this lesson, you should be able to distinguish between positive and negative correlations and between linear and non-linear associations, and to understand the effect of data outliers.

Comparing Two Attributes

Suppose you went to a company picnic and were chosen to pick sides for a softball game. You might simply choose your best friends, but if you really wanted to win the game, how would you go about picking your team? One strategy might be to pick the most athletic team, perhaps by choosing everyone you knew who regularly worked out at the gym. But would gym attendance really correlate well with softball skill? In a real-world situation, a bivariate analysis provides a way to help determine whether two separate attributes have a meaningful relationship.

Scatter Plots

A bivariate analysis is the simultaneous comparison of two separate attributes. The most basic technique used is the creation of a scatter plot, or cross plot. In this case a two-dimensional graph is created that places one of the values being measured on the X-axis, and the other value on the Y-axis. Each data point is plotted at the location of its two measurements. When all of the data are viewed together in this manner, different patterns start to appear depending on how the two variables are related.
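To make the plotting step concrete, here is a minimal sketch that draws a scatter plot as text, using only the Python standard library. The function name `ascii_scatter`, the grid size, and the height and weight values are all made up for illustration; a real analysis would normally use a plotting library instead.

```python
def ascii_scatter(xs, ys, width=40, height=12):
    """Return a text scatter plot: X runs left to right, Y runs bottom to top.

    Assumes xs and ys each contain at least two different values.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each pair of measurements to a column and a row in the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top line
    return "\n".join("".join(line) for line in grid)


# Hypothetical (height in cm, weight in kg) measurements, one pair per person.
heights = [150, 155, 160, 165, 170, 175, 180, 185, 190]
weights = [52, 55, 61, 63, 70, 72, 80, 83, 90]

plot = ascii_scatter(heights, weights)
print(plot)
```

Each `*` marks one person at the position given by that person's two measurements, so the points in this sample drift from the lower left toward the upper right.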

Positive and Negative Associations/Correlations

In scientific studies we are often looking for data that we do expect to have a relationship. For example, if we plot everyone's weight against his or her height, we expect that weight will typically increase with height. We refer to this as a positive association, or positive correlation, where an increase in one variable is accompanied by an increase in the other variable. Similarly, when two variables are related but in an opposite manner, we refer to this as a negative association or negative correlation. In this case, an increase in one variable will show a decrease in the other. There are also cases in nature where a positive correlation can be found over one range of the data, and a negative correlation in another range.
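One common way to put a number on the direction of an association is the Pearson correlation coefficient, which this lesson does not name but which matches the idea described above: its sign is positive when the two variables rise together and negative when one falls as the other rises. The sketch below computes it by hand; the function name `pearson_r` and all of the data values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up data: weight tends to increase with height.
heights = [150, 160, 170, 180, 190]
weights = [52, 61, 70, 80, 90]

# Made-up data: resting pulse tends to decrease with weekly exercise hours.
exercise_hours = [0, 2, 4, 6, 8]
resting_pulse = [80, 74, 70, 65, 61]

r_pos = pearson_r(heights, weights)               # close to +1
r_neg = pearson_r(exercise_hours, resting_pulse)  # close to -1
print(f"height vs weight: r = {r_pos:.3f}")
print(f"exercise vs pulse: r = {r_neg:.3f}")
```

A value near zero would suggest no linear association, although it cannot rule out the mixed case mentioned above, where the direction changes from one range of the data to another.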

Linear and Non-linear Associations/Correlations

In some cases correlations show very distinct linear trends. A linear association, or linear correlation, is apparent when a single straight line can be drawn through the individual data points with a good overall fit. In some cases different areas of the scatter plot may show two or more linear trends with different slopes. In other cases a non-linear association, or non-linear correlation, is apparent, and a curve fits the individual points better than a straight line does.
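The idea of a "good overall fit" can be sketched with a least-squares straight line and its R squared value, a measure the lesson itself does not use and which is introduced here only as an assumption. Values near 1 mean the line explains most of the variation, and values near 0 mean it explains almost none. The function name `line_fit_r2` and the data are made up for illustration.

```python
def line_fit_r2(xs, ys):
    """Fit y = slope * x + intercept by least squares and return R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


xs = [-3, -2, -1, 0, 1, 2, 3]
ys_linear = [-5.1, -2.9, -1.2, 1.1, 2.8, 5.2, 6.9]  # roughly y = 2x + 1
ys_curved = [9.2, 3.9, 1.1, 0.0, 0.8, 4.1, 9.0]     # roughly y = x squared

r2_linear = line_fit_r2(xs, ys_linear)
r2_line_on_curve = line_fit_r2(xs, ys_curved)
# Fitting against x squared instead of x is a simple curvilinear fit in x.
r2_curve = line_fit_r2([x * x for x in xs], ys_curved)

print(f"straight line on linear data: {r2_linear:.3f}")
print(f"straight line on curved data: {r2_line_on_curve:.3f}")
print(f"curve on curved data:         {r2_curve:.3f}")
```

The curved sample has a strong relationship that a straight line misses almost entirely, which is why a scatter plot is worth inspecting before any line is fitted.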

If the two attributes do not correlate very well, then the scatter plot will show a random pattern. Random data simply shows up as a scattering of points without any discernible pattern, similar to what you might see if you scattered a handful of salt onto a countertop.

Clustering and Outliers

A scatter plot can often show distinct clustering of data. In clustering we find particular areas of the scatter plot that have distinct concentrations of data points. Other than an overall positive or negative correlation, clustering of the data into distinct regions is often the most discernible pattern seen in a scatter plot. Clustered data can be an indication that some factor other than the two attributes being measured is having an overall effect on the data.
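As a rough illustration of how a third factor can produce clusters, the sketch below tags each made-up data point with a hypothetical extra attribute, `group`, and computes the center of each group. The records and labels are assumptions for this example only; in practice the hidden factor is usually not known in advance and has to be investigated.

```python
from collections import defaultdict

# Made-up (height in cm, weight in kg, group) records. The "group" label
# stands in for a hypothetical third factor that is not on either axis.
records = [
    (120, 25, "child"), (125, 27, "child"), (130, 30, "child"),
    (170, 68, "adult"), (175, 74, "adult"), (180, 79, "adult"),
]

# Collect the (x, y) points belonging to each group.
groups = defaultdict(list)
for x, y, label in records:
    groups[label].append((x, y))

# The centroid is the mean X and mean Y of the points in a group.
centroids = {}
for label, points in groups.items():
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    centroids[label] = (cx, cy)

for label, (cx, cy) in centroids.items():
    print(f"{label}: centered near ({cx:.1f}, {cy:.1f})")
```

The two centers sit far apart on both axes, so the points would appear on a scatter plot as two separate concentrations rather than one continuous band.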
