Statistics 101: Principles of Statistics | 11 chapters | 144 lessons | 9 flashcard sets

Lesson Transcript

Instructor:
*Cathryn Jackson*

Cat has taught a variety of subjects, including communications, mathematics, and technology. Cat has a master's degree in education and is currently working on her Ph.D.

Linear regression can be a powerful tool for predicting and interpreting information. Learn to use two common formulas for linear regression in this lesson.

Jake has decided to start a hot dog business. He has hired his cousin, Noah, to help him with hot dog sales. But there's a problem! Noah can only work 20 hours a week. Jake wants to have Noah working at peak hot dog sales hours. How can he find this information? In this lesson, you will learn how to solve problems using concepts based on linear regression. First, let's check out some of our key terms that will be beneficial in this lesson.

Jake will have to collect data and use regression analysis to find the optimum hot dog sale time. **Regression analysis** is the study of two variables in an attempt to find a relationship, or correlation. For example, there have been many regression analyses on student study hours and GPA. Studies have found a relationship between the number of hours a student studies and their overall GPA.

In other words, the number of hours a student studies is the independent variable and the GPA is the dependent variable. The student's GPA will depend on the number of hours a student studies; therefore, there is a relationship between the two variables. We'll talk more about this relationship, also known as correlation, in a minute, but let's define linear regression next.

A **regression line**, also known as a trend line or line of best fit, is a straight line that attempts to describe the relationship between two variables. You've probably seen this line previously in another class. **Linear regression** is a method of predicting a variable (*y*) that depends on a second variable (*x*), based on the regression equation of a given set of data.

To clarify, you can take a set of data, create a scatter plot, create a regression line, and then use regression analysis to see if you have a correlation. Once you have your correlation, you have linear regression. Okay, that probably sounded like Greek to you. Let's talk a little bit about correlation before looking at some examples.

A **correlation** is the relationship between two variables, used to describe or predict information. The stronger the relationship between the two variables, the more likely your prediction will be accurate. We will examine this concept of correlation more closely in other lessons, such as Interpreting Linear Relationships Using Data and Correlation Versus Causation. For now, let's focus on using the regression line to help solve Jake's hot dog sales dilemma.

First, let's look at the data for Jake's hot dog sales. Jake has been working for the past few weeks from 1 pm to 7 pm each day. Each day, Jake has tracked the hour and the number of hot dog sales for each hour. Take a look at this data set for Monday:

(1, 10) (2, 11) (3, 15) (4, 12) (5, 17) (6, 18) (7, 20)

To establish the relationship between the time of day and the number of hot dogs sold, Jake will need to put the data into the formula *y* = *ax* + *b*. Here, *x* is the hour and *y* is the number of hot dogs sold in that hour. You've probably seen the formula for slope-intercept form in algebra: *y* = *mx* + *b*. This is the same formula, but in statistics, we've replaced the *m* with *a*; *a* is still the slope and *b* is still the intercept, so there aren't any big changes you need to worry about.

To find the regression line for this data set, let's first put this information into a chart like this:

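Now we need to use the least squares formulas to find *a* and *b* in *y* = *ax* + *b*. This is the formula to find the slope *a*, where Σ means "the sum of" and *n* is the number of ordered pairs:

*a* = (*n* * Σ*xy* - Σ*x* * Σ*y*) / (*n* * Σ*x*^2 - (Σ*x*)^2)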

I know, it looks pretty complicated. First, we will need a little more information in our chart. Look at how I expanded the chart to include *x* times *y* and *x* squared:

Look at the first row in this chart. In the first column I have 1, my value for the first *x* in the data set. In the second column, I have 10, my value for the first *y* in the data set. Under *x* times *y*, I have 10, which is the product of the first and second column. In the last column, under *x* squared, I have 1, which is the value of the first column squared. Notice that each row on this chart follows this same pattern.

Now, we've added a final row that shows the sum of each column:

For example, the first column has all of the *x* values: 1, 2, 3, 4, 5, 6, 7. The last row is the sum of all of those values: 1 + 2 + 3 + 4 + 5 + 6 + 7 = 28. So, let's review our chart; we have:

- All of the *x* values
- All of the *y* values
- *x* * *y* for each ordered pair
- *x*^2 for each *x* value
- The sum of *x*, *y*, *x* * *y*, and *x*^2
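As a rough sketch, the chart described above can be rebuilt in a few lines of Python. The variable names (`data`, `rows`, `sum_x`, and so on) are illustrative choices, not part of the lesson; the numbers are Monday's ordered pairs.

```python
# Monday's data as (hour, hot dogs sold), copied from the lesson.
data = [(1, 10), (2, 11), (3, 15), (4, 12), (5, 17), (6, 18), (7, 20)]

# One chart row per ordered pair: x, y, x*y, x^2.
rows = [(x, y, x * y, x * x) for x, y in data]

# Final row of the chart: the sum of each column.
sum_x, sum_y, sum_xy, sum_x2 = (sum(column) for column in zip(*rows))

print(f"{'x':>4} {'y':>4} {'x*y':>5} {'x^2':>5}")
for x, y, xy, x2 in rows:
    print(f"{x:>4} {y:>4} {xy:>5} {x2:>5}")
print(f"{sum_x:>4} {sum_y:>4} {sum_xy:>5} {sum_x2:>5}  <- sums")
```

The last printed row should show the sums 28, 103, 458, and 140, which are the values plugged into the slope and intercept formulas.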

Now that we have this information, let's look at our formula and our chart. I've circled the places in our formula with the corresponding values in our chart with similarly colored circles:

In this formula, *a* equals *n* times the sum of *x* times *y*, minus the sum of *x* times the sum of *y*, all divided by *n* times the sum of *x* squared minus the square of the sum of *x*.

Notice that *n* equals the number of ordered pairs. In this scenario, we have seven total ordered pairs. Therefore, our formula would look like this:

*a* = (7 * 458 - 28 * 103) / (7 * 140 - (28)^2), which equals

*a* = 322 / 196

*a* = 1.64

Therefore, our slope is 1.64.
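If you'd like to check that arithmetic, here is a minimal sketch of the slope calculation using the column sums from the chart. The names `numerator` and `denominator` are just labels chosen for this example.

```python
# Column sums taken from the chart in the lesson.
n = 7            # number of ordered pairs
sum_x = 28       # sum of x
sum_y = 103      # sum of y
sum_xy = 458     # sum of x * y
sum_x2 = 140     # sum of x^2

# a = (n * sum(xy) - sum(x) * sum(y)) / (n * sum(x^2) - (sum(x))^2)
numerator = n * sum_xy - sum_x * sum_y
denominator = n * sum_x2 - sum_x ** 2
a = numerator / denominator

print(numerator, denominator)  # 322 196
print(round(a, 2))             # 1.64
```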

Now let's use our chart to find the value for *b*, our line's intercept. Once again, I've circled the places in our formula with the corresponding values in our chart with similarly colored circles:

This formula is *b* = (1 / *n*) * (Σ*y* - *a* * Σ*x*). It reads: *b* equals 1 divided by *n*, times, in parentheses, the sum of *y* minus *a* times the sum of *x*. With our values, the formula looks like this:

*b* = (1 / 7) * (103 - 1.64 * 28)

*b* = 8.15

Therefore, our intercept is 8.15, which gives the regression line *y* = 1.64*x* + 8.15. Now you can graph your data set together with this regression line.
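The intercept step can be sketched the same way. One thing worth noticing, shown here as an aside rather than as part of the lesson's method: the lesson plugs in the slope already rounded to 1.64, which gives 8.15, while carrying the unrounded slope (322 / 196) through gives about 8.14. The variable names are illustrative.

```python
# Column sums taken from the chart in the lesson.
n = 7
sum_x = 28
sum_y = 103

a_rounded = 1.64        # slope as used in the lesson
a_exact = 322 / 196     # same slope before rounding

# b = (1 / n) * (sum(y) - a * sum(x))
b_from_rounded = (1 / n) * (sum_y - a_rounded * sum_x)
b_from_exact = (1 / n) * (sum_y - a_exact * sum_x)

print(round(b_from_rounded, 2))  # 8.15, matching the lesson
print(round(b_from_exact, 2))    # 8.14, if the slope is not rounded first
```

The difference is small for this data set, but it is a reminder that rounding early can shift later results slightly.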

So, what does this information tell Jake? Well, the intercept tells us that if he worked at 12 pm (which would be zero on the graph), the line predicts about 8.15 hot dog sales in that hour. More importantly, the slope tells us that with each hour that passes, the predicted number of hot dogs sold increases by about 1.64. The big difference in this problem compared to most linear regression problems is the hours.

In this case, we used the *x* axis for each hour on a clock, rather than for an amount of elapsed time. If you had a shift that went from 8 am to 6 pm, I would recommend using military (24-hour) time so that every am and pm hour has its own value and the evening hours come after the morning hours on the *x* axis.
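As an illustration of that suggestion, here is a small sketch that turns clock hours into 24-hour values that could serve as *x* values. The helper `to_24_hour` and the `shift` list are hypothetical names made up for this example.

```python
def to_24_hour(hour: int, period: str) -> int:
    """Convert a 12-hour clock hour plus 'am' or 'pm' to a 24-hour value."""
    if not 1 <= hour <= 12:
        raise ValueError("hour must be between 1 and 12")
    period = period.lower()
    if period not in ("am", "pm"):
        raise ValueError("period must be 'am' or 'pm'")
    base = hour % 12  # 12 becomes 0, so 12 am -> 0 and 12 pm -> 12
    return base + 12 if period == "pm" else base


# A shift from 8 am to 6 pm, written the way a clock shows it.
shift = [(8, "am"), (9, "am"), (10, "am"), (11, "am"), (12, "pm"),
         (1, "pm"), (2, "pm"), (3, "pm"), (4, "pm"), (5, "pm"), (6, "pm")]

x_values = [to_24_hour(hour, period) for hour, period in shift]
print(x_values)  # [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
```

With these values, 1 pm (13) correctly lands after 11 am (11) instead of before it.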

The later in the evening it is, the more hot dogs Jake will likely sell. He will want to have Noah working later in the evenings for his shifts because it is most likely going to be the busiest sales time.
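To see this hour by hour, the sketch below plugs each hour into the lesson's line, *y* = 1.64*x* + 8.15. The function name `predict_sales` is an assumption for this example, and the results are predictions from the line, not actual sales.

```python
# Slope and intercept as found in the lesson.
a = 1.64
b = 8.15


def predict_sales(hour: int) -> float:
    """Predicted hot dog sales for a given hour (1 = 1 pm, ..., 7 = 7 pm)."""
    return a * hour + b


for hour in range(1, 8):
    print(f"{hour} pm: about {predict_sales(hour):.2f} hot dogs")

# Because the slope is positive, the latest hour has the highest prediction.
busiest_hour = max(range(1, 8), key=predict_sales)
print(f"Busiest predicted hour: {busiest_hour} pm")
```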

Remember, **regression analysis** is the study of two variables in an attempt to find a relationship, or correlation. We found a correlation between the later evening hours and the hot dog sales. A **correlation** is the relationship between two variables, used to describe or predict information. To do this, we used **linear regression**, which is a method of predicting a variable (*y*) that depends on a second variable (*x*), based on the regression equation of a given set of data.

First, we had to find our regression line and its equation. A **regression line**, also known as a trend line or line of best fit, is a straight line that attempts to describe the relationship between two variables. We first created a chart with the following values:

- All of the *x* values
- All of the *y* values
- *x* * *y* for each ordered pair
- *x*^2 for each *x* value
- The sum of *x*, *y*, *x* * *y*, and *x*^2

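We then used these values in the least squares formulas for the slope, *a* = (*n* * Σ*xy* - Σ*x* * Σ*y*) / (*n* * Σ*x*^2 - (Σ*x*)^2), and the intercept, *b* = (1 / *n*) * (Σ*y* - *a* * Σ*x*), to find the values for the equation *y* = *ax* + *b*.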

You can double check your work by using a graphing calculator to find the regression line of a data set. Check out our lesson on Simple Linear Regression to see how to do that!

Look to this lesson if you'd like to:

- Understand terms such as regression analysis, correlation and linear regression
- Find the regression line and its equation from a set of data
- Memorize the formulas for finding slope and intercept

You are viewing lesson 2 in chapter 8 of the course Statistics 101: Principles of Statistics.
