12.1 Linear Equations
Oftentimes we’re interested in studying relationships between variables. For example, is there a relationship between the grade on the second math exam a student takes and the grade on the final exam? If there is a relationship, what is it and how strong is it? The type of data described in this example is bivariate data (bi for two variables). In practice, statisticians often work with multivariate data, meaning many variables.
In this chapter, we’ll model data with linear regression to study relationships between two variables (bivariate data) and investigate the strength and direction of any apparent relationship between them. When we observe a strong relationship between two variables, we’ll need to distinguish causation from association. Recall that a well-designed experiment, such as a clinical trial, is the appropriate way to establish cause and effect.
LINEAR EQUATIONS
Equations of Lines
Do you recognize the equation [latex]y=mx+b[/latex]? It is the equation of a line in slope-intercept form, commonly found in algebra textbooks. In statistics textbooks, though, the equivalent form [latex]y=a+bx[/latex] is more common. There’s no real difference between the two as long as we keep track of which letter represents the slope and which represents the y-intercept: in [latex]y=mx+b[/latex], m is the slope and b is the y-intercept, while in [latex]y=a+bx[/latex], b is the slope and a is the y-intercept. Read more about the difference here: LinReg(ax + b) versus LinReg(a + bx).
The x variable is called an independent variable. In statistics, we also call it the explanatory variable. The y variable is the dependent variable, also called the response variable.
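To see that the two forms describe the same line, here is a minimal Python sketch. The function names `line_algebra` and `line_stats` and the values (slope 2, y-intercept 5) are made up for illustration and do not come from any dataset in this chapter.

```python
def line_algebra(x, m, b):
    """Algebra convention: y = m*x + b (m is the slope, b is the y-intercept)."""
    return m * x + b


def line_stats(x, a, b):
    """Statistics convention: y = a + b*x (a is the y-intercept, b is the slope)."""
    return a + b * x


# Illustrative values only: slope 2 and y-intercept 5.
slope, intercept = 2.0, 5.0
xs = [0, 1, 2, 3]  # values of the explanatory (independent) variable

# Values of the response (dependent) variable under each convention.
ys_algebra = [line_algebra(x, m=slope, b=intercept) for x in xs]
ys_stats = [line_stats(x, a=intercept, b=slope) for x in xs]

print(ys_algebra)  # [5.0, 7.0, 9.0, 11.0]
print(ys_stats)    # [5.0, 7.0, 9.0, 11.0]
```

Both lists match because only the letters change between the two forms. Note that the letter b plays a different role in each convention, which is the easiest place to make a mistake.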
The slope of a line measures its steepness relative to the horizontal: it tells us how much y changes for each one-unit increase in x. If you need a refresher on linear equations, please review this lesson on slope from Khan Academy. They also have another lesson on interpreting slopes and y-intercepts.
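As a quick refresher, the slope can be computed from any two points on a line as the rise (change in y) divided by the run (change in x). The sketch below is one way to write this in Python. The helper names `slope_between` and `y_intercept` and the two sample points are hypothetical, chosen only for illustration.

```python
def slope_between(p1, p2):
    """Return the slope (rise over run) of the line through points p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        # A vertical line has no run, so its slope is undefined.
        raise ValueError("vertical line: slope is undefined")
    return (y2 - y1) / (x2 - x1)


def y_intercept(point, slope):
    """Return a in y = a + b*x, given one point on the line and the slope b."""
    x, y = point
    return y - slope * x


# Two made-up points on the same line.
p1, p2 = (1, 7), (3, 11)

b = slope_between(p1, p2)  # rise = 11 - 7 = 4, run = 3 - 1 = 2
a = y_intercept(p1, b)     # 7 - 2.0 * 1

print(b)  # 2.0
print(a)  # 5.0
```

For these points the line is y = 5 + 2x: y increases by 2 for each one-unit increase in x, and the line crosses the y-axis at 5.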