MONT 106N -- Identifying Patterns
Regression Example -- October 30, 2009
The following data are the first and second exam scores from an
HC calculus class with 34 students, some time in the past. (Confidentiality
concerns require that I not say exactly when this was!)
The summary statistics for the two exams:
Here is the scatter plot with the first exam score for each of the 34
students on the x-axis and the second exam score on the y-axis,
together with the SD-line:
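For readers who want to reproduce the SD-line numerically, here is a minimal sketch. It assumes the usual definition: the SD-line passes through the point of averages and, for a positive association, has slope (SD of y)/(SD of x). The score lists `exam1` and `exam2` and the helper `sd_line` are hypothetical, invented for illustration; they are not the class data.

```python
from statistics import mean, pstdev

# Hypothetical scores, invented for illustration; NOT the actual class data.
exam1 = [55, 62, 70, 70, 78, 85, 85, 93]   # x: first exam
exam2 = [60, 58, 72, 66, 75, 80, 88, 85]   # y: second exam


def sd_line(xs, ys):
    """Return (slope, intercept) of the SD-line.

    Assumes a positive association, so the slope is +SD_y / SD_x.
    The line is forced through the point of averages (mean x, mean y).
    """
    slope = pstdev(ys) / pstdev(xs)
    intercept = mean(ys) - slope * mean(xs)
    return slope, intercept


slope, intercept = sd_line(exam1, exam2)
print(f"SD-line: y = {slope:.3f} x + {intercept:.3f}")
```

The population SD (`pstdev`) is used here; using the sample SD for both variables would give the same slope, since the ratio is unchanged.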
Now we plot the averages of the second exam scores
for each possible score on the first exam:
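The averaging step can be sketched as follows: group the second exam scores by first exam score, then average each group. The data and the function name `graph_of_averages` are hypothetical, chosen only to illustrate the computation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical scores, invented for illustration; NOT the actual class data.
exam1 = [55, 62, 70, 70, 78, 85, 85, 93]   # x: first exam
exam2 = [60, 58, 72, 66, 75, 80, 88, 85]   # y: second exam


def graph_of_averages(xs, ys):
    """Map each distinct x value to the average of the y values paired with it."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    return {x: mean(group) for x, group in sorted(groups.items())}


averages = graph_of_averages(exam1, exam2)
for x, avg_y in averages.items():
    print(x, avg_y)
```

In this made-up data set, two students scored 70 on the first exam, so the point plotted above x = 70 is the average of their two second exam scores.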
As we expect, these averages tend to lie above the SD-line if x is less
than the average of the x's and below the SD-line if x is greater
than the average of the x's. (Recall: this follows from thinking about
the shape of the football-like cloud and the fact that the SD-line goes
through the tips of the football.)
Next, we show the regression line (dashed), together with
the SD-line and the averages of the y's for each x:
The slope of the regression line is related to the correlation
coefficient and the slope of the SD-line as follows:
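slope of regression line = r x (SD of y) / (SD of x)
where r is the correlation coefficient. When r is positive, the SD-line
has slope (SD of y) / (SD of x), so the slope of the regression line is
r times the slope of the SD-line.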
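As a numerical check of the standard relation (regression slope) = r x (SD of y)/(SD of x), the sketch below computes r, the SD-line slope, and the regression slope, and compares the last with the slope from the least-squares formula. The data and the helper `correlation` are hypothetical, invented for illustration; they are not the class data.

```python
from statistics import mean, pstdev

# Hypothetical scores, invented for illustration; NOT the actual class data.
exam1 = [55, 62, 70, 70, 78, 85, 85, 93]   # x: first exam
exam2 = [60, 58, 72, 66, 75, 80, 88, 85]   # y: second exam


def correlation(xs, ys):
    """Correlation coefficient: the average product of the standard units."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return mean(products)


r = correlation(exam1, exam2)
sd_slope = pstdev(exam2) / pstdev(exam1)   # slope of the SD-line (positive r assumed)
reg_slope = r * sd_slope                   # slope of the regression line

# Independent check: the least-squares slope computed directly.
mx, my = mean(exam1), mean(exam2)
ls_slope = (sum((x - mx) * (y - my) for x, y in zip(exam1, exam2))
            / sum((x - mx) ** 2 for x in exam1))

print(f"r = {r:.3f}, SD-line slope = {sd_slope:.3f}, regression slope = {reg_slope:.3f}")
```

Because r is strictly between 0 and 1 in this made-up data, the regression line is flatter than the SD-line, which matches the picture of the averages lying above the SD-line to the left of the point of averages and below it to the right.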