MATH 376 -- Probability and Statistics 2 

April 12, 2010 

 

First Least Squares Regression Example 

 

> with(Statistics):
 

To compute the least-squares estimators of the coefficients in the model

  Y = beta[0] + beta[1]*x + epsilon

we enter the lists of x and y coordinates of the data points separately:

> XList:=[0.,1,2,3,4,5,6,7];
 

[0., 1, 2, 3, 4, 5, 6, 7] (1)
 

> YList:=[34.3,33.5,35.9,39.3,44.8,48.8,55.7,62.9];
 

[34.3, 33.5, 35.9, 39.3, 44.8, 48.8, 55.7, 62.9] (2)
 

Means: 

> Xbar:=Mean(XList);
 

3.500000000 (3)
 

> Ybar:=Mean(YList);
 

44.40000000 (4)
 

Organizing the computation as we described in class: 

> SXY:=add((XList[i]-Xbar)*(YList[i]-Ybar),i=1..8);
 

177.7000000 (5)
 

> SXX:=add((XList[i]-Xbar)*(XList[i]-Xbar),i=1..8);
 

42.00000000 (6)
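
In standard notation, with n = 8 data points (x_i, y_i), the two sums SXY and SXX just computed and the two estimators computed next can be written as follows. This is only a summary sketch of the usual least-squares formulas, on the assumption that they match the ones described in class.

```latex
S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}), \qquad
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}, \qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}.
```

For this data set, S_xy = 177.7 and S_xx = 42.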
 

> hatbeta[1]:=SXY/SXX;
 

4.230952381 (7)
 

> hatbeta[0]:=Ybar-hatbeta[1]*Xbar;
 

29.59166667 (8)
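
As a cross-check on the hand-organized computation, the same line can be fitted with the LinearFit command from the Statistics package. This is a hedged sketch rather than part of the original worksheet: the name FitLine is introduced here for illustration, and the lists are converted to float Vectors as a precaution.

```maple
# Cross-check sketch: fit the model beta[0] + beta[1]*x with the built-in command.
# Assumes with(Statistics) has been executed, XList and YList are defined as above,
# and x has no assigned value. FitLine is a name introduced here for illustration.
FitLine := LinearFit([1, x],
                     Vector(XList, datatype = float),
                     Vector(YList, datatype = float),
                     x);
```

The constant term and the coefficient of x in the result should agree with (8) and (7), up to rounding in the displayed digits.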
 

Plot the data points together with the line determined by the least-squares estimators for beta[0] and beta[1]:

> with(plots):
 

> Pts:=[seq([XList[i],YList[i]],i=1..8)];
 

[[0., 34.3], [1, 33.5], [2, 35.9], [3, 39.3], [4, 44.8], [5, 48.8], [6, 55.7], [7, 62.9]] (9)
 

> PP:=plot(Pts,style=point,color=blue,symbol=circle):
 

> LP:=plot(hatbeta[0]+hatbeta[1]*x,x=0..8):
 

> display(PP,LP);
 

[Plot: the data points (blue circles) together with the fitted least-squares line]
 

The plot indicates an acceptable, but not ideal, "fit" with the linear model.
Other functional forms might also be considered, since the y values lie
consistently below the line in the middle of the range of x values and
consistently above it at the ends. A truly good fit with a model of our form
will have points "randomly" high and low with respect to the regression line.
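
One way to see the pattern more directly is to compute the residuals, the observed y values minus the fitted values, and plot them against x. The following is a minimal sketch, not part of the original worksheet; the names Resid and ResidPts are introduced here for illustration.

```maple
# Residuals: observed y minus the fitted value hatbeta[0] + hatbeta[1]*x.
# Assumes XList, YList, and hatbeta[0], hatbeta[1] are defined as above.
# Resid and ResidPts are names introduced here for illustration.
Resid := [seq(YList[i] - (hatbeta[0] + hatbeta[1]*XList[i]), i = 1 .. 8)];

# Residual plot: each point is [x, residual]; the horizontal axis is the fitted line.
ResidPts := [seq([XList[i], Resid[i]], i = 1 .. 8)];
plot(ResidPts, style = point, color = blue, symbol = circle);
```

For this data the residuals are positive at x = 0, negative for x = 1 through 5, and positive again at x = 6 and 7, rather than changing sign irregularly as they would in a good fit.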