MATH 376 -- Probability and Statistics II

Lab Project 2 Solutions

March 5, 2004

>    read "/home/fac/little/public_html/ProbStat/MaplePackage/MSP.map";

184013160

A)  Confidence intervals for the mean

1)

>    CVals:=[164.,272,261,248,235,192,203,278,268,230,242,305,286,310,345,289,326,335,297,328,400,228,194,338,252];

CVals := [164., 272, 261, 248, 235, 192, 203, 278, 268, 230, 242, 305, 286, 310, 345, 289, 326, 335, 297, 328, 400, 228, 194, 338, 252]

>    nops(CVals);

25

>    Cbar:=Mean(CVals);

Cbar := 273.0400000

>    S:=StandardDeviation(CVals);

S := 56.17419336

>    S2:=S^2;

S2 := 3155.540000
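
As an independent cross-check outside Maple, the summary statistics can be recomputed with Python's standard `statistics` module. This is only a sketch and not part of the MSP package. It assumes that StandardDeviation uses the n-1 divisor, which agrees with the value of S printed above.

```python
import statistics

# Data copied from CVals in the worksheet.
c_vals = [164, 272, 261, 248, 235, 192, 203, 278, 268, 230, 242, 305, 286,
          310, 345, 289, 326, 335, 297, 328, 400, 228, 194, 338, 252]

n = len(c_vals)
c_bar = statistics.mean(c_vals)
s = statistics.stdev(c_vals)      # sample standard deviation, divisor n - 1
s2 = statistics.variance(c_vals)  # sample variance, divisor n - 1

print(n, c_bar, s, s2)
```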

2) t[.075](10)   to 4 decimal places:

>    tval:=TCDF(10,1.5593);

tval := .9250075311

z[.075]   to 4 decimal places:

>    NormalCDF(0,1,1.4396);

.9250097002
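
The two CDF evaluations above can be checked without the MSP package. The sketch below uses Python's `statistics.NormalDist` for the normal case and a composite Simpson rule for the t case. The helper names `t_pdf` and `t_cdf` are made up for this illustration and are not MSP functions.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: one half plus a Simpson integral over [0, x]."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


t_check = t_cdf(1.5593, 10)            # compare with TCDF(10,1.5593)
z_check = NormalDist().cdf(1.4396)     # compare with NormalCDF(0,1,1.4396)
z_direct = NormalDist().inv_cdf(0.925) # quantile computed directly

print(t_check, z_check, z_direct)
```

Both CDF values should come out close to .925, and the directly computed quantile should agree with the worksheet's 1.4396 to within about 1e-4.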

3)  Small sample 95% c.i.

>    tval:=TCDF(24,2.064);

tval := .9750051952

>    Cbar-2.064*S/sqrt(25),Cbar+2.064*S/sqrt(25);

249.8512930, 296.2287070

     Small sample 85% c.i.

>    tval:=TCDF(24,1.4872);

tval := .9250084236

>    Cbar-1.4872*S/sqrt(25),Cbar+1.4872*S/sqrt(25);

256.3315479, 289.7484521

     Large sample c.i.'s

>    MeanLSCI(CVals,.05);

251.0201208, 295.0598792

>    MeanLSCI(CVals,.15);

256.8670962, 289.2129038

So at each confidence level the small-sample (t) form gives a slightly wider interval than the
large-sample (z) form. This is expected, since the t-distribution has more probability mass in its
tails than the standard normal. Also, increasing the confidence level widens the interval: to
capture the true mean more often, the interval has to cover more values.
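
The four intervals can be reproduced in a few lines of Python. This is a sketch under two assumptions: the t critical values 2.064 and 1.4872 are copied from the worksheet rather than computed, and MeanLSCI is taken to use the usual form Cbar +/- z[alpha/2]*S/sqrt(n), which is consistent with the endpoints it printed.

```python
import math
import statistics
from statistics import NormalDist

c_vals = [164, 272, 261, 248, 235, 192, 203, 278, 268, 230, 242, 305, 286,
          310, 345, 289, 326, 335, 297, 328, 400, 228, 194, 338, 252]

n = len(c_vals)
c_bar = statistics.mean(c_vals)
std_err = statistics.stdev(c_vals) / math.sqrt(n)  # S / sqrt(n)


def interval(crit):
    """Two-sided interval Cbar +/- crit * S / sqrt(n)."""
    return (c_bar - crit * std_err, c_bar + crit * std_err)


# Small-sample intervals: t critical values for 24 d.f. taken from the worksheet.
small_95 = interval(2.064)
small_85 = interval(1.4872)

# Large-sample intervals: z critical values from the standard normal quantile.
large_95 = interval(NormalDist().inv_cdf(1 - 0.05 / 2))
large_85 = interval(NormalDist().inv_cdf(1 - 0.15 / 2))

for label, ci in [("t 95%", small_95), ("z 95%", large_95),
                  ("t 85%", small_85), ("z 85%", large_85)]:
    print(label, ci, "width", ci[1] - ci[0])
```

Printing the widths side by side makes both comparisons visible: t against z at a fixed level, and 95% against 85% for a fixed form.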

4)  85% confidence interval for sigma^2.  We use the formula

     P( (n-1)*S^2/chi^2[.075] <= sigma^2 <= (n-1)*S^2/chi^2[.925] ) = .85

     where chi^2[p] is the value with upper-tail probability p, so the CDF equals .925 at
     chi^2[.075] and .075 at chi^2[.925].  Both values are found from the ChiSquareCDF
     function, using n-1 = 24 degrees of freedom:

>    ChiSquareCDF(24,14.8526);

.7500240679e-1

>    chisquare925:=14.8526;

chisquare925 := 14.8526

>    ChiSquareCDF(24,34.5723);

.9250004583

>    chisquare075:=34.5723;

chisquare075 := 34.5723

The confidence interval endpoints are:

>    24*S2/chisquare075,24*S2/chisquare925;

2190.567593, 5098.969877

This is a large interval, but that is to be expected from the size of   S^2   here.
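
As a cross-check, the chi-square CDF values and the interval endpoints can be recomputed in plain Python. The sketch relies on the closed-form expression for the chi-square CDF that holds when the degrees of freedom are even (here 24). The helper name `chi2_cdf_even` is made up for this illustration, and the two critical values are copied from the worksheet.

```python
import math
import statistics

c_vals = [164, 272, 261, 248, 235, 192, 203, 278, 268, 230, 242, 305, 286,
          310, 345, 289, 326, 335, 297, 328, 400, 228, 194, 338, 252]


def chi2_cdf_even(x, df):
    """Chi-square CDF for even df: 1 - exp(-x/2) * sum_{k<df/2} (x/2)^k / k!."""
    if df % 2:
        raise ValueError("closed form used here needs an even df")
    half = x / 2
    term = 1.0   # the k = 0 term
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return 1 - math.exp(-half) * total


n = len(c_vals)
s2 = statistics.variance(c_vals)

chisquare925 = 14.8526   # CDF should be close to .075
chisquare075 = 34.5723   # CDF should be close to .925
cdf_low = chi2_cdf_even(chisquare925, n - 1)
cdf_high = chi2_cdf_even(chisquare075, n - 1)

ci = ((n - 1) * s2 / chisquare075, (n - 1) * s2 / chisquare925)
print(cdf_low, cdf_high, ci)
```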

B)  Factors  a  and  b  for the shortest (1 - alpha) x 100% confidence intervals for

variances ( n = sample size )

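>    SigmaCI:=proc(alpha,n)
>      local a,b,eq1,eq2,g,x,s;
>      # g = chi-square density with n-1 degrees of freedom
>      g:=x^((n-1)/2-1)*exp(-x/2)/(2^((n-1)/2)*GAMMA((n-1)/2));
>      # eq1: the interval [a,b] carries probability 1-alpha
>      eq1:=int(g,x=a..b)=1-alpha;
>      # eq2: endpoint condition imposed together with eq1
>      eq2:=evalf(a^(n/2)*exp(-a/2)=b^(n/2)*exp(-b/2));
>      # solve numerically, looking for a below n-1 and b above n-1
>      s:=fsolve({eq1,eq2},{a,b},{a=0..n-1,b=n-1..infinity});
>      return subs(s,[a,b]);
>    end: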

>    sols:=SigmaCI(.05,10);

sols := [3.187436761, 22.91177635]
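
For readers without Maple, the same pair of equations can be solved with nested bisection in plain Python. This is a sketch and not a translation of fsolve. It solves eq1 and eq2 exactly as they are written in the procedure, integrates the chi-square density with Simpson's rule, and uses the hypothetical names `sigma_ci`, `chi2_pdf` and `simpson`.

```python
import math


def chi2_pdf(x, df):
    """Chi-square density with df degrees of freedom (x > 0)."""
    return math.exp((df / 2 - 1) * math.log(x) - x / 2
                    - (df / 2) * math.log(2) - math.lgamma(df / 2))


def simpson(f, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def sigma_ci(alpha, n):
    """Solve eq1 and eq2 of the worksheet's SigmaCI for (a, b)."""
    df = n - 1

    def h(x):
        # log of x^(n/2) * exp(-x/2), the quantity matched in eq2
        return (n / 2) * math.log(x) - x / 2

    def matching_b(a):
        # h rises on (0, n) and falls on (n, inf): one b > n has h(b) = h(a)
        lo, hi = float(n), 2.0 * n
        while h(hi) > h(a):
            hi *= 2
        for _ in range(200):
            mid = (lo + hi) / 2
            if h(mid) > h(a):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    def coverage(a):
        return simpson(lambda x: chi2_pdf(x, df), a, matching_b(a))

    # Coverage shrinks as a moves right toward n, so bisect on a.
    lo, hi = 1e-6, n - 1e-6
    for _ in range(60):
        mid = (lo + hi) / 2
        if coverage(mid) > 1 - alpha:
            lo = mid
        else:
            hi = mid
    a = (lo + hi) / 2
    return a, matching_b(a)


a10, b10 = sigma_ci(0.05, 10)
print(a10, b10)  # compare with sols from the worksheet
```

Bisection is slower than a Newton-type solver but needs no starting guess, because eq2 determines b uniquely from a and the coverage in eq1 is monotone in a.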

n=3  Take S^2 = 1   for comparison here and in all the following computations, so the interval is [(n-1)/b, (n-1)/a].

>    shortestab:=SigmaCI(.05,3);

shortestab := [.1014855529, 15.11133800]

>    best:=[2/shortestab[2],2/shortestab[1]];  best[2]-best[1];

best := [.1323509540, 19.70723855]

19.57488760

For the equal-tail "compromise" interval we use the values chi^2[.025]  and chi^2[.975]  from the book's chi^2  table (n-1 = 2 degrees of freedom)

>    compromise:=[2/7.37776,2/.0506356]; compromise[2]-compromise[1];

compromise := [.2710849906, 39.49790266]

39.22681767

n=4

>    shortestab:=SigmaCI(.05,4);

shortestab := [.3448888076, 15.58938175]

>    best:=[3/shortestab[2],3/shortestab[1]]; best[2]-best[1];

best := [.1924386770, 8.698455657]

8.506016980

>    compromise:=[3/9.3484,3/.215795]; compromise[2]-compromise[1];

compromise := [.3209105301, 13.90208300]

13.58117247

n=5  

>    shortestab:=SigmaCI(.05,5);

shortestab := [.6917773398, 16.57315374]

>    best:=[4/shortestab[2],4/shortestab[1]]; best[2]-best[1];

best := [.2413541842, 5.782207323]

5.540853139

>    compromise:=[4/11.1433,4/.484419]; compromise[2]-compromise[1];

compromise := [.3589600926, 8.257314432]

7.898354339

n=6  

>    shortestab:=SigmaCI(.05,6);

shortestab := [1.109204177, 17.74344392]

>    best:=[5/shortestab[2],5/shortestab[1]]; best[2]-best[1];

best := [.2817942234, 4.507736360]

4.225942137

>    compromise:=[5/12.8325,5/.831211]; compromise[2] - compromise[1];

compromise := [.3896356906, 6.015319816]

5.625684125

n=7  

>    shortestab:=SigmaCI(.05,7);

shortestab := [1.577643685, 18.99555017]

>    best:=[6/shortestab[2],6/shortestab[1]]; best[2]-best[1];

best := [.3158634494, 3.803140124]

3.487276675

>    compromise:=[6/14.4494,6/1.237347]; compromise[2] - compromise[1];

compromise := [.4152421554, 4.849084372]

4.433842217

n = 8

>    shortestab:=SigmaCI(.05,8);

shortestab := [2.085047911, 20.28626614]

>    best:=[7/shortestab[2],7/shortestab[1]]; best[2]-best[1];

best := [.3450610355, 3.357237003]

3.012175968

>    compromise:=[7/16.0128,7/1.68987]; compromise[2] - compromise[1];

compromise := [.4371502798, 4.142330475]

3.705180195

n = 9

>    shortestab:=SigmaCI(.05,9);

shortestab := [2.623486648, 21.59517998]

>    best:=[8/shortestab[2],8/shortestab[1]]; best[2]-best[1];

best := [.3704530366, 3.049377059]

2.678924022

>    compromise:=[8/17.5346,8/2.17973]; compromise[2] - compromise[1];

compromise := [.4562408039, 3.670179334]

3.213938530

n = 10

>    shortestab:=SigmaCI(.05,10);

shortestab := [3.187436761, 22.91177635]

>    best:=[9/shortestab[2],9/shortestab[1]]; best[2]-best[1];

best := [.3928110969, 2.823585430]

2.430774333

>    compromise:=[9/19.0228,9/2.70039]; compromise[2] - compromise[1];

compromise := [.4731164708, 3.332851921]

2.859735450

With the a,b computed by the procedure, we consistently do better than the "compromise"

intervals, but the difference decreases as   n   increases.
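
The comparison can be tabulated in one pass. The sketch below takes the a,b pairs printed by SigmaCI and the chi^2 table values used above, computes both interval lengths with S^2 = 1, and prints the gap for each n. The dictionary and function names are made up for this illustration.

```python
# n -> (a, b) as printed by SigmaCI(.05, n) in the worksheet
shortest_ab = {
    3: (0.1014855529, 15.11133800),
    4: (0.3448888076, 15.58938175),
    5: (0.6917773398, 16.57315374),
    6: (1.109204177, 17.74344392),
    7: (1.577643685, 18.99555017),
    8: (2.085047911, 20.28626614),
    9: (2.623486648, 21.59517998),
    10: (3.187436761, 22.91177635),
}

# n -> (chi^2[.975], chi^2[.025]) with n-1 d.f., as used in the worksheet
table_values = {
    3: (0.0506356, 7.37776),
    4: (0.215795, 9.3484),
    5: (0.484419, 11.1433),
    6: (0.831211, 12.8325),
    7: (1.237347, 14.4494),
    8: (1.68987, 16.0128),
    9: (2.17973, 17.5346),
    10: (2.70039, 19.0228),
}


def length(n, low_q, high_q):
    """Length of [(n-1)/high_q, (n-1)/low_q], i.e. the interval with S^2 = 1."""
    return (n - 1) / low_q - (n - 1) / high_q


rows = []
for n in sorted(shortest_ab):
    best = length(n, *shortest_ab[n])
    compromise = length(n, *table_values[n])
    rows.append((n, best, compromise, compromise - best))
    print(f"n={n:2d}  best={best:9.5f}  compromise={compromise:9.5f}  "
          f"gap={compromise - best:8.5f}")
```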

C)  Confidence intervals for ratio of variances of subpopulations (each normal, samples independent):

>    gp1:=[49.,108,110,82,93,114,134,114,96,52,101,114,120,116];

gp1 := [49., 108, 110, 82, 93, 114, 134, 114, 96, 52, 101, 114, 120, 116]

>    nops(gp1);

14

>    Xbar1:=Mean(gp1);

Xbar1 := 100.2142857

>    S12:=(StandardDeviation(gp1))^2;

S12 := 604.4890154

>    TCDF(13,2.1605);

.9750060028

>    tval1:=2.1605;

tval1 := 2.1605

>    Xbar1-tval1*sqrt(S12)/sqrt(14),Xbar1+tval1*sqrt(S12)/sqrt(14);

86.01768199, 114.4108894

>    gp2:=[133.,108,93,119,119,98,106,87,153,116,129,97,110,131];

gp2 := [133., 108, 93, 119, 119, 98, 106, 87, 153, 116, 129, 97, 110, 131]

>    Mean(gp2);

114.2142857

>    S22:=(StandardDeviation(gp2))^2;

S22 := 329.2582463


We know   (S[1]^2/sigma[1]^2) / (S[2]^2/sigma[2]^2)   has an  F(13,13)  distribution, since it is the ratio of two
independent chi^2   random variables, each divided by its 13 degrees of freedom.  So the appropriate
confidence interval endpoints for   sigma[1]^2/sigma[2]^2   can be found as in the case of a single variance.
After some algebra, the form is

              S[1]^2/(S[2]^2*f[alpha/2])       ,      S[1]^2/(S[2]^2*f[1-alpha/2])

where, by analogy with the z, t, chi^2   scores,   f[alpha/2]   is the value such that
an   F -distributed random variable satisfies  P( F  > f[alpha/2] ) = alpha/2,   and similarly for   f[1-alpha/2].

>    FCDF(13,13,.3211);

.2502286534e-1

>    f975:=.3211;

f975 := .3211

>    FCDF(13,13,3.1153);

.9750081492

>    f025:=3.1153;

f025 := 3.1153

The endpoints of the confidence interval are:

>    1/f025*S12/S22,1/f975*S12/S22;

.5893209463, 5.717569431

>    S12/S22;

1.835911544

The estimate   S[1]^2/(S[2]^2)   is greater than 1, but the 95% interval also contains values less than 1
(in particular, it contains 1), so no conclusion should be drawn about whether the variance in the
birthweight is greater for mothers who do not receive at least 5 prenatal care visits.
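
The F-based interval can also be checked without the MSP package. The sketch below integrates the F(13,13) density with Simpson's rule to confirm the two CDF values, then recomputes the endpoints from the data. The helper names `f_pdf` and `f_cdf` are made up for this illustration, and the critical values are copied from the worksheet.

```python
import math
import statistics

gp1 = [49, 108, 110, 82, 93, 114, 134, 114, 96, 52, 101, 114, 120, 116]
gp2 = [133, 108, 93, 119, 119, 98, 106, 87, 153, 116, 129, 97, 110, 131]


def f_pdf(x, d1, d2):
    """Density of the F(d1, d2) distribution; assumes d1 > 2 so f(0) = 0."""
    if x == 0:
        return 0.0
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    return math.exp((d1 / 2) * math.log(d1 / d2)
                    + (d1 / 2 - 1) * math.log(x)
                    - ((d1 + d2) / 2) * math.log1p(d1 * x / d2)
                    - log_beta)


def f_cdf(x, d1, d2, steps=2000):
    """P(F <= x) by composite Simpson integration over [0, x]."""
    h = x / steps  # steps must be even
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return total * h / 3


f975, f025 = 0.3211, 3.1153           # critical values from the worksheet
d1, d2 = len(gp1) - 1, len(gp2) - 1   # 13 and 13 degrees of freedom
cdf_low = f_cdf(f975, d1, d2)         # should be close to .025
cdf_high = f_cdf(f025, d1, d2)        # should be close to .975

ratio = statistics.variance(gp1) / statistics.variance(gp2)
ci = (ratio / f025, ratio / f975)
print(cdf_low, cdf_high, ratio, ci, ci[0] < 1 < ci[1])
```

The final printed flag records whether the interval contains 1, which is the fact the conclusion above rests on.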