
MATH 376 -- Probability and Statistics II

Lab Project 2 Solutions

March 5, 2004

> read "/home/fac/little/public_html/ProbStat/MaplePackage/MSP.map";

A) Confidence intervals for the mean

1)

> CVals:=[164.,272,261,248,235,192,203,278,268,230,242,305,286,310,345,289,326,335,297,328,400,228,194,338,252];

> nops(CVals);

> Cbar:=Mean(CVals);

> S:=StandardDeviation(CVals);

> S2:=S^2;

2) to 4 decimal places:

> tval:=TCDF(10,1.5593);

to 4 decimal places:

> NormalCDF(0,1,1.4396);

3) Small sample 95% c.i.

> tval:=TCDF(24,2.064);

> Cbar-2.064*S/sqrt(25),Cbar+2.064*S/sqrt(25);

Small sample 85% c.i.

> tval:=TCDF(24,1.4872);

> Cbar-1.4872*S/sqrt(25),Cbar+1.4872*S/sqrt(25);
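The two small-sample intervals above can also be reproduced outside Maple. The following is a minimal Python sketch, assuming that `StandardDeviation` in the MSP package uses the usual n - 1 divisor; the helper name `t_interval` is hypothetical, and the critical values 2.064 and 1.4872 are taken from the worksheet rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Data copied from CVals in the worksheet.
c_vals = [164, 272, 261, 248, 235, 192, 203, 278, 268, 230, 242, 305, 286,
          310, 345, 289, 326, 335, 297, 328, 400, 228, 194, 338, 252]


def t_interval(data, t_crit):
    """Return (lower, upper) for xbar +/- t_crit * s / sqrt(n)."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1, an assumption)
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


ci95 = t_interval(c_vals, 2.064)   # t critical value for 95%, 24 d.f.
ci85 = t_interval(c_vals, 1.4872)  # t critical value for 85%, 24 d.f.
print("95%:", ci95)
print("85%:", ci85)
```

Both intervals are centered at the sample mean, and the 85% interval sits inside the 95% interval.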

Large sample c.i.'s

> MeanLSCI(CVals,.05);

> MeanLSCI(CVals,.15);

So the small sample (t) form gives a slightly wider interval than the large sample (normal) form.
This is expected, since the t-distribution has more probability mass in its tails than the standard normal.
Also, increasing the confidence level widens the interval (it has to include more cases, so to speak).
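The comparison between the two forms can be checked numerically. The sketch below assumes that `MeanLSCI` computes the usual large-sample interval, the sample mean plus or minus z times S/sqrt(n); that is an assumption about the MSP package, not something the worksheet shows. The t critical values are the ones used above.

```python
from math import sqrt
from statistics import NormalDist, stdev

# Data copied from CVals in the worksheet.
c_vals = [164, 272, 261, 248, 235, 192, 203, 278, 268, 230, 242, 305, 286,
          310, 345, 289, 326, 335, 297, 328, 400, 228, 194, 338, 252]
std_err = stdev(c_vals) / sqrt(len(c_vals))

# t critical values checked with TCDF in the worksheet (24 d.f.).
t_crit = {0.95: 2.064, 0.85: 1.4872}

widths = {}
for conf, t in t_crit.items():
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # standard normal quantile
    widths[conf] = {"t": 2 * t * std_err, "z": 2 * z * std_err}
    print(f"{conf:.0%}: t width = {widths[conf]['t']:.2f}, "
          f"z width = {widths[conf]['z']:.2f}")
```

At each confidence level the t-based width exceeds the z-based width, and both widths grow with the confidence level.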

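4) 85% confidence interval for $\sigma^2$. We use the formula

$P\left( (n-1)S^2/\chi^2_{.075} \le \sigma^2 \le (n-1)S^2/\chi^2_{.925} \right) = .85$

The values $\chi^2_{.075}$ and $\chi^2_{.925}$ (the points with upper-tail areas .075 and .925) are found from the ChiSquareCDF

function, using $n - 1 = 24$ degrees of freedom: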

> ChiSquareCDF(24,14.8526);

> chisquare925:=14.8526;

> ChiSquareCDF(24,34.5723);

> chisquare075:=34.5723;

The confidence interval endpoints are:

> 24*S2/chisquare075,24*S2/chisquare925;

This is a large interval, but that is to be expected from the size of $S^2$ here.
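As a cross-check of the endpoint formula, here is a minimal Python sketch of the same computation. It assumes the sample variance uses the n - 1 divisor, and it takes the two chi-square critical values from the worksheet instead of computing them.

```python
from statistics import variance

# Data copied from CVals in the worksheet.
c_vals = [164, 272, 261, 248, 235, 192, 203, 278, 268, 230, 242, 305, 286,
          310, 345, 289, 326, 335, 297, 328, 400, 228, 194, 338, 252]

n = len(c_vals)
s2 = variance(c_vals)   # sample variance S^2 (divisor n - 1)

chi2_upper = 34.5723    # chisquare075: upper-tail area .075, 24 d.f.
chi2_lower = 14.8526    # chisquare925: upper-tail area .925, 24 d.f.

# The larger critical value gives the lower endpoint, and vice versa.
lower = (n - 1) * s2 / chi2_upper
upper = (n - 1) * s2 / chi2_lower
print(lower, s2, upper)
```

The interval is not symmetric about the sample variance, because the chi-square distribution is skewed.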

B) Factors a and b for the shortest (1 - alpha) x 100% confidence intervals for

variances (n = sample size)

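> SigmaCI:=proc(alpha,n)
    local a,b,eq1,eq2,g,x,s;
    # g is the chi-square density with n-1 degrees of freedom
    g:=x^((n-1)/2-1)*exp(-x/2)/(2^((n-1)/2)*GAMMA((n-1)/2));
    # coverage condition: P(a <= chi-square <= b) = 1 - alpha
    eq1:=int(g,x=a..b)=1-alpha;
    # minimizing the length factor 1/a - 1/b subject to eq1 requires
    # a^2*g(a) = b^2*g(b), i.e. equal values of x^((n+1)/2)*exp(-x/2)
    eq2:=evalf(a^((n+1)/2)*exp(-a/2)=b^((n+1)/2)*exp(-b/2));
    s:=fsolve({eq1,eq2},{a,b},{a=0..n-1,b=n-1..infinity});
    return subs(s,[a,b]);
  end: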

> sols:=SigmaCI(.05,10);

n=3

Take $S^2 = 1$ for comparison here and in all the following computations.

> shortestab:=SigmaCI(.05,3);

> best:=[2/shortestab[2],2/shortestab[1]]; best[2]-best[1];

We use the values $\chi^2_{.025}$ and $\chi^2_{.975}$ from the book table (n-1 = 2 degrees of freedom)

> compromise:=[2/7.37776,2/.0506356]; compromise[2]-compromise[1];

n=4

> shortestab:=SigmaCI(.05,4);

> best:=[3/shortestab[2],3/shortestab[1]]; best[2]-best[1];

> compromise:=[3/9.3484,3/.215795]; compromise[2]-compromise[1];

n=5

> shortestab:=SigmaCI(.05,5);

> best:=[4/shortestab[2],4/shortestab[1]]; best[2]-best[1];

> compromise:=[4/11.1433,4/.484419]; compromise[2]-compromise[1];

n=6

> shortestab:=SigmaCI(.05,6);

> best:=[5/shortestab[2],5/shortestab[1]]; best[2]-best[1];

> compromise:=[5/12.8325,5/.831211]; compromise[2] - compromise[1];

n=7

> shortestab:=SigmaCI(.05,7);

> best:=[6/shortestab[2],6/shortestab[1]]; best[2]-best[1];

> compromise:=[6/14.4494,6/1.237347]; compromise[2] - compromise[1];

n = 8

> shortestab:=SigmaCI(.05,8);

> best:=[7/shortestab[2],7/shortestab[1]]; best[2]-best[1];

> compromise:=[7/16.0128,7/1.68987]; compromise[2] - compromise[1];

n = 9

> shortestab:=SigmaCI(.05,9);

> best:=[8/shortestab[2],8/shortestab[1]]; best[2]-best[1];

> compromise:=[8/17.5346,8/2.17973]; compromise[2] - compromise[1];

n = 10

> shortestab:=SigmaCI(.05,10);

> best:=[9/shortestab[2],9/shortestab[1]]; best[2]-best[1];

> compromise:=[9/19.0228,9/2.70039]; compromise[2] - compromise[1];

With the a, b computed by the procedure, we consistently do better than the "compromise" (equal-tail) intervals,

but the difference decreases as n increases.
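The lengths of the equal-tail intervals used in the comparisons above can be tabulated in one pass. This is a small Python sketch using only the table values quoted in the worksheet, with the sample variance taken to be 1 so that the endpoints are (n-1)/chi-square values; the names `table` and `compromise_length` are hypothetical.

```python
# n -> (upper 2.5% point, lower 2.5% point) of chi-square with n - 1 d.f.,
# copied from the "compromise" lines of the worksheet.
table = {
    3: (7.37776, 0.0506356),
    4: (9.3484, 0.215795),
    5: (11.1433, 0.484419),
    6: (12.8325, 0.831211),
    7: (14.4494, 1.237347),
    8: (16.0128, 1.68987),
    9: (17.5346, 2.17973),
    10: (19.0228, 2.70039),
}


def compromise_length(n):
    """Length of the equal-tail 95% interval for the variance when S^2 = 1."""
    upper_crit, lower_crit = table[n]
    df = n - 1
    return df / lower_crit - df / upper_crit


lengths = {n: compromise_length(n) for n in table}
for n, length in lengths.items():
    print(n, round(length, 3))
```

The lengths shrink quickly as n grows, which leaves less room for the shortest interval to improve on the equal-tail one.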

C) Confidence intervals for ratio of variances of subpopulations (each normal, samples independent):

> gp1:=[49.,108,110,82,93,114,134,114,96,52,101,114,120,116];

> nops(gp1);

> Xbar1:=Mean(gp1);

> S12:=(StandardDeviation(gp1))^2;

> TCDF(13,2.1605);

> tval1:=2.1605;

> Xbar1-tval1*sqrt(S12)/sqrt(14),Xbar1+tval1*sqrt(S12)/sqrt(14);

> gp2:=[133.,108,93,119,119,98,106,87,153,116,129,97,110,131];

> Mean(gp2);

> S22:=(StandardDeviation(gp2))^2;

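We know $\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$ has an F(13,13) distribution (since it is the ratio of two

independent chi-square random variables, each divided by its 13 degrees of freedom). So the appropriate confidence

interval endpoints can be found as in the case of a single variance. After some algebra,

the form of the 95% interval for $\sigma_1^2/\sigma_2^2$ is

$\left( \frac{1}{F_{.025}} \cdot \frac{S_1^2}{S_2^2},\ \frac{1}{F_{.975}} \cdot \frac{S_1^2}{S_2^2} \right)$,

where, by analogy with the z and t scores, $F_{.025}$ is the value such that

an F-distributed random variable satisfies $P(F > F_{.025}) = .025$, and similarly for $F_{.975}$.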
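For reference, the algebra behind this interval can be written out as follows. The notation is the usual one and is an assumption where the worksheet's symbols were lost: $S_1^2, S_2^2$ are the sample variances, $\sigma_1^2, \sigma_2^2$ the population variances, and $F_{\alpha}$ is the point with upper-tail area $\alpha$ for the F(13,13) distribution.

```latex
\begin{align*}
.95 &= P\left( F_{.975} \le \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \le F_{.025} \right) \\
    &= P\left( F_{.975} \le \frac{S_1^2}{S_2^2} \cdot \frac{\sigma_2^2}{\sigma_1^2} \le F_{.025} \right) \\
    &= P\left( F_{.975}\, \frac{S_2^2}{S_1^2} \le \frac{\sigma_2^2}{\sigma_1^2} \le F_{.025}\, \frac{S_2^2}{S_1^2} \right) \\
    &= P\left( \frac{1}{F_{.025}} \cdot \frac{S_1^2}{S_2^2} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{1}{F_{.975}} \cdot \frac{S_1^2}{S_2^2} \right)
\end{align*}
```

The last step takes reciprocals of positive quantities, which reverses the inequalities and swaps the roles of the two critical values.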

> FCDF(13,13,.3211);

> f975:=.3211;

> FCDF(13,13,3.1153);

> f025:=3.1153;

The endpoints of the confidence interval are:

> 1/f025*S12/S22,1/f975*S12/S22;

> S12/S22;

The point estimate $S_1^2/S_2^2$ is greater than 1, but the interval also contains values less than 1, so no conclusion

should be drawn about whether the variance in the birthweight is greater for mothers who do not

receive at least 5 prenatal care visits.
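The interval for the ratio of variances can be reproduced with a short Python sketch. It assumes sample variances with the n - 1 divisor and takes the F(13,13) critical values .3211 and 3.1153 from the worksheet rather than computing them.

```python
from statistics import variance

# Data copied from gp1 and gp2 in the worksheet.
gp1 = [49, 108, 110, 82, 93, 114, 134, 114, 96, 52, 101, 114, 120, 116]
gp2 = [133, 108, 93, 119, 119, 98, 106, 87, 153, 116, 129, 97, 110, 131]

s12 = variance(gp1)  # S_1^2 (divisor n - 1)
s22 = variance(gp2)  # S_2^2 (divisor n - 1)
ratio = s12 / s22    # point estimate of sigma_1^2 / sigma_2^2

f025 = 3.1153        # upper 2.5% point of F(13,13)
f975 = 0.3211        # lower 2.5% point of F(13,13)

lower = ratio / f025
upper = ratio / f975
contains_one = lower < 1 < upper

print("ratio:", ratio)
print("95% interval:", (lower, upper))
print("interval contains 1:", contains_one)
```

Since both samples have 13 degrees of freedom, the two critical values are reciprocals of each other up to rounding.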