MATH 376 -- Probability and Statistics II
Lab Project 2 Solutions
March 5, 2004
> | read "/home/fac/little/public_html/ProbStat/MaplePackage/MSP.map"; |
A) Confidence intervals for the mean
1)
> | CVals:=[164.,272,261,248,235,192,203,278,268,230,242,305,286,310,345,289,326,335,297,328,400,228,194,338,252]; |
> | nops(CVals); |
> | Cbar:=Mean(CVals); |
> | S:=StandardDeviation(CVals); |
> | S2:=S^2; |
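As a cross-check that does not depend on the MSP package, the same summary statistics can be computed in Python with the standard library alone. This is a sketch under one assumption: that StandardDeviation uses the n - 1 divisor, which is consistent with the factor 24 used in the variance interval later on. The names c_vals, c_bar, s and s2 are illustrative only.

```python
from statistics import mean, stdev, variance

# Data copied from the worksheet's CVals list.
c_vals = [164., 272, 261, 248, 235, 192, 203, 278, 268, 230, 242, 305, 286,
          310, 345, 289, 326, 335, 297, 328, 400, 228, 194, 338, 252]

n = len(c_vals)          # sample size
c_bar = mean(c_vals)     # sample mean
s = stdev(c_vals)        # sample standard deviation (n - 1 divisor, assumed)
s2 = variance(c_vals)    # sample variance (n - 1 divisor, assumed)

print(n, c_bar, s, s2)
```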
2) to 4 decimal places:
> | tval:=TCDF(10,1.5593); |
to 4 decimal places:
> | NormalCDF(0,1,1.4396); |
3) Small sample 95% c.i.
> | tval:=TCDF(24,2.064); |
> | Cbar-2.064*S/sqrt(25),Cbar+2.064*S/sqrt(25); |
Small sample 85% c.i.
> | tval:=TCDF(24,1.4872); |
> | Cbar-1.4872*S/sqrt(25),Cbar+1.4872*S/sqrt(25); |
Large sample c.i.'s
> | MeanLSCI(CVals,.05); |
> | MeanLSCI(CVals,.15); |
So the small sample form gives a slightly larger interval than the large sample form
(expected -- t-distribution has more probability mass in tails than standard normal). Also,
increasing the confidence level increases the size of the interval (including more cases
so to speak).
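The comparison just described can be reproduced with a short Python sketch. The t critical values are the ones used in the worksheet (24 degrees of freedom); the normal critical values come from the standard library. It is assumed here, not confirmed by the worksheet, that MeanLSCI uses the normal critical value with the same S. All variable names are illustrative.

```python
import math
from statistics import NormalDist, mean, stdev

c_vals = [164., 272, 261, 248, 235, 192, 203, 278, 268, 230, 242, 305, 286,
          310, 345, 289, 326, 335, 297, 328, 400, 228, 194, 338, 252]

n = len(c_vals)
c_bar = mean(c_vals)
se = stdev(c_vals) / math.sqrt(n)   # estimated standard error of the mean

# t critical values taken from the worksheet, keyed by confidence level in %.
t_crit = {95: 2.064, 85: 1.4872}

widths = {}   # level -> (small-sample width, large-sample width)
for level, t in t_crit.items():
    alpha = 1 - level / 100
    z = NormalDist().inv_cdf(1 - alpha / 2)   # standard normal critical value
    small = (c_bar - t * se, c_bar + t * se)
    large = (c_bar - z * se, c_bar + z * se)
    widths[level] = (small[1] - small[0], large[1] - large[0])
    print(level, small, large)
```

In both cases the t-based interval is the wider one, and the 95% intervals are wider than the 85% intervals.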
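3) 85% confidence interval for $\sigma^2$. We use the formula
$P\left( (n-1)S^2/\chi^2_{.075} \le \sigma^2 \le (n-1)S^2/\chi^2_{.925} \right) = .85$
The values $\chi^2_{.075}$ and $\chi^2_{.925}$ are found from the ChiSquareCDF
function, using $n - 1 = 24$ degrees of freedom: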
> | ChiSquareCDF(24,14.8526); |
> | chisquare925:=14.8526; |
> | ChiSquareCDF(24,34.5723); |
> | chisquare075:=34.5723; |
The confidence interval endpoints are:
> | 24*S2/chisquare075,24*S2/chisquare925; |
This is a large interval, but that is to be expected from the size of $S^2$ here.
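The endpoint arithmetic can be checked independently with a few lines of Python. The two chi-square cutoffs are copied from the worksheet rather than recomputed, and the sample variance is assumed to use the n - 1 divisor. Variable names are illustrative.

```python
from statistics import variance

c_vals = [164., 272, 261, 248, 235, 192, 203, 278, 268, 230, 242, 305, 286,
          310, 345, 289, 326, 335, 297, 328, 400, 228, 194, 338, 252]

n = len(c_vals)
s2 = variance(c_vals)        # sample variance, n - 1 divisor (assumed)

# Cutoffs from the worksheet, 24 degrees of freedom.
chisq_075 = 34.5723          # upper-tail probability .075
chisq_925 = 14.8526          # upper-tail probability .925

lower = (n - 1) * s2 / chisq_075
upper = (n - 1) * s2 / chisq_925
print(lower, s2, upper)
```

The larger cutoff goes under the lower endpoint, so the point estimate of the variance falls between the two endpoints.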
B) Factors a and b for the shortest (1 - alpha) x 100% confidence intervals for
variances (n = sample size)
> | SigmaCI:=proc(alpha,n) |
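> | local a,b,eq1,eq2,g,x,s; |
> | # g: chi-square density with n-1 degrees of freedom |
> | g:=x^((n-1)/2-1)*exp(-x/2)/(2^((n-1)/2)*GAMMA((n-1)/2)); |
> | # eq1: the interval [a,b] must carry probability 1-alpha |
> | eq1:=int(g,x=a..b)=1-alpha; |
> | # eq2: balance condition linking a and b |
> | eq2:=evalf(a^(n/2)*exp(-a/2)=b^(n/2)*exp(-b/2)); |
> | s:=fsolve({eq1,eq2},{a,b},{a=0..n-1,b=n-1..infinity}); |
> | return subs(s,[a,b]); |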
> | end: |
> | sols:=SigmaCI(.05,10); |
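For readers without Maple, the two equations solved by SigmaCI can also be solved in Python with the standard library. This is only a sketch: it replaces fsolve with nested bisection and the exact integral with Simpson's rule. It brackets a below n and b above n (the point where x^(n/2)*exp(-x/2) peaks), which differs slightly from the ranges passed to fsolve. The function names are hypothetical.

```python
import math

def chi2_pdf(x, df):
    """Chi-square density with df degrees of freedom, for x > 0."""
    half = df / 2.0
    return math.exp((half - 1.0) * math.log(x) - x / 2.0
                    - half * math.log(2.0) - math.lgamma(half))

def simpson(f, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def log_h(x, n):
    """Logarithm of x^(n/2) * exp(-x/2), the quantity balanced in eq2."""
    return (n / 2.0) * math.log(x) - x / 2.0

def partner(a, n):
    """For 0 < a < n, find b > n with log_h(b, n) == log_h(a, n)."""
    target = log_h(a, n)
    lo, hi = float(n), float(n) + 1.0
    while log_h(hi, n) > target:      # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):              # bisection on the decreasing branch
        mid = (lo + hi) / 2.0
        if log_h(mid, n) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def sigma_ci_factors(alpha, n):
    """Solve eq1 and eq2 for (a, b) by bisection on a."""
    lo, hi = 0.0, float(n)
    for _ in range(60):
        a = (lo + hi) / 2.0
        b = partner(a, n)
        cover = simpson(lambda x: chi2_pdf(x, n - 1), a, b)
        if cover > 1.0 - alpha:
            lo = a                    # too much probability: move a right
        else:
            hi = a                    # too little probability: move a left
    a = (lo + hi) / 2.0
    return a, partner(a, n)

a, b = sigma_ci_factors(0.05, 10)
print(a, b)
```

The bisection on a relies on the coverage shrinking as a moves toward n, since its partner b moves toward n at the same time.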
n=3
Take $S^2 = 1$ for comparison here and in all the following computations.
> | shortestab:=SigmaCI(.05,3); |
> | best:=[2/shortestab[2],2/shortestab[1]]; best[2]-best[1]; |
We use the values $\chi^2_{.025}$ and $\chi^2_{.975}$ from the book table (n-1 = 2 degrees of freedom)
> | compromise:=[2/7.37776,2/.0506356]; compromise[2]-compromise[1]; |
n=4
> | shortestab:=SigmaCI(.05,4); |
> | best:=[3/shortestab[2],3/shortestab[1]]; best[2]-best[1]; |
> | compromise:=[3/9.3484,3/.215795]; compromise[2]-compromise[1]; |
n=5
> | shortestab:=SigmaCI(.05,5); |
> | best:=[4/shortestab[2],4/shortestab[1]]; best[2]-best[1]; |
> | compromise:=[4/11.1433,4/.484419]; compromise[2]-compromise[1]; |
n=6
> | shortestab:=SigmaCI(.05,6); |
> | best:=[5/shortestab[2],5/shortestab[1]]; best[2]-best[1]; |
> | compromise:=[5/12.8325,5/.831211]; compromise[2] - compromise[1]; |
n=7
> | shortestab:=SigmaCI(.05,7); |
> | best:=[6/shortestab[2],6/shortestab[1]]; best[2]-best[1]; |
> | compromise:=[6/14.4494,6/1.237347]; compromise[2] - compromise[1]; |
n = 8
> | shortestab:=SigmaCI(.05,8); |
> | best:=[7/shortestab[2],7/shortestab[1]]; best[2]-best[1]; |
> | compromise:=[7/16.0128,7/1.68987]; compromise[2] - compromise[1]; |
n = 9
> | shortestab:=SigmaCI(.05,9); |
> | best:=[8/shortestab[2],8/shortestab[1]]; best[2]-best[1]; |
> | compromise:=[8/17.5346,8/2.17973]; compromise[2] - compromise[1]; |
n = 10
> | shortestab:=SigmaCI(.05,10); |
> | best:=[9/shortestab[2],9/shortestab[1]]; best[2]-best[1]; |
> | compromise:=[9/19.0228,9/2.70039]; compromise[2] - compromise[1]; |
We are consistently doing better than the "compromise" intervals with the a,b computed by
the procedure, but the difference decreases as n increases.
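To see the trend in one place, the lengths of the compromise intervals for n = 3 through 10 can be tabulated with a short Python sketch. It uses only the book-table values quoted in the computations above, with the same numerators n - 1. The dictionary layout and names are illustrative.

```python
# n -> (chi-square cutoff with upper-tail .025, cutoff with upper-tail .975),
# each for n - 1 degrees of freedom, copied from the computations above.
table = {
    3: (7.37776, 0.0506356),
    4: (9.3484, 0.215795),
    5: (11.1433, 0.484419),
    6: (12.8325, 0.831211),
    7: (14.4494, 1.237347),
    8: (16.0128, 1.68987),
    9: (17.5346, 2.17973),
    10: (19.0228, 2.70039),
}

lengths = {}
for n, (upper_cut, lower_cut) in table.items():
    df = n - 1
    # Interval is [df/upper_cut, df/lower_cut]; store its length.
    lengths[n] = df / lower_cut - df / upper_cut
    print(n, df / upper_cut, df / lower_cut, lengths[n])
```

The compromise lengths shrink quickly as n grows, which leaves less room for the procedure's a,b to improve on them.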
C) Confidence intervals for ratio of variances of subpopulations (each normal, samples independent):
> | gp1:=[49.,108,110,82,93,114,134,114,96,52,101,114,120,116]; |
> | nops(gp1); |
> | Xbar1:=Mean(gp1); |
> | S12:=(StandardDeviation(gp1))^2; |
> | TCDF(13,2.1605); |
> | tval1:=2.1605; |
> | Xbar1-tval1*sqrt(S12)/sqrt(14),Xbar1+tval1*sqrt(S12)/sqrt(14); |
> | gp2:=[133.,108,93,119,119,98,106,87,153,116,129,97,110,131]; |
> | Mean(gp2); |
> | S22:=(StandardDeviation(gp2))^2; |
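We know $\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$ has an F(13,13) distribution (since it is the ratio of two
independent $\chi^2$ random variables with 13 degrees of freedom each, each divided by its degrees of freedom). So the appropriate confidence
interval endpoints can be found as in the case of a single variance. After some algebra,
the form is
$\left( \frac{1}{F_{.025}} \cdot \frac{S_1^2}{S_2^2}, \; \frac{1}{F_{.975}} \cdot \frac{S_1^2}{S_2^2} \right)$,
where, by analogy with the z, t, $\chi^2$ scores, $F_{.025}$ is the value such that
an F-distributed random variable satisfies $P(F > F_{.025}) = .025$, and similarly for $F_{.975}$.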
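The algebra can be written out as follows. This sketch assumes the 95% level implied by the cutoffs used below, and writes $F_{\alpha}$ for the point with upper-tail probability $\alpha$ under the F(13,13) distribution.

```latex
\begin{align*}
F &= \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}
   = \frac{S_1^2}{S_2^2}\cdot\frac{\sigma_2^2}{\sigma_1^2} \sim F(13,13), \\
.95 &= P\left( F_{.975} \le \frac{S_1^2}{S_2^2}\cdot\frac{\sigma_2^2}{\sigma_1^2} \le F_{.025} \right) \\
    &= P\left( \frac{1}{F_{.025}}\cdot\frac{S_1^2}{S_2^2} \le \frac{\sigma_1^2}{\sigma_2^2}
       \le \frac{1}{F_{.975}}\cdot\frac{S_1^2}{S_2^2} \right).
\end{align*}
```

The last step divides through by $S_1^2/S_2^2$ and takes reciprocals, which reverses the inequalities and swaps the two cutoffs.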
> | FCDF(13,13,.3211); |
> | f975:=.3211; |
> | FCDF(13,13,3.1153); |
> | f025:=3.1153; |
The endpoints of the confidence interval are:
> | 1/f025*S12/S22,1/f975*S12/S22; |
> | S12/S22; |
The estimator value is greater than 1, but the interval contains numbers < 1 too, so no conclusion
should be drawn about whether the variance in the birthweight is greater for mothers who do not
receive at least 5 prenatal care visits.
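The ratio and its interval can be re-derived in Python as a check on the conclusion. The two F(13,13) cutoffs are copied from the worksheet rather than recomputed, and the sample variances are assumed to use the n - 1 divisor. Variable names are illustrative.

```python
from statistics import variance

gp1 = [49., 108, 110, 82, 93, 114, 134, 114, 96, 52, 101, 114, 120, 116]
gp2 = [133., 108, 93, 119, 119, 98, 106, 87, 153, 116, 129, 97, 110, 131]

s1_sq = variance(gp1)      # sample variance of group 1 (n - 1 divisor, assumed)
s2_sq = variance(gp2)      # sample variance of group 2 (n - 1 divisor, assumed)
ratio = s1_sq / s2_sq      # point estimate of the variance ratio

# Cutoffs from the worksheet, F distribution with (13, 13) degrees of freedom.
f_975 = 0.3211             # upper-tail probability .975
f_025 = 3.1153             # upper-tail probability .025

lower = ratio / f_025
upper = ratio / f_975
contains_one = lower < 1 < upper
print(ratio, lower, upper, contains_one)
```

Because the interval straddles 1, the data are compatible with equal variances in the two groups.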