
MATH 376 -- Probability and Statistics II

Type II Error Probabilities for a large-sample lower-tail Z-test

March 25, 2010

We consider testing for the value of a proportion p. Say the null
hypothesis is $H_0$: p = .3 and the alternative hypothesis is
$H_a$: p < .3. We will assume n is always large enough
that our general discussion of large-sample tests applies.
The test statistic is $z = \frac{Y/n - .3}{\sqrt{(.3)(.7)/n}}$, and the rejection region for
an $\alpha = .05$-level test is $RR = \{ z < -z_{.05} \} = \{ z < -1.645 \}$, or equivalently $\{ Y/n < .3 - 1.645\sqrt{(.3)(.7)/n} \}$.
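The two forms of the rejection region are the same set, written once on the z-scale and once on the Y/n-scale. As a worked step, assuming the usual large-sample statistic for a proportion (its standard deviation matches sigma[1] in the set-up that follows):

```latex
z = \frac{Y/n - .3}{\sqrt{(.3)(.7)/n}} < -1.645
\quad\Longleftrightarrow\quad
\frac{Y}{n} < .3 - 1.645\,\sqrt{\frac{(.3)(.7)}{n}}
```

Multiplying both sides by the positive quantity $\sqrt{(.3)(.7)/n}$ preserves the direction of the inequality, and adding .3 gives the cut-off on the Y/n-scale.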

Set-up:

> sigma[1]:=n->sqrt((.3)*(.7)/n);

(1)

The normal pdf for Y/n under the assumption that $H_0$ is true:

> f[1]:=n->1/sqrt(2*Pi*sigma[1](n)^2)*exp(-(y-.3)^2/(2*sigma[1](n)^2));

n -> exp(-(1/2)*(y-.3)^2/sigma[1](n)^2)/sqrt(2*Pi*sigma[1](n)^2) (2)

But now suppose that the actual value of p is .24. We ask: How likely are
we to make a Type II error with the $\alpha$ = .05-level rejection region given above?

Here's the normal pdf that approximates the distribution with p = .24:

> sigma[2]:=n->sqrt((.24)*(.76)/n);

(3)

> f[2]:=n->1/sqrt(2*Pi*sigma[2](n)^2)*exp(-(y-.24)^2/(2*sigma[2](n)^2));

n -> exp(-(1/2)*(y-.24)^2/sigma[2](n)^2)/sqrt(2*Pi*sigma[2](n)^2) (4)

> RR:=n->.3-1.645*sigma[1](n);

(5)

The following plot shows the normal densities for the sampling distribution

of Y/100:

blue plot (shorter normal curve) = density if $H_0$ is true

red plot (taller normal curve) = density if $H_0$ is false and actual p = .24

The vertical black line is plotted at the value .2246 (the upper edge of the RR, taking
n = 100).

> RR(100);

(6)

> Dists:=plot([f[1](100),f[2](100)],y=0..0.5,color=[blue,red]):

> RRPlot:=plot([RR(100),t,t=0..10],color=black):

> with(plots):

> display(Dists,RRPlot);

[Plot: the two normal densities for n = 100, with the vertical line at the upper edge of the RR]

The area under the shorter normal curve to the left of the vertical line is the probability

of a Type I error:

> evalf(Int(f[1](100),y=-infinity..RR(100)));

(7)

This checks with the desired value $\alpha$ = .05. The area under the taller normal curve to the
right of the vertical line is the probability of a Type II error (always assuming p = .24):

> evalf(Int(f[2](100),y=RR(100)..infinity));

(8)
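The two areas above can also be cross-checked with the normal CDF instead of numerical integration. What follows is a minimal Python sketch, not part of the Maple worksheet. The names cutoff, type_one_error, and type_two_error are hypothetical, and the sketch assumes the same normal approximations as f[1] and f[2].

```python
from math import sqrt
from statistics import NormalDist

P0 = 0.30        # value of p under the null hypothesis
P_ALT = 0.24     # assumed actual value of p
Z_ALPHA = 1.645  # z_.05, as used in the rejection region


def cutoff(n):
    """Upper edge of the rejection region on the Y/n scale (same as RR(n))."""
    return P0 - Z_ALPHA * sqrt(P0 * (1 - P0) / n)


def type_one_error(n):
    """Area to the left of the cut-off under the p = .3 normal density."""
    sigma_null = sqrt(P0 * (1 - P0) / n)
    return NormalDist(P0, sigma_null).cdf(cutoff(n))


def type_two_error(n):
    """Area to the right of the cut-off under the p = .24 normal density."""
    sigma_alt = sqrt(P_ALT * (1 - P_ALT) / n)
    return 1 - NormalDist(P_ALT, sigma_alt).cdf(cutoff(n))


alpha_100 = type_one_error(100)
beta_100 = type_two_error(100)
print(f"cutoff = {cutoff(100):.4f}, alpha = {alpha_100:.4f}, beta = {beta_100:.4f}")
```

With n = 100 this gives a Type I error probability of about .05 and a Type II error probability of roughly .64.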

This is certainly unacceptably large! What can we do to decrease it?

The answer is: increase the sample size! Increasing n decreases the variances of both
distributions (so the pdf's get taller and skinnier), and moves the cut-off for the RR for the
.05-level test farther to the right, toward .3. Both effects leave less of the p = .24 density
to the right of the cut-off. For instance, the picture for n = 500 is:

> RR(500);

(9)

> Dists:=plot([f[1](500),f[2](500)],y=0..0.5,color=[blue,red]):

> RRPlot:=plot([RR(500),t,t=0..22],color=black):

> display(Dists,RRPlot);

[Plot: the two normal densities for n = 500, with the vertical line at the upper edge of the RR]

>

(10)

> evalf(Int(f[2](500),y=RR(500)..infinity));

(11)
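To see the effect of n on the cut-off and on the Type II error probability side by side, the same quantities can be tabulated for several sample sizes. This is a minimal Python sketch under the same normal approximations as the worksheet. The function names are hypothetical, and the sample sizes 200 and 1000 are illustrative additions to the two values used above.

```python
from math import sqrt
from statistics import NormalDist

P0, P_ALT, Z_ALPHA = 0.30, 0.24, 1.645  # null value, assumed actual value, z_.05


def cutoff(n):
    """Upper edge of the rejection region on the Y/n scale."""
    return P0 - Z_ALPHA * sqrt(P0 * (1 - P0) / n)


def type_two_error(n):
    """P(Y/n falls to the right of the cut-off) when p = .24."""
    sigma_alt = sqrt(P_ALT * (1 - P_ALT) / n)
    return 1 - NormalDist(P_ALT, sigma_alt).cdf(cutoff(n))


sample_sizes = [100, 200, 500, 1000]  # 200 and 1000 are illustrative only
table = [(n, cutoff(n), type_two_error(n)) for n in sample_sizes]
for n, c, beta in table:
    print(f"n = {n:5d}   cutoff = {c:.4f}   beta = {beta:.4f}")
```

The cut-off column increases toward .3 and the beta column decreases as n grows.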

So the Type II error probability has been reduced to .084 by increasing n to 500,
while $\alpha$ = .05 still. In class we showed that $\beta$ can be reduced to any given value
by taking n sufficiently large.
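One way to make "sufficiently large" concrete is to solve for n. Under the normal approximation, the Type II error probability is at most a target value when the cut-off lies at least z_beta standard deviations (of the p = .24 distribution) above .24, which rearranges to sqrt(n) >= (z_alpha*sqrt((.3)(.7)) + z_beta*sqrt((.24)(.76))) / (.3 - .24). The Python sketch below applies this inequality. It is an illustration, not the derivation given in class; required_n and type_two_error are hypothetical names, and it uses the unrounded z_.05 (about 1.6449) rather than 1.645.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def type_two_error(n, p0=0.30, p_alt=0.24, alpha=0.05):
    """Type II error probability of the lower-tail test when p = p_alt."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    cut = p0 - z_alpha * sqrt(p0 * (1 - p0) / n)
    return 1 - NormalDist(p_alt, sqrt(p_alt * (1 - p_alt) / n)).cdf(cut)


def required_n(target_beta, p0=0.30, p_alt=0.24, alpha=0.05):
    """Smallest n whose Type II error probability is at most target_beta."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - target_beta)
    root_n = (z_alpha * sqrt(p0 * (1 - p0))
              + z_beta * sqrt(p_alt * (1 - p_alt))) / (p0 - p_alt)
    return ceil(root_n ** 2)


n_needed = required_n(0.05)
print(n_needed, type_two_error(n_needed))
```

For a target of .05 this gives n = 590, somewhat more than the 500 used above, which is consistent with the value .084 found there.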