Learn fundamental concepts in data analysis and statistical inference, focusing on one and two independent samples.

A course from Johns Hopkins University

Mathematical Biostatistics Boot Camp 2

45 ratings

From the lesson

Hypothesis Testing

In this module, you'll get an introduction to hypothesis testing, a core concept in statistics. We'll cover hypothesis testing for basic one and two group settings as well as power. After you've watched the videos and tried the homework, take a stab at the quiz.

- Brian Caffo, PhD, Professor, Biostatistics

Bloomberg School of Public Health

So let's actually go through this calculation.

So, remember, beta is the type two error rate, so 1 minus beta is the power.

And let's assume n is large, so that we

can just do standard normal calculations rather than t calculations.

Okay, so here, notice our s is replaced by

sigma, the true value of the population

standard deviation.

So here we have our test statistic, which is

X-bar minus 30 over sigma over square root n.

Under the null hypothesis that mu equals 30, this is a Z statistic.

Since we're testing whether mu is larger than 30, our rejection region

is where this normalized mean is large, that is,

where it is larger than a standard normal quantile.

If we wanted an alpha level

error rate, we would grab the Z1 minus alpha standard normal quantile.

So for example, if we wanted a 5% error rate, we would pick the number

1.645, which is the 95th percentile of the standard normal distribution.
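As a quick check of the quantile just quoted, here is a minimal sketch using Python's standard-library `statistics.NormalDist`. The variable names are illustrative and do not come from the lecture.

```python
from statistics import NormalDist

alpha = 0.05  # desired type one error rate

# Critical value for the one-sided test: the (1 - alpha) quantile of N(0, 1)
z_crit = NormalDist().inv_cdf(1 - alpha)
print(round(z_crit, 3))  # 1.645
```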

Then 1 minus beta is

the probability that we reject, the probability

that the statistic is larger than the cutoff, the critical value,

under the alternative hypothesis, given that mu is in fact mu a.

So, now this statistic is no longer a Z statistic,

because we're considering the alternative hypothesis, not the null hypothesis.

It is, of course, still normally distributed if the data are Gaussian,

just with a different mean.

And if n is large, and we're applying the Central Limit

Theorem, then this is, again, not converging to a Z statistic.

So, what we need to do is convert it to a Z statistic.

So, the easiest thing to do is to add and subtract

the mean under the alternative, and we do that on this line here.

Then in the next line, we simply take the correctly normalized mean, X-bar minus

mu a, the mean under the alternative, which is what we are assuming to be true,

divided by the standard error, sigma over square root n. And now we're calculating the

probability that that is larger than Z1

minus alpha, minus this quantity over here,

mu a minus 30 over sigma over square root of n.

Now again, this standardized mean is a Z

statistic, because we're doing the calculation under the alternative.

So we want

the probability that Z is larger than this quantity over here,

and we can perform this calculation because we know Z1 minus alpha,

we're assuming we know sigma,

we of course know n, and we know 30.

So mu a is the only thing we have to plug in.

And that is the fact about power calculations:

you have to plug in the particular

value of the mean that you're interested in.
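Written out, the steps described above might look like the following. This is a reconstruction from the spoken description, not a copy of the slide; the notation $z_{1-\alpha}$ for the standard normal quantile and $\mu_a$ for the alternative mean is assumed.

```latex
\begin{align*}
1 - \beta
  &= P\left( \frac{\bar X - 30}{\sigma / \sqrt{n}} > z_{1-\alpha} \;\middle|\; \mu = \mu_a \right) \\
  &= P\left( \frac{\bar X - \mu_a + \mu_a - 30}{\sigma / \sqrt{n}} > z_{1-\alpha} \;\middle|\; \mu = \mu_a \right) \\
  &= P\left( \frac{\bar X - \mu_a}{\sigma / \sqrt{n}} > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma / \sqrt{n}} \;\middle|\; \mu = \mu_a \right) \\
  &= P\left( Z > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma / \sqrt{n}} \right)
\end{align*}
```

Here $Z$ is standard normal, so the last line can be evaluated once $\mu_a$, $\sigma$, $n$ and $\alpha$ are fixed.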

Okay, so let's actually do a specific version of this calculation.

And suppose that we want to calculate the power of detecting an increase in the

mean RDI of at least two events per hour, above our null hypothesis of 30.

So we want to calculate what the

power is if the alternative mean, the population

mean Respiratory Disturbance Index, is 32,

when our null hypothesis is that it's 30, and

we'd like to calculate the power of detecting that.

Now, again, this is under the assumption that the type one error rate is 5%.

So, again, assume normality and that the sample in question

will have a standard deviation of four events per hour.

What will be the power if we took a sample of size 16?

Okay, so here our Z1 minus alpha is 1.645, and our

mu a minus 30 over 4 over square root of 16 works out to be 2.

So, we want the probability that a standard normal is bigger than

1.645 minus 2, that is, the probability that a standard normal is bigger

than negative 0.355, which is 64%. So, under this

set of assumptions, the probability of detecting an alternative of

two events per hour above the hypothesized value is 64%.
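The arithmetic in this example can be reproduced with a short sketch. It assumes the one-sided Z test described above and uses only Python's standard library; the function name `one_sided_power` and its parameter names are hypothetical, chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_power(mu_0, mu_a, sigma, n, alpha=0.05):
    """Power of the one-sided Z test of H0: mu = mu_0 versus Ha: mu > mu_0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)     # z_{1 - alpha}, about 1.645 for alpha = 0.05
    shift = (mu_a - mu_0) / (sigma / sqrt(n))  # (mu_a - 30) / (sigma / sqrt(n))
    # P(Z > z_crit - shift) for a standard normal Z
    return 1 - std_normal.cdf(z_crit - shift)

# RDI example: null mean 30, alternative mean 32, sigma 4, n = 16
power = one_sided_power(mu_0=30, mu_a=32, sigma=4, n=16)
print(f"{power:.2f}")  # 0.64
```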

And this is, of course, a lower bound: the power only

gets larger as the alternative moves away from 30 events per hour.

This makes sense, of course, right?

Because the bigger the difference is from

the null, the easier it should be to detect, right?
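To see the power grow as the alternative moves away from 30, one could tabulate it for a few values of mu a. This is a sketch under the same assumptions as the example (sigma of 4, n of 16, alpha of 5%); the helper name and the list of alternatives are illustrative choices, not from the lecture.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_power(mu_a, mu_0=30, sigma=4, n=16, alpha=0.05):
    """Power of the one-sided Z test of H0: mu = mu_0 versus Ha: mu > mu_0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    shift = (mu_a - mu_0) / (sigma / sqrt(n))
    return 1 - std_normal.cdf(z_crit - shift)

# Alternatives ranging from barely above the null to far above it
alternatives = [30.01, 31, 32, 34, 100]
powers = [one_sided_power(mu_a) for mu_a in alternatives]

for mu_a, p in zip(alternatives, powers):
    print(f"mu_a = {mu_a:>6}: power = {p:.3f}")
```

Very close to the null the power is barely above the 5% type one error rate, and it climbs toward 1 as mu a increases.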

If the true population mean is 100 events per hour,

we should have a high probability of detecting that,

a higher probability than if the true mean is 30.01

events per hour, which seems like it would be very hard to detect

relative to 30,

because it's such a small change. So this power,

64%, is a lower bound for all values above 32.

So instead of calculating power given a sample

size, a variance, and a value

of the alternative, we could flip the question

around and say, imagine we have a power that we'd

like to achieve for a particular value of the alternative.

What sample size would we need to achieve it?

And this is called a sample size calculation.

Both of these calculations are typically

done at the phase of designing the study,

when you actually want

to figure out how many subjects to have in the study,

or whether or not to conduct the study if your number of subjects is constrained.

So here we do this calculation exactly, where we calculate

the sample size we would need to get 80% power.

80% is a very common benchmark in science.

You can argue whether 80% is enough power,

but it is somewhat of a benchmark in, say, clinical trials.

I would note, though, that most clinical trials do two-sided

tests, and here we're calculating power for a one-sided test.

So here we want 0.8 to be the probability that our test statistic,

which, appropriately normalized, is a Z statistic under the alternative, is

greater than the standard normal quantile Z1 minus alpha, less an extra term.

Remember, when we converted our test statistic

so that it was normalized appropriately under the alternative,

we picked up this extra term, mu a minus 30

over the standard error, sigma over square root n.

And this probability is then, of course, calculated under the alternative,

which is why we normalize with

respect to mu a and get this extra term

out here.

So if we want this probability to be 80%, then we know that the quantity here

on the right, Z1 minus alpha minus mu a minus 30 over sigma over square root n,

has to be equal to the

20th percentile of the standard normal distribution, right?

We know what Z1

minus alpha is, we know what mu a is, we obviously

know what 30 is, and we're assuming we know sigma.

We know all of those except n, and we can just solve for n.

And that is a so-called sample size calculation.
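Solving that equation for n can be sketched as follows. The lecture does not state the numbers used on the slide, so this example assumes the same RDI setting as before (null mean 30, alternative mean 32, sigma of 4, alpha of 5%); the function name `one_sided_sample_size` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def one_sided_sample_size(mu_0, mu_a, sigma, power=0.80, alpha=0.05):
    """Smallest n giving at least `power` for the one-sided Z test of mu > mu_0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # z_{1 - alpha}
    z_low = std_normal.inv_cdf(1 - power)   # 20th percentile when power = 0.80
    # Set z_crit - (mu_a - mu_0) / (sigma / sqrt(n)) equal to z_low and solve for sqrt(n)
    root_n = (z_crit - z_low) * sigma / (mu_a - mu_0)
    # Round up, since n must be a whole number of subjects
    return ceil(root_n ** 2)

n = one_sided_sample_size(mu_0=30, mu_a=32, sigma=4)
print(n)  # 25
```

Under these assumed inputs the unrounded solution is a little under 25, so rounding up gives slightly more than 80% power.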

And logic would dictate, and of course the mathematics works out this way,

that if we solve this sample size calculation for a particular value of mu a,

it's going to

be applicable for every larger value of mu

a, because the direction of the alternative is larger.

We're going to

need a smaller sample size to detect bigger effects.

So once we do this calculation for a specific value of mu a, it holds:

that sample size will give us 80% power or higher for all larger

values of mu a.

And so usually you pick mu a to

be the smallest effect

that you would reasonably want to detect.