Learn fundamental concepts in data analysis and statistical inference, focusing on one and two independent samples.


A course from Johns Hopkins University

Mathematical Biostatistics Boot Camp 2



From this lesson

Two Binomials

In this module we'll be covering some methods for looking at two binomials. This includes the odds ratio, relative risk, and risk difference. We'll discuss mostly confidence intervals in this module, and we will develop the delta method, the tool used to create these confidence intervals. After you've watched the videos and tried the homework, take a crack at the quiz!

- Brian Caffo, PhD, Professor, Biostatistics

Bloomberg School of Public Health

Okay. So in the previous slide our assumptions depended on having a large enough sample for the central limit theorem to be applicable. Instead, we can actually do an exact binomial test. By exact I mean that the calculation uses the binomial distribution itself rather than the asymptotic distribution of the normalized sample proportion.

So in this case we observed 11 people with side effects in the sample of 20, and we're testing whether the true proportion is greater than the null value. The probability of getting evidence as extreme as or more extreme than what we obtained would be the probability of getting at least as many people with side effects as we observed, which is 11. So this is the probability of X A, the count of the number of people with side effects for drug A, being greater than or equal to 11. That's the sum from 11 to 20 of the binomial probabilities.

And then you might ask, where does the null hypothesis come in? The fact that we're under the null hypothesis that p equals p nought, 0.1, appears right here: the terms are 0.1 to the x times 0.9, which is 1 minus 0.1, to the 20 minus x. In other words, this calculation of the probability of getting 11 or more people with side effects out of 20 is done under the null hypothesis that p nought is 10%.

So this is the probability of getting evidence as extreme as, or more extreme than, what we observed in favor of the alternative, with the probability calculated under the null hypothesis.

So this p-value, the probability of getting 11 or more out of 20 with a null success probability of 0.1, works out to be essentially zero; the contribution of each term is quite small. And you can do this in R very easily: pbinom(10, 20, 0.1, lower.tail = FALSE).
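The same number can be checked outside of R. Here is a minimal Python sketch (standard library only, mirroring the pbinom call above) that sums the binomial probabilities directly, using the example's numbers: 11 of 20 observed, p nought = 0.1.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Exact one-sided p-value: P(X >= 11) for n = 20 under H0: p0 = 0.1,
# the same quantity as R's pbinom(10, 20, 0.1, lower.tail = FALSE).
p_value = sum(binom_pmf(x, 20, 0.1) for x in range(11, 21))
print(p_value)  # about 7.1e-07: essentially zero, as stated in the lecture
```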

Now, I just want to point out a small detail here. If we called pbinom with lower.tail = TRUE, or omitted it because the default value is TRUE, then it would calculate the probability of 10 plus 9 plus 8 plus 7 plus 6 plus 5 plus 4 plus 3 plus 2 plus 1 plus 0, that is, the probability of 10 or fewer.

If we call pbinom with lower.tail = FALSE, it calculates the greater-than probability, and it does strictly greater than. So lower.tail = TRUE does less than or equal to, which includes 10; lower.tail = FALSE does strictly greater than, so it starts with 11.

If you run pbinom(10, 20, 0.1, lower.tail = FALSE), it's going to calculate the probability of 11 plus 12 plus 13 plus 14 plus 15 plus 16 plus 17 plus 18 plus 19 plus 20. In other words, pbinom(10, 20, 0.1, lower.tail = TRUE) plus pbinom(10, 20, 0.1, lower.tail = FALSE) adds up to 1. The 10 is only included in the instance where lower.tail is TRUE; in the instance where lower.tail is FALSE it starts above 10, at 11 and higher. It's a small point, but you get the wrong answer if you don't get it right.
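The complementarity of the two tails can be verified numerically. A Python sketch (standard library only, mirroring the two R calls) that sums each tail and checks that the 10 is counted exactly once:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p0 = 20, 0.1
# Analogue of pbinom(10, 20, 0.1, lower.tail = TRUE): P(X <= 10), includes 10.
lower = sum(binom_pmf(x, n, p0) for x in range(0, 11))
# Analogue of pbinom(10, 20, 0.1, lower.tail = FALSE): P(X > 10), starts at 11.
upper = sum(binom_pmf(x, n, p0) for x in range(11, n + 1))
print(lower + upper)  # 1.0 up to floating point: every x from 0 to 20 counted once
```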

And if you want to avoid this discussion entirely, you can just use binom.test: say we had 11 successes out of 20 trials, we want to test the hypothesis that the probability is 0.1, and we want the alternative to be greater; binom.test does it. binom.test is maybe a little nicer to use because it also gives you the exact confidence interval.

Okay. So one of the reasons this test is called exact has to do with its error rate. With the asymptotic test, the alpha we used to get the normal quantile is only an approximate error rate: if you perform a 5% level asymptotic test, the actual alpha level is not necessarily 5%, and there has been work showing that in some cases it can be substantially higher than 5%. This exact test, on the other hand, guarantees that if you pick alpha equal to 0.05, the alpha level is 5% or lower. The problem is the 'or lower': the test is exact but conservative. In some cases, with very small sample sizes, the actual level can be much less than the desired level.
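That conservativeness can be seen concretely by computing the attained level of the exact test. A Python sketch (standard library; n = 20 and p0 = 0.1 chosen here to match the running example) that finds the smallest rejection threshold whose tail probability is at most alpha, and reports the actual type I error rate:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def attained_level(n, p0, alpha=0.05):
    """Smallest threshold c with P(X >= c | p0) <= alpha, and the
    actual (attained) type I error rate at that threshold."""
    for c in range(n + 1):
        tail = sum(binom_pmf(x, n, p0) for x in range(c, n + 1))
        if tail <= alpha:
            return c, tail
    return n + 1, 0.0

c, level = attained_level(20, 0.1)
print(c, level)  # rejects for X >= 5; attained level is below the nominal 0.05
```

Because the binomial is discrete, the attained level jumps between possible tail probabilities, so it typically lands strictly below the nominal alpha rather than hitting it exactly.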

For a two-sided test, what I'm going to suggest is to calculate the two one-sided p-values (it should be obvious which one will be the smaller) and then double the smaller one. That just follows the rule we've been using for normal tests, and it's good enough; there are maybe slightly better procedures, but they change the numbers only a little.

So, given that we can do a two-sided test, either this way or maybe by a better one, we could calculate, say by a grid search, every value of p nought for which we would fail to reject the null hypothesis in our two-sided test, and that collection would yield a confidence interval. That confidence interval would have an exact coverage rate: if you inverted 5% level tests, it would have coverage of 95% or higher. All these procedures are slightly conservative, so the coverage might be 95% or maybe much higher, perhaps 97% if the sample size is very small.

This interval is given a name: it's called the Clopper-Pearson interval. Its benefit is that it guarantees your coverage rate; you get coverage of 95% or higher. The problem is that, in the event the coverage is higher, you have potentially unnecessarily widened the interval, because if you want higher coverage you are generally going to get a wider interval. There's no such thing as a free lunch.

Exact intervals fall under that category just as well as everything else: they do guarantee your error rates, but they have this tendency to be conservative. Wider intervals, and being a little less likely to reject a null hypothesis, are the consequences of using them. On the other hand, you do get the assurance that the error rate is adhered to, given your assumptions.
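The grid-search inversion just described can be sketched directly. A Python version (standard library; the grid resolution and the doubled-one-sided-p-value rule are the lecture's recipe, not a canonical implementation) for 11 successes out of 20:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def exact_interval(x, n, alpha=0.05, grid_size=2000):
    """Keep every p0 on a grid where the two-sided exact test
    (double the smaller one-sided p-value) fails to reject."""
    kept = []
    for i in range(1, grid_size):
        p0 = i / grid_size
        p_low = sum(binom_pmf(k, n, p0) for k in range(0, x + 1))   # P(X <= x)
        p_high = sum(binom_pmf(k, n, p0) for k in range(x, n + 1))  # P(X >= x)
        if min(1.0, 2 * min(p_low, p_high)) > alpha:
            kept.append(p0)
    return min(kept), max(kept)

lo, hi = exact_interval(11, 20)
print(lo, hi)  # roughly (0.32, 0.77), an exact interval for 11/20
```

This is the set of nulls the two-sided exact test fails to reject; up to the grid resolution it matches what binom.test reports as the exact confidence interval.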

I just wanted to show a picture from an American Statistician paper that I was involved in, based on earlier work by Agresti and Brent Coull. What we did is compare the coverage rate of the Wald interval versus the approximate interval obtained by inverting the score test and simply plugging in 2 rather than 1.96.

And what you see in these top lines is that they're jagged; they're jagged because of the discreteness of the binomial distribution. Let's just look at the top row, for 95%. What you can see is that the solid line, the Agresti-Coull interval, hovers right around 95%. Sometimes it's a little low, sometimes it's a little high, but it stays right around it.

The Wald interval can be quite a bit off. Now, this is a small sample size, so there's no reason to believe the asymptotics have kicked in and are doing well. But even if you get up to, say, a sample of size 20, the closer the true value of p is to zero or one, the lower the coverage can get. Here I think the plot was truncated at 70%; the coverage dips down and touches zero at p equal to zero and at p equal to one.

The point being that switching away from the Wald interval, where you put p hat into the standard error calculation, to the Agresti-Coull interval, which is a simple fix, really improves performance quite a bit, at no conceptual or computational cost. That's the point I'm trying to make here, and it relates to this discussion: the score interval performs a lot better than the Wald interval in this case, and that tends to be a general rule.
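The figure's comparison can be reproduced in miniature by computing the exact coverage of each interval at one point. A Python sketch (standard library; n = 20 and p = 0.1 chosen here for illustration, near the region where the lecture says Wald coverage sags): for each possible count x, build the interval and sum the binomial probabilities of the x whose interval covers the true p.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def coverage(n, p, interval):
    """Exact coverage: total probability of the counts whose interval covers p."""
    return sum(binom_pmf(x, n, p) for x in range(n + 1)
               if interval(x, n)[0] <= p <= interval(x, n)[1])

def wald(x, n, z=1.96):
    # Plugs p hat into the standard error, as criticized in the lecture.
    phat = x / n
    se = sqrt(phat * (1 - phat) / n)
    return phat - z * se, phat + z * se

def agresti_coull(x, n, z=2.0):
    # The simple fix: use z = 2, i.e. add 2 successes and 2 failures,
    # then apply the Wald form to the adjusted counts.
    nt = n + z * z
    pt = (x + z * z / 2) / nt
    se = sqrt(pt * (1 - pt) / nt)
    return pt - z * se, pt + z * se

cov_wald = coverage(20, 0.1, wald)
cov_ac = coverage(20, 0.1, agresti_coull)
print(cov_wald, cov_ac)  # Wald falls well below 0.95; Agresti-Coull sits near it
```

The jaggedness in the figure comes from exactly this construction: as p moves, individual counts x flip in and out of the covering set, so the coverage jumps discretely.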