Learn fundamental concepts in data analysis and statistical inference, focusing on one and two independent samples.

From the course by Johns Hopkins University

Mathematical Biostatistics Boot Camp 2

From the lesson

Two Binomials

In this module we'll be covering some methods for looking at two binomials. This includes the odds ratio, relative risk and risk difference. We'll mostly be discussing confidence intervals in this module, and we'll develop the delta method, the tool used to create these confidence intervals. After you've watched the videos and tried the homework, take a crack at the quiz!

- Brian Caffo, PhD, Professor, Biostatistics

Bloomberg School of Public Health

Okay, so now, let's actually get to comparing

two proportions rather than simply looking at one proportion.

So we want to test whether the proportion of side effects is the same in the two groups or different. Imagine that A is some new formulation and B is the standard, and you want to test whether or not the new formulation has more side effects than the standard.

In general, for two by two tables I'm going to use the following notation. I'm going to have X, n1 minus X, and n1 plus in the first row, and Y, n2 minus Y, and n2 plus in the second row. If I need to, I'll refer to the four cells by indexing them with their matrix coordinates: n11, n12, n21, n22. I'll call n1 the top right margin and n2 the bottom right margin. But in the case that I'm referring to both sets of margins, I'll say n1 plus, n2 plus, n plus 1, and n plus 2 for the respective margins, the plus notation meaning summing over that index.
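The table notation can be sketched in a few lines of Python. This is only an illustration: the counts (11 of 20 and 5 of 20) are borrowed from the side-effects example used in this lecture, and the variable names are made up for the sketch.

```python
# Illustrative counts: X events out of n1 in group 1, Y events out of n2 in group 2.
x, n1 = 11, 20
y, n2 = 5, 20

# Rows are groups; columns are (event, no event).
table = [[x, n1 - x],
         [y, n2 - y]]

# Cells indexed by their matrix coordinates: n11, n12, n21, n22.
n11, n12 = table[0]
n21, n22 = table[1]

# A "plus" in a subscript means summing over that index.
row_margins = [sum(row) for row in table]        # n1+, n2+
col_margins = [sum(col) for col in zip(*table)]  # n+1, n+2

print("row margins:", row_margins)
print("column margins:", col_margins)
```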

Okay.

So now, let's do a score-type test of the hypothesis that p1 equals p2. Our null hypothesis is H0: p1 equals p2, versus the alternative that p1 is not equal to, greater than, or less than p2.

For the score test statistic, our numerator is p1 hat minus p2 hat, the sample proportion in group 1 minus the sample proportion in group 2. If we were assuming that this difference was a constant other than 0, we would put that null hypothesis difference in the numerator here, but typically the null hypothesis is that they're equal. So there's a minus 0 here for the hypothesized null value of the difference, and we can just omit that.

And then in the denominator: under the hypothesis that p1 equals p2, the variance of p1 hat minus p2 hat is p times 1 minus p, that quantity times 1 over n1 plus 1 over n2, where p is the common proportion, p1 equal to p2. The denominator is the square root of that variance.

So, under the null hypothesis, we need an estimated version of that if we're actually going to get a number that we can compare to a normal quantile. We need a value of p to plug in there, so we plug in p hat. Under the null hypothesis the two proportions are identical: group A is a bunch of iid Bernoulli draws from group 1, group B is a bunch of iid Bernoulli draws from group 2, but they have the same proportion, so we really just have n1 plus n2 Bernoulli draws. Our estimate of the proportion would simply be the total number of events over the total number of trials, so that p hat is X plus Y over n1 plus n2. And that is exactly the maximum likelihood estimate for p, the common proportion, under the null hypothesis that the two proportions are equal.

So we plug that into the denominator, p hat times 1 minus p hat, and then we get our test statistic, which is just the estimate minus the hypothesized value, divided by the standard error. This statistic is approximately standard normally distributed under the null hypothesis for large n1 and n2.
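The statistic just described can be written as a small function. This is a minimal sketch, not code from the course: the function name and the example counts (30 of 100 versus 20 of 100) are hypothetical.

```python
import math


def score_statistic(x, n1, y, n2):
    """Score-type statistic for H0: p1 == p2.

    Numerator: difference in sample proportions (minus a null value of 0).
    Denominator: standard error computed with the pooled proportion,
    which is the estimate of the common p under the null hypothesis.
    Note: the pooled standard error is 0 if there are no events or only
    events overall, in which case this sketch raises ZeroDivisionError.
    """
    p1_hat = x / n1
    p2_hat = y / n2
    p_hat = (x + y) / (n1 + n2)  # pooled estimate under H0
    se = math.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


# Hypothetical counts: 30 of 100 events in group 1, 20 of 100 in group 2.
z = score_statistic(30, 100, 20, 100)
print(round(z, 3))
```

The result is compared with a standard normal quantile, for example 1.96 for a two sided test at the 5% level.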

So if we want to invert this to create a confidence interval, well, we don't have a closed form like we do with the score test for a single proportion. The Wald statistic also has p1 hat minus p2 hat in the numerator, but it doesn't utilize the fact that, under the null hypothesis, the proportions are equal. So in the denominator you just have separate terms: p1 hat times 1 minus p1 hat over n1, plus p2 hat times 1 minus p2 hat over n2, and you square root the whole thing. You can of course invert that to get a confidence interval: p1 hat minus p2 hat, plus or minus Z 1 minus alpha over 2 times the standard error.

By the way, do you see why you can't easily invert the score test? The reason is that its denominator was explicitly calculated under the specific null hypothesis that p1 equals p2. In the Wald statistic, if we were to have a different null, that p1 minus p2 wasn't equal to 0 but was equal to some other value, we would put that value into the numerator, and the denominator wouldn't change. Whereas in our score test there's no immediate way to adapt the denominator, and that's why you have to use some programming to get the confidence interval from that one. But the Wald test we can invert very easily, and we get an interval that should be fairly familiar to us: p1 hat minus p2 hat, plus or minus the normal quantile times the standard error. That's the so-called Wald interval. It's very easy to calculate, and it's taught in nearly every statistics textbook.
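As a rough sketch of the Wald interval just described, using only the Python standard library. The function name is made up for illustration, and the counts in the usage line are illustrative.

```python
import math
from statistics import NormalDist


def wald_interval(x, n1, y, n2, level=0.95):
    """Wald interval for p1 - p2 using unpooled standard errors."""
    p1_hat = x / n1
    p2_hat = y / n2
    # Separate variance terms for each group; no pooling under H0.
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # e.g. about 1.96 for 95%
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se


# Illustrative counts: 11 of 20 events versus 5 of 20 events.
lower, upper = wald_interval(11, 20, 5, 20)
print(round(lower, 3), round(upper, 3))
```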

This Wald interval performs poorly relative to the score interval, and the same goes for the Wald test relative to the score test. But the decrease in performance is less than in the one sample case. In the one sample case there is a huge decrease in performance, but subtracting two things tends to make them more normally distributed, so it helps a little bit, and the decrease in performance of the Wald interval isn't anywhere near as bad as it is in the single proportion case.

So for testing I would just say always use the score test; that's easy. For intervals, inverting the score test is hard and it's not in standard software, so the simple fix that we proposed in an American Statistician paper is to add one success and one failure in each group. So calculate p1 tilde, which is x plus 1 over n1 plus 2; n1 tilde, which is n1 plus 2; p2 tilde, which is y plus 1 over n2 plus 2; and n2 tilde, which is n2 plus 2. This is exactly taking the two by two table that has the successes and failures for each group and adding one to every cell. Then just treat that as if it's the data and construct a Wald interval. This interval doesn't approximate the score interval the way the Agresti-Coull interval does for a single proportion, but it does perform better than the Wald interval, and I'll have a slide in a second to show you this.
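The add-one-to-every-cell recipe can be sketched as follows. The function name is hypothetical and the counts in the usage line are illustrative; the steps are the ones stated above: compute the tilde quantities, then build an ordinary Wald interval from them.

```python
import math
from statistics import NormalDist


def agresti_caffo_interval(x, n1, y, n2, level=0.95):
    """Wald-style interval for p1 - p2 after adding one success and
    one failure to each group (one to every cell of the 2x2 table)."""
    n1_t, n2_t = n1 + 2, n2 + 2   # n1 tilde, n2 tilde
    p1_t = (x + 1) / n1_t         # p1 tilde
    p2_t = (y + 1) / n2_t         # p2 tilde
    se = math.sqrt(p1_t * (1 - p1_t) / n1_t + p2_t * (1 - p2_t) / n2_t)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p1_t - p2_t
    return diff - z * se, diff + z * se


# Illustrative counts: 11 of 20 events versus 5 of 20 events.
lower, upper = agresti_caffo_interval(11, 20, 5, 20)
print(round(lower, 3), round(upper, 3))
```

Because the adjusted proportions are pulled toward 0.5, the interval is centered at the difference of the tilde proportions rather than at the raw difference.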

Okay, so let's just perform the score test of whether or not the proportion of side effects is the same for the two drugs. pA hat is 11 over 20, which is 0.55, and pB hat is 5 over 20, which is 0.25. p hat, the common proportion, is 16 over 40, that is 11 plus 5 over 20 plus 20, which is 0.4. So our test statistic is 0.55 minus 0.25, over the square root of 0.4 times 0.6 times 2 over 20. You can plug into the formula; you get about 1.94. And then we fail to reject H0 at the 5% level; in other words, you compare it with 1.96 for a two sided test. For the two sided p-value, calculate the probability that the absolute value of a standard normal is bigger than 1.94, which is the probability that a standard normal is bigger than 1.94 plus the probability that it is below negative 1.94. That's about 0.026 in either tail, so about 0.053 in total. So we fail to reject; there's our p-value. Hopefully everyone can do this calculation very easily at this point in the class.
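The arithmetic can be checked step by step with the standard library. This is a sketch with made-up variable names. Evaluating the formula as stated with 11 of 20 and 5 of 20 gives a statistic of roughly 1.94 and a two sided p-value of roughly 0.053, so the null hypothesis is not rejected at the 5% level.

```python
import math
from statistics import NormalDist

x, n1 = 11, 20  # side effects in group A
y, n2 = 5, 20   # side effects in group B

p_a = x / n1                  # 0.55
p_b = y / n2                  # 0.25
p_pool = (x + y) / (n1 + n2)  # 16 / 40 = 0.4

# Standard error under H0, using the pooled proportion.
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p_a - p_b) / se

std_normal = NormalDist()
upper_tail = 1 - std_normal.cdf(abs(z))  # P(Z > |z|)
p_two_sided = 2 * upper_tail             # both tails, by symmetry
reject_at_5_percent = abs(z) > std_normal.inv_cdf(0.975)

print(f"z = {z:.2f}, two sided p = {p_two_sided:.3f}, "
      f"reject = {reject_at_5_percent}")
```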

Okay.

So, here is the same kind of picture as before. In the previous picture I showed the coverage rate of the interval against the true value of the proportion, for a single proportion. Now there are two proportions, p1 and p2, so here the coverage probability is shown by the true values of p1 and p2. On the left I have the Wald interval. On the right I have this Agresti-Caffo interval, where you add one success and one failure to each group, that is, one to every cell in the two by two table. You can see that we get these big dips down toward 0 on the Wald interval. If either of the proportions is very low or very high, you get very bad performance, well below 0.95. This shrinkage towards 0.5 for each of the proportions improves things dramatically, and it's a very easy thing to do.

And then here's the exact same plot, just some cross sections through it of different sorts. In the top ones I have where p1 minus p2 equals particular values, and on the bottom ones I have ones where ratios of p1 and p2 are fixed. In other words, they're slices, or curves, through that two dimensional picture, and again they show in a nice easy 2D plot how the Agresti-Caffo interval performs relative to the Wald interval.
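One way to reproduce a single point of such a coverage plot is to enumerate every possible pair of outcomes and add up the probabilities of those whose interval contains the true difference. The sketch below does that exactly, with no simulation. The function names, the sample sizes and the chosen proportions are assumptions for illustration, not the settings used for the slides.

```python
import math
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # quantile for a 95% interval


def binom_pmf(k, n, p):
    """Binomial probability of k events in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def interval(x, n1, y, n2, add):
    """Wald interval when add == 0; add-one-per-cell interval when add == 1."""
    m1, m2 = n1 + 2 * add, n2 + 2 * add
    q1, q2 = (x + add) / m1, (y + add) / m2
    se = math.sqrt(q1 * (1 - q1) / m1 + q2 * (1 - q2) / m2)
    d = q1 - q2
    return d - Z * se, d + Z * se


def coverage(p1, p2, n1, n2, add):
    """Exact coverage probability of the interval at true values (p1, p2).

    Zero-width intervals (se == 0) count as covering only when they sit
    exactly on the true difference.
    """
    total = 0.0
    for x in range(n1 + 1):
        for y in range(n2 + 1):
            lo, hi = interval(x, n1, y, n2, add)
            if lo <= p1 - p2 <= hi:
                total += binom_pmf(x, n1, p1) * binom_pmf(y, n2, p2)
    return total


# Illustrative point with low proportions and n1 = n2 = 20.
print("Wald:         ", round(coverage(0.05, 0.10, 20, 20, add=0), 3))
print("Add one/cell: ", round(coverage(0.05, 0.10, 20, 20, add=1), 3))
```

Looping this over a grid of p1 and p2 values would give the surface shown in the plots; fixing p1 minus p2 or the ratio of p1 to p2 would give the cross sections.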

Okay, let's briefly go over likelihood plots and Bayesian analysis of two binomial proportions. Likelihood analysis requires the use of profile likelihoods, or some other technique to reduce the dimension, if you want to do a 1D likelihood plot. We can actually show you later on a way to use the so-called non-central hypergeometric distribution to get an exact likelihood plot for the odds ratio. But for the difference in the proportions it's a little harder; probably doing a profile likelihood would be the way to go. So let's leave that discussion for elsewhere.

So instead, let's talk about being a Bayesian. For a single binomial proportion, we talked about putting a beta prior on the probability to get a posterior. So imagine putting independent priors on p1 and p2: a beta alpha 1, beta 1 prior and a beta alpha 2, beta 2 prior, respectively. Remember how the calculation goes: the posterior is proportional to likelihood times prior. Here the likelihood is p1 to the x, times 1 minus p1 to the n1 minus x, times p2 to the y, times 1 minus p2 to the n2 minus y. And the beta prior is p1 to the alpha 1 minus 1, times 1 minus p1 to the beta 1 minus 1, times p2 to the alpha 2 minus 1, times 1 minus p2 to the beta 2 minus 1.

If we multiply all those together we get this formula right here, which shows exactly that if we have two independent binomials and we multiply them by two independent betas, we wind up with a pair of independent beta posteriors: one beta posterior for p1 and one beta posterior for p2. For p1, the alpha parameter is no longer alpha 1 but x plus alpha 1, and the beta parameter is n1 minus x plus beta 1. For p2, the alpha parameter is y plus alpha 2, and the beta parameter is n2 minus y plus beta 2.

So basically, alpha 1 and beta 1 are the beta parameters for p1 a priori; after you factor in the data, you just add the successes to alpha and the failures to beta, and the same for p2, and then you get the beta posteriors.
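The update rule, add the successes to alpha and the failures to beta, fits in one line of code. A minimal sketch with a hypothetical function name, using the side-effects counts and uniform (beta 1, 1) priors as the illustration:

```python
def beta_posterior(successes, trials, alpha, beta):
    """Posterior beta parameters after a binomial observation:
    add the successes to alpha and the failures to beta."""
    return successes + alpha, trials - successes + beta


# Group 1: 11 events in 20 trials; group 2: 5 events in 20 trials.
a1, b1 = beta_posterior(11, 20, alpha=1, beta=1)
a2, b2 = beta_posterior(5, 20, alpha=1, beta=1)

# The mean of a beta(a, b) distribution is a / (a + b).
print("p1 posterior:", (a1, b1), "mean", round(a1 / (a1 + b1), 3))
print("p2 posterior:", (a2, b2), "mean", round(a2 / (a2 + b2), 3))
```

Because the priors and the likelihoods are independent across groups, each group is updated on its own.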

And the easiest way to explore this posterior is

with Monte Carlo simulation and I'll show that right here.

So it's very simple. Here I define my x, my n1, my alpha 1, my beta 1, my y, my n2, my alpha 2 and my beta 2. I just used a uniform prior: a beta with parameters 1 and 1 is just uniform, so I put a uniform on both p1 and p2. Then I'm going to sample from the posterior. I simulate a thousand beta draws for p1, where my alpha parameter is x plus alpha 1 and my beta parameter is n1 minus x plus beta 1, and a thousand for p2, where my alpha parameter is y plus alpha 2 and my beta parameter is n2 minus y plus beta 2.

So imagine I want to look at the risk difference, here the difference in the risk of side effects. p2 minus p1 is the parameter I want. Here p1 is a bunch of posterior p1 simulations and p2 is a bunch of posterior p2 simulations. If I subtract them, R does it component by component, so I get a collection of a thousand risk differences. I could plot the density of the risk differences in the next line. I could calculate the 2.5th and the 97.5th quantiles of these simulations to get a Bayesian credible interval. I could calculate the posterior mean, and I could calculate the posterior median.
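The lecture's code is in R and is not reproduced in this transcript. The following is a rough Python translation of the steps as described, using only the standard library. The seed and the variable names are assumptions, and the density plot is left out.

```python
import random
import statistics

rng = random.Random(2024)  # fixed seed (an arbitrary choice) for reproducibility

# Data and uniform beta(1, 1) priors, as described in the lecture.
x, n1, alpha1, beta1 = 11, 20, 1, 1
y, n2, alpha2, beta2 = 5, 20, 1, 1
n_sims = 1000

# Independent posterior draws: beta(x + alpha, n - x + beta) for each group.
p1 = [rng.betavariate(x + alpha1, n1 - x + beta1) for _ in range(n_sims)]
p2 = [rng.betavariate(y + alpha2, n2 - y + beta2) for _ in range(n_sims)]

# Risk difference, computed component by component.
rd = [b - a for a, b in zip(p1, p2)]

# Equi-tail 95% credible interval: 2.5th and 97.5th percentiles.
# quantiles(n=40) returns cut points at 2.5%, 5%, ..., 97.5%.
cuts = statistics.quantiles(rd, n=40)
lower, upper = cuts[0], cuts[-1]

post_mean = statistics.fmean(rd)
post_median = statistics.median(rd)

print("95% credible interval:", round(lower, 3), round(upper, 3))
print("posterior mean:", round(post_mean, 3))
print("posterior median:", round(post_median, 3))
```

With only a thousand draws the interval endpoints carry some Monte Carlo error, so they will move slightly with the seed.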

On the next slide you see exactly this. I have some R code called twoBinomPost, which is on the GitHub repository, and which I'll also put on the course website. It puts out the mean, the median and the mode for those three, and equi-tail credible intervals. What I mean by equi-tail is that it's 2.5% in the lower tail and 2.5% in the upper tail. I think in the one sample binomial case we discussed that maybe it's better not to do equi-tail credible intervals, but in this case it's easy enough to do it that way, so why don't we just do it that way. You can go through the twoBinomPost code; it's very simple to do this.

And here what I'm showing is the posterior for the risk difference.

And this is what's nice about Bayesian intervals.

So here we're simulating p1 and p2 a posteriori.

So we're getting draws from the joint posterior distribution of p1 and p2.

Any function of p1 and p2 that you then want to investigate, it becomes very easy to do.

And so here I took the risk difference and plotted

the density.

I put some blue lines where the credible interval occurs, and the red line is at 0, so you can see that 0 does lie within the credible interval. You can also see from the posterior which points are better supported by the data for the risk difference. And even though 0 is in our credible interval, it's not a terribly well supported value a posteriori.
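The remark that any function of p1 and p2 is easy to investigate can be illustrated with the same kind of posterior draws. The sketch below summarizes the relative risk and the odds ratio instead of the risk difference. It is an illustration in Python rather than the course's twoBinomPost code; the seed, the helper name and the direction of the ratios (group 1 over group 2) are assumptions.

```python
import random
import statistics

rng = random.Random(7)  # arbitrary seed for reproducibility
n_sims = 1000

# Posterior draws under uniform priors: beta(11 + 1, 9 + 1) and beta(5 + 1, 15 + 1).
p1 = [rng.betavariate(12, 10) for _ in range(n_sims)]
p2 = [rng.betavariate(6, 16) for _ in range(n_sims)]

# Any function of (p1, p2) is just applied draw by draw.
# Draws of exactly 0 or 1 would break the ratios; for these beta
# parameters that is not expected to occur in practice.
relative_risk = [a / b for a, b in zip(p1, p2)]
odds_ratio = [(a / (1 - a)) / (b / (1 - b)) for a, b in zip(p1, p2)]


def equi_tail(draws):
    """2.5th and 97.5th percentiles of the draws."""
    cuts = statistics.quantiles(draws, n=40)
    return cuts[0], cuts[-1]


for name, draws in [("relative risk", relative_risk), ("odds ratio", odds_ratio)]:
    lo, hi = equi_tail(draws)
    print(name, "median", round(statistics.median(draws), 2),
          "95% interval", round(lo, 2), round(hi, 2))
```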

Well, that's the end of the lecture.

That was a whirlwind tour of risk difference style intervals for two binomial proportions.

I'm hoping at this point that a lot of these topics in the class will start to

come very easily to you, because we're just kind

of using the same techniques over and over again.

And I look forward to seeing you for the next lecture.
