A conceptual and interpretive public health approach to some of the most commonly used methods from basic statistics.

A course from Johns Hopkins University

Statistical Reasoning for Public Health 1: Estimation, Inference, & Interpretation

256 ratings

Johns Hopkins University

From this lesson

Module 3A: Sampling Variability and Confidence Intervals

Understanding sampling variability is the key to defining the uncertainty in any given sample-based estimate from a single study. In this module, sampling variability is explicitly defined and explored through simulations. The resulting patterns from these simulations will give rise to a mathematical result that is the underpinning of all statistical interval estimation and inference: the central limit theorem. This result will be used to create 95% confidence intervals for population means, proportions and rates from the results of a single random sample.

- John McGready, PhD, MS, Associate Scientist, Biostatistics

Bloomberg School of Public Health

Okay, let's do some practice exercises related to the material we've

covered in this set of lecture sections, in lecture six.

So first let's look at how our simulation results compare

with what the Central Limit Theorem would predict them to be.

So we've learned about the CLT.

And what it tells us about the

theoretical sampling distribution of a sample statistic.

And we showed in sections b and c, the

results of some simulations to illustrate the CLT results.

Let's check the piece about the standard

error, or variability in our statistics being a

function of the population level variation, and

the sample size each statistic is based upon.

So let's recall the samples taken on the number

of kidney or urinary DRG discharges, from the hospital population.

It has a mean of 69.2 discharges, and a standard deviation of 58.4.

What I want you to compare is

how the observed standard errors from the simulations,

based on 2,000 random samples for each of the three sample sizes, 50, 250, and 400,

compare to what is actually predicted by the CLT.

And just to recall, here are the results of

the estimated sampling distributions.

And just to remind ourselves, even though we call

it standard error, standard error just measures

the variability in a set of numbers,

where the numbers happen to be summary statistics across multiple random samples.

So what we have here, is the estimated standard

errors by simulation.

So, the estimated standard error for means based on samples of size 50, from this

population, is 8.1.

Okay.

Let's do a binary example now.

Recall the simulated samples taken on Baltimore residents, which

were from a population in which the true proportion of residents

in poverty was 22.9%. And I wanted you to compare how the

observed standard errors from the simulations, based on 1,000 random samples

for each of the sample sizes 50, 150 and 500,

compare to what is predicted by the CLT.

And here are the results from our

estimated sampling distributions, just to refresh your memory.

So for example, we took 1,000 proportions, each based on a random sample of size 50.

The observed variability in these 1,000 proportions was 0.058.

This again is the variability, or

standard deviation, of these 1,000 sample proportions.

But to distinguish it from variation in

individual values in the sample or population, we generally call this variation

in statistics the standard error. In this case,

the standard error estimated by simulation.

Okay.

So let's look at another example, weight change and diet type.

This is a data set we've looked at before.

Or rather, the results from an article that we've looked at before:

A low carbohydrate as compared with a low fat diet in severe obesity.

And this is where 132 severely obese subjects

were randomized in one of two diet groups.

And the subjects were followed for a six-month period.

So what we're going to try and do with this

is estimate the characteristics of a

sampling distribution from a single sample.

And we've done this in lecture six.

I want you to focus on the sample, these

were people who were randomized to a low-carb diet.

So this is our sample of persons randomized to the low-carb diet.

There's 64 people.

Their mean weight change, post-diet, less pre-diet, was negative 5.6 kilograms.

And the standard deviation of these 64 individual weight changes was 8.6.

So I'd like you to use

what you know the CLT tells us, and these sample results, to estimate

the characteristics of the sampling distribution for sample means of weight

change, from samples of size 64 from the low-carb

diet population. And then one more example.

The maternal infant HIV transmission that we looked

at in several places in the course thus far.

Let's focus our efforts on the placebo group

now, as opposed to the entire sample of data.

So of the 183 births to mothers in

the placebo group, 40 infants were HIV infected.

So using the CLT and

these sample results, I'd like you to

estimate the characteristics of the sampling distribution

for sample proportions of children contracting HIV within 18 months of birth,

based on samples of 183 HIV-infected pregnant women

who were not treated with AZT or any other treatment.

So this kind of gives us a measure of the underlying baseline risk of transmission,

in HIV positive mothers who were not treated for HIV.

I'll give you a minute now to turn

off the video and go back and do these exercises.

And when you resume, we'll look at my take on the solutions.

Okay, welcome back. Hope you found these exercises useful.

So let's first look at this thing about comparing the observed variation in

sampling distributions that were simulated, by taking

multiple random samples from the same population.

To what we'd expect the variability to be, given the Central Limit Theorem.

So let's recall the results from our estimated sampling distributions.

Okay, we have, for sample sizes of 50

from this population of discharges, from this hospital population.

Our observed variability in the 2,000 sample means we computed was 8.1,

8.1 discharges.

Now the Central Limit Theorem tells us that the theoretical standard error

of means

based on samples of size 50 from this population will

be the population standard deviation in individual values,

the between-hospital variation in discharge counts, divided

by the square root of our sample size.

Ordinarily we would have to estimate this from a single sample.

But in this simulation, we know how variable the observations

in the population are, because we've sampled from that population

multiple times. So I'm going to plug this in.

It was 58.4.

So if we divide that by the square root of 50, and you can check my math,

it turns out to be about 8.26 discharges. So this value, while in the same

range, is slightly higher than what we observed in these 2,000 sample means.

But remember this 8.1 is just an estimate based

on 2,000 samples.

The Central Limit Theorem is giving us the standard error

amongst means from all possible random samples of size 50.

So these two look similar in value.

So in short, what we saw in our

simulation results was pretty close to what we predict.

Or what the Central Limit Theorem tells us it should be.
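For anyone who wants to try this kind of comparison themselves, here is a minimal Python sketch. It does not use the actual hospital discharge data, which is not reproduced in this transcript. Instead it assumes a hypothetical right-skewed population (gamma-distributed values chosen only so that the mean and standard deviation land near 69 and 58), takes 2,000 random samples of size 50, and compares the standard deviation of the sample means with the population standard deviation divided by the square root of 50. All function and variable names are illustrative.

```python
import math
import random
import statistics


def simulated_standard_error(population, n, num_samples, rng):
    """Standard deviation of sample means across num_samples random samples of size n."""
    means = [
        statistics.fmean(rng.choices(population, k=n))  # sampling with replacement
        for _ in range(num_samples)
    ]
    return statistics.stdev(means)


rng = random.Random(0)

# ASSUMPTION: a made-up skewed population standing in for the discharge counts.
population = [rng.gammavariate(1.4, 49.0) for _ in range(5000)]
population_sd = statistics.pstdev(population)

observed_se = simulated_standard_error(population, n=50, num_samples=2000, rng=rng)
predicted_se = population_sd / math.sqrt(50)  # what the CLT predicts

print(f"simulated standard error: {observed_se:.2f}")
print(f"CLT-predicted standard error: {predicted_se:.2f}")
```

Sampling is done with replacement here so that the simple formula applies without any finite-population adjustment. The two printed numbers should be close but not identical, for the same reason the 8.1 in the lecture is not identical to the theoretical value.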

How about when we're dealing with samples of size 250?

Well the same logic applies.

The standard error formula is pretty much the same, except

that we replace the sample size in the denominator with our new sample size of 250.

And if we do this out, we get 58.4 divided by the

square root of 250. And this gives about 3.7.

So here again, our observed estimated standard error is slightly less

than what the Central Limit Theorem tells us it should be.

But remember, this is just an estimate of the true standard error

based on only 2,000 samples. So in all, these sync up pretty closely.

Finally, let's look at the results for samples of size 400.

And the drill is again the same here.

The Central Limit Theorem tells us that the

standard error of means based on

samples of 400 from this population should equal

the true variation across the hospitals in discharges,

divided by the square root of the sample size, the square root of 400.

Here again we have,

true variation is 58.4.

Divided by the square root of 400 is approximately equal to 2.9.

So again, our estimate underestimates what the Central Limit Theorem

tells us it should be, but this is just an estimate.

Had we taken a different 2,000 samples and computed their sample

means, we might get something closer to 2.9 or above it,

just by chance.

Because we're only looking at 2,000 such

estimates from an almost infinite set of possibilities.

But on the whole, these two things look pretty similar.

So what we're seeing is the results from our simulation track with what

we'd expect 'em to be, more or less, based on the Central Limit Theorem.
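The three theoretical values quoted for the discharge example can be checked with a few lines of Python. This is only a sketch of the arithmetic; the helper name `standard_error_of_mean` is illustrative, and the only inputs are the population standard deviation of 58.4 and the three sample sizes from the lecture.

```python
import math


def standard_error_of_mean(population_sd, n):
    """CLT standard error of a sample mean: population SD over the square root of n."""
    return population_sd / math.sqrt(n)


POPULATION_SD = 58.4  # between-hospital SD in discharge counts, from the lecture

theoretical_se = {n: standard_error_of_mean(POPULATION_SD, n) for n in (50, 250, 400)}

for n, se in theoretical_se.items():
    print(f"n = {n}: theoretical standard error = {se:.2f}")
# n = 50: theoretical standard error = 8.26
# n = 250: theoretical standard error = 3.69
# n = 400: theoretical standard error = 2.92
```

Notice that going from samples of 50 to samples of 400, an eightfold increase in sample size, shrinks the standard error by only a factor of the square root of 8, a bit less than 3.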

So this is just trying to show you that this theorem has some validity.

By comparing what we actually observed

in simulations, to what we'd expect. Let's do this again.

Let's compare what we observe to what we'd expect to

get via the Central Limit Theorem with our binary outcomes.

And this is the simulated samples taken on Baltimore residents who were from

a population in which the true proportion of residents in poverty was 22.9%.

And I ask you how the standard errors from the simulations,

based on 1,000 random samples for each of the three

sample sizes, compared to what is predicted by the CLT.

Here are the results from our estimated

sampling distributions for samples of size 50.

We took 1,000 samples of size 50 and computed 1,000 sample proportions.

And this is the observed variability in those 1,000 sample proportions.

Now let's compare that with what we expect from the CLT.

The CLT says, look, if you're sampling from a

population of binary data, and you compute a sample proportion,

its standard error should be equal to the square root of the true proportion times 1 minus the

true proportion, over the sample size of the sample that the estimate was based on.

Now,

generally we don't know the true proportion.

But in this simulation we do, so for samples of size 50,

we'd expect the standard error to be the square root of 0.229 times

1 minus 0.229 over 50.

And I'll let you work this out and verify my math.

But it comes out to be 0.059 or 5.9%.

So very close to the actual variability we observed, in our estimated

sample distribution of proportions, based on 50 observations at a time.
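Here is a small Python sketch of the same kind of simulation for a binary outcome. It is not the simulation used in the lecture; it simply assumes each sampled resident is in poverty with probability 0.229, independently, draws 1,000 samples of 50, and compares the standard deviation of the 1,000 sample proportions with the CLT formula. The names and the random seed are arbitrary.

```python
import math
import random
import statistics

TRUE_PROPORTION = 0.229  # true proportion in poverty, from the lecture
SAMPLE_SIZE = 50
NUM_SAMPLES = 1000

rng = random.Random(1)


def sample_proportion(p, n, rng):
    """Proportion of 'yes' outcomes in one random sample of n binary observations."""
    return sum(rng.random() < p for _ in range(n)) / n


proportions = [
    sample_proportion(TRUE_PROPORTION, SAMPLE_SIZE, rng) for _ in range(NUM_SAMPLES)
]

observed_se = statistics.stdev(proportions)  # standard error estimated by simulation
predicted_se = math.sqrt(TRUE_PROPORTION * (1 - TRUE_PROPORTION) / SAMPLE_SIZE)

print(f"simulated standard error: {observed_se:.3f}")
print(f"CLT-predicted standard error: {predicted_se:.3f}")
```

A different seed gives a slightly different simulated value, which is the point made earlier: the simulated standard error is itself only an estimate.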

Let's do the same thing, and see how it syncs

up for samples of 150 from this binary outcomes situation.

So we took 1,000 samples, 150 people each.

Variation in those 1,000 sample proportion estimates was 3.4%.

What would we get vis-a-vis the Central Limit Theorem?

What would it tell us?

Well, again, here we're lucky enough to know the population truths.

So this is not reality.

But the true standard error, based on the population

we're sampling from, would look like this.

And in this case, that actually equals 0.034.

So we have a perfect match up.

That won't always happen between the simulated results and what we predict. As we

saw before, they were slightly off with

the discharge example, but very close.

Here they're actually the same.

So what we saw among these 1,000 estimates is exactly

what we would have predicted from the Central Limit Theorem.

Finally, let's do this for the samples of 500 each.

So the standard error of our sample proportions based

on 500, and I'll write this out more quickly

because it's more of the same, would be the square root of 0.229 times

1 minus 0.229 over 500, which

equals approximately 0.019.

We saw 0.018.

These are probably closer than they look, because we've rounded both.

Slightly off, but pretty much the same thing.

So, again, what we observed in our simulation, syncs

up with what the Central Limit Theorem predicts for us.

So this hopefully starts to give you some evidence that the Central

Limit Theorem works, in terms of what it's telling us about the characteristics

of sampling distributions: the shapes we've seen,

where they're centered, and now what the formula gives

us for variability all sync up with what we've observed.
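The theoretical standard errors for the poverty example at all three sample sizes can be reproduced with a short Python sketch. The helper name `standard_error_of_proportion` is illustrative; the true proportion of 0.229 and the sample sizes come from the lecture.

```python
import math


def standard_error_of_proportion(p, n):
    """CLT standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


TRUE_PROPORTION = 0.229  # true proportion in poverty, from the lecture

theoretical_se = {
    n: standard_error_of_proportion(TRUE_PROPORTION, n) for n in (50, 150, 500)
}

for n, se in theoretical_se.items():
    print(f"n = {n}: theoretical standard error = {se:.3f}")
# n = 50: theoretical standard error = 0.059
# n = 150: theoretical standard error = 0.034
# n = 500: theoretical standard error = 0.019
```

Carried to four decimal places, the value for samples of 500 is about 0.0188, which shows why the rounded 0.019 and the observed 0.018 are closer than they look.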

Now let's get in the situations that we will have

in reality, where we only get to observe one sample.

We won't be able to do the simulations we've done.

We won't be able to verify or rectify the differences between

what the Central Limit Theorem would tell us, versus our simulations.

And the differences were minimal, as we saw.

We're only going to have one

chance to characterize the behind-the-scenes sampling

distribution, based on the results of a single sample of data.

So what I've asked you to do here, is look

at this sample of subjects who were given a low-carb diet.

And characterize the sampling distribution for

mean estimates from samples of size 64.

From a theoretical population of subjects on the low-carb diet.

Okay.

So what do we know?

Well, there are a couple of things we know in advance.

The CLT tells us, look, if you do this study over and over again, and kept

getting different subsets of persons from the population

under study who are put on the low-carb diet.

And you did a histogram of your sample mean weight change estimates,

it would be well approximated by a

normal distribution. It would be centered at the true mean

weight change, amongst everybody in the population being put on a low-carb diet.

Centered at that truth, which we can't directly observe.

And the variation in this, well, the standard error of means

based on 64 persons each,

would look like this.

It would be the true population variability

in weight change amongst everyone given the low-carb diet,

divided by the square root of the sample size. The sample we had was 64.

But again, we don't know this.

So what can we do?

Well, we have an estimate of this from the original sample of size 64 we got: 8.6.

So we can estimate the standard error as 8.6 divided by the square root of 64.

And it turns out

to be, about 1.1 with rounding.

So, we've fully characterized what could have happened across all random samples of

size 64 in terms of the distribution of a resulting sample mean estimate.

Using the results from the CLT and this estimate from our sample.

Pretty powerful stuff.
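The single-sample calculation for the low-carb group is just the sample standard deviation over the square root of the sample size. A minimal Python sketch, using only the numbers given in the lecture (variable names are illustrative):

```python
import math

SAMPLE_SIZE = 64    # subjects randomized to the low-carb diet
SAMPLE_MEAN = -5.6  # mean weight change in kilograms (post-diet less pre-diet)
SAMPLE_SD = 8.6     # standard deviation of the 64 individual weight changes

# The sample SD stands in for the unknown population SD in the CLT formula.
estimated_se = SAMPLE_SD / math.sqrt(SAMPLE_SIZE)

print(f"estimated standard error of the mean: {estimated_se:.3f} kg")
# estimated standard error of the mean: 1.075 kg
```

So the estimated sampling distribution is approximately normal, centered at the unknown true mean weight change, with a standard error of roughly 1.1 kilograms. The sample mean of -5.6 is one draw from that distribution.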

Let's look at one more example of characterizing the

sampling distribution using the results from a single sample.

Remember in the maternal-infant HIV transmission

study, of the 183 births to mothers

in the placebo group, those who were not treated by AZT or anything else.

40 infants were HIV infected after 18 months, or within 18 months of birth.

So I wanted you to use the CLT and these sample results

to estimate the characteristics of the sampling distributions for sample

proportions of children contracting HIV, based on samples of size 183.

So the CLT again tells us, look if you were to do this study over and over again,

and get random samples of 183 HIV positive mothers, and not treat them with anything.

And look at the proportion whose children contracted HIV.

If you were

to look at the distribution of these proportion estimates across

multiple random samples of 183 women, and do a histogram,

you know, it would take on an approximately normal shape.

Furthermore, the center of this histogram would be the truth. Some

of these estimates would be above, some would be below,

but on average they would equal the true transmission proportion.

How variable would these estimates be around that truth?

Well the true standard error, would be something we can't directly compute,

because it's a function of this true proportion, and the sample size 183.

But we can estimate the standard error, based on our sample

results by plugging in our best guess for the true proportion.

Which is our estimate

of 22%. So our estimated standard error

would be the square root of 0.22 times 1 minus 0.22, over 183,

or about 0.031,

3.1%. Alright, so onward and upward

to the next section where we'll actually take

what we've done here and go to the next

level, which means creating what's called a confidence interval

for the underlying unobservable truth we're trying to estimate.
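For reference, the placebo-group calculation from the HIV transmission example can be written out as a short Python sketch. It uses only the counts given in the lecture (40 infected infants out of 183 births); the variable names are illustrative.

```python
import math

NUM_INFECTED = 40  # HIV-infected infants in the placebo group
NUM_BIRTHS = 183   # births to mothers in the placebo group

# The sample proportion stands in for the unknown true proportion in the CLT formula.
sample_proportion = NUM_INFECTED / NUM_BIRTHS
estimated_se = math.sqrt(sample_proportion * (1 - sample_proportion) / NUM_BIRTHS)

print(f"sample proportion: {sample_proportion:.3f}")
print(f"estimated standard error: {estimated_se:.3f}")
# sample proportion: 0.219
# estimated standard error: 0.031
```

Using the rounded 22% instead of the unrounded proportion gives the same standard error to three decimal places.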