Learn fundamental concepts in data analysis and statistical inference, focusing on one and two independent samples.

A course from Johns Hopkins University

Mathematical Biostatistics Boot Camp 2

41 ratings

From this lesson

Discrete Data Settings

In this module, we'll discuss testing in discrete data settings. This includes the famous Fisher's exact test, as well as the many forms of tests for contingency table data. You'll learn the famous "observed minus expected, squared, over expected" formula, which is broadly applicable.

- Brian Caffo, PhD, Professor, Biostatistics

Bloomberg School of Public Health

Okay, so now let's switch gears and talk about goodness of fit testing.

Now, this is the one instance I'm going to talk about where the degrees of freedom for the chi-squared are not rows minus 1 times columns minus 1.

Okay. So imagine you wanted to test R's uniform random number generator. Remember, a uniform is a continuous number between zero and one where, in essence, every value is equally likely. What that means on the continuous scale is that the density is a brick that starts at zero, ends at one, and has height one.

So, R simulates uniform numbers. Why don't we test it out, using the chi-squared distribution, to see whether or not we're getting what we would expect?

So, I simulated 1,000 random uniforms. How many would I expect to see between 0 and 0.25, between 0.25 and 0.5, between 0.5 and 0.75, and between 0.75 and 1? Well, because it's uniform, about 25% of them should fall between 0 and 0.25, about 25% between 0.25 and 0.5, and so on.

Okay. So we simulate these thousand and observe counts of 254, 235, 267, and 244. We can then calculate how many we would expect to see under the true, or rather the assumed, probability density function.

So here, our null hypothesis is that the first probability is exactly 0.25, the second is exactly 0.25, the third is exactly 0.25, and the fourth is exactly 0.25. The alternative is that at least one of these probabilities is different from its hypothesized value.

Okay? So that would be our test: comparing the observed counts with the expected counts. In this case it would be a not unreasonable way to check R's random number generator.

Okay, so in the first interval we observed 254 when we actually simulated, and the expected count would be 1,000 times 0.25, which is 250. You can carry through the calculations; the expected counts are all 250 in this case, of course. The sum of observed minus expected, squared, over expected works out to be about 2.3. This has 3 degrees of freedom. The P value is 0.52.
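
The arithmetic for this example can be reproduced with a short sketch. The lecture works in R; the Python below is only an illustration using the standard library, and the helper name `chi2_sf` is an assumption introduced here (it computes the chi-squared upper-tail probability for integer degrees of freedom from a closed-form finite sum).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: a finite Poisson-type sum.
        terms = [half ** k / math.factorial(k) for k in range(df // 2)]
        return math.exp(-half) * sum(terms)
    # Odd df: an erfc term plus a finite sum with half-integer gamma values.
    terms = [half ** (k - 0.5) / math.gamma(k + 0.5)
             for k in range(1, (df - 1) // 2 + 1)]
    return math.erfc(math.sqrt(half)) + math.exp(-half) * sum(terms)


# Observed counts from the lecture for the four intervals of width 0.25.
observed = [254, 235, 267, 244]
probs = [0.25, 0.25, 0.25, 0.25]          # probabilities fixed by the null hypothesis
n = sum(observed)                          # 1,000 simulated uniforms
expected = [n * p for p in probs]          # 250 in every cell

# Sum of (observed - expected)^2 / expected over the cells.
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                     # cells minus 1, nothing is estimated
p_value = chi2_sf(stat, df)

print(f"statistic = {stat:.3f}, df = {df}, p-value = {p_value:.2f}")
```

The statistic and P value should agree with the lecture's rounded figures of about 2.3 and 0.52.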

It shouldn't surprise you that we get a large P value in this case, because we're checking whether R's uniform random number generator, which we know is probably a pretty good uniform random number generator, generates reasonably uniform numbers when we've broken the interval up into 4 pieces.

The test applies regardless of how we break up the interval. We could have made some of the pieces small, some of them large, and so on. You could think about the most appropriate way to break up the interval, but either way, the logic behind this goodness of fit testing applies.

I would note that the degrees of freedom is just the number of cells minus 1. The reason, in this case, is that we're not estimating anything; under the null hypothesis we're actually specifying the probabilities. That's what changes the degrees of freedom.
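
Written out, the goodness of fit statistic and its degrees of freedom look as follows. The symbols are notation introduced here rather than taken from the slides: $O_i$ is the observed count in cell $i$, $p_i$ the probability specified by the null hypothesis, $n$ the total count, and $k$ the number of cells.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad E_i = n\,p_i,
\qquad \text{df} = k - 1
```

In the uniform example, $k = 4$, $n = 1000$, and every $p_i = 0.25$, so each $E_i = 250$ and the test has 3 degrees of freedom.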

So let me also comment a little on testing random number generators. If you were going to do this for real, you would want to generate thousands and thousands of variables, and you'd probably want very fine intervals. The other thing is that this is only testing one aspect of the random number generation. It's only testing the uniformity, how uniform it is over the range. But there are other aspects of randomness that you would like to test, like whether you can detect that the next value is somehow related to the previous one: some sort of autocorrelation or runs behavior.
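
One simple version of the "is the next value related to the previous one" check is the lag-1 sample autocorrelation. The sketch below is an illustration only: it uses Python's `random` module as a stand-in for R's generator, and the seed and sample size are arbitrary choices made here.

```python
import random

rng = random.Random(1)                     # arbitrary seed, for reproducibility
u = [rng.random() for _ in range(10_000)]  # simulated uniforms on [0, 1)

mean = sum(u) / len(u)

# Lag-1 sample autocorrelation: covariance of consecutive values
# divided by the overall variance.
numerator = sum((u[i] - mean) * (u[i + 1] - mean) for i in range(len(u) - 1))
denominator = sum((x - mean) ** 2 for x in u)
lag1 = numerator / denominator

print(f"lag-1 autocorrelation = {lag1:.4f}")
```

For independent draws this value should sit close to zero, with a standard error of roughly one over the square root of the sample size; a value far from zero would suggest dependence between consecutive numbers.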

Those sorts of things are what people do to stress random number generators, which, by the way, are actually perfectly deterministic. There's no randomness built into the computer. It isn't picking off the clock or something like that, somehow generating randomness. It's a deterministic sequence of numbers, constructed in such a way that it's very hard to develop a statistical test that will tell you it's not exactly uniform and IID. There's a whole discipline to this. It's quite interesting.

The other thing I would say about random number generators, and I obviously can't do this in this class: if you were to ask students in a class to generate random uniforms or random integers, you could of course create a goodness of fit test to evaluate them. It's unbelievable how quickly you can determine that people aren't good at generating random numbers, in the sense of being equally distributed. Just ask a room full of people to write out digits between one and ten, and collect them up. You will be able to diagnose immediately that they were not generated uniformly and IID from the digits one to ten. It is an interesting experiment. You might want to try it sometime.

One of the easiest ways to see it is that people don't give enough runs of specific numbers. If you collect enough random digits from one to ten, there should be some instances of three consecutive 7s. But if you ask people to write out a bunch of random digits, they'll never write out three consecutive 7s, for whatever reason.
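
To get a feel for how often three consecutive 7s should appear, here is a small simulation sketch. It is not from the lecture; the seed and the number of digits are arbitrary choices, and Python's `random` module plays the role of the generator.

```python
import random

rng = random.Random(2)                     # arbitrary seed, for reproducibility
n = 100_000
digits = [rng.randint(1, 10) for _ in range(n)]   # integers from 1 to 10 inclusive

# Count starting positions where three consecutive digits are all 7.
triples = sum(
    1 for i in range(n - 2)
    if digits[i] == 7 and digits[i + 1] == 7 and digits[i + 2] == 7
)

# Under uniform IID digits each position starts such a run with probability (1/10)^3.
expected_triples = (n - 2) * (1 / 10) ** 3

print(f"observed runs of three 7s: {triples}, expected about {expected_triples:.1f}")
```

With uniform IID digits, roughly one starting position in a thousand begins a run of three 7s, so a long enough list written by a person that contains none at all is already suspicious.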

It just goes to show that our intuition about what randomness is is not very accurate. We have to have the mathematics, because our intuition is flawed. Anyway, let's move on to another example of goodness of fit testing.

Okay, here's a famous example of goodness of fit testing. In Mendel's pea plant experiment, if you do out the Punnett square, under Mendel's Law of Segregation you should get about 75% yellow and 25% green plants.

So here's the observed data: 6,022 and 2,001, for a total of 8,023. And here is the expected data. How would you get the expected data? Well, the hypothesized true probabilities, if Mendel's law were correct, would be 0.75 and 0.25. So multiplying 0.75 times 8,023 you get 6,017.25, and multiplying 0.25 times 8,023 you get 2,005.75.

So, using the observed minus expected, squared, over expected statistic, you get a very small chi-squared statistic. Continuing on to the next slide, this is a 1 degree of freedom test because it's cells minus 1: 2 minus 1, 1 degree of freedom. The P value works out to be 0.90. In other words, the observed cell counts fit the expected probabilities quite well.
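
The Mendel calculation can be checked the same way. This is a sketch in Python using only the standard library, not the lecture's own code; for 1 degree of freedom the chi-squared upper tail reduces to `erfc(sqrt(x / 2))`, which is what the last step uses. The dictionary keys are labels chosen here.

```python
import math

# Observed counts from the lecture and the hypothesized 3:1 probabilities.
observed = {"yellow": 6022, "green": 2001}
probs = {"yellow": 0.75, "green": 0.25}

n = sum(observed.values())                              # 8,023 plants in total
expected = {k: n * p for k, p in probs.items()}         # 6,017.25 and 2,005.75

# Sum of (observed - expected)^2 / expected over the two cells.
stat = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df = len(observed) - 1                                  # 2 cells minus 1

# Chi-squared upper tail for 1 degree of freedom.
p_value = math.erfc(math.sqrt(stat / 2.0))

print(f"statistic = {stat:.4f}, df = {df}, p-value = {p_value:.2f}")
```

The statistic comes out very small, around 0.015, which is why the P value lands near 0.90.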

At any rate, there's an interesting and rather famous discussion that occurred over this. The well-known statistician Fisher combined several of Mendel's tables, using the fact that the sum of independent chi-squareds is chi-squared, with the degrees of freedom adding up. He got an associated P value of almost exactly one. His accusation was that maybe the data fit Mendel's hypothesized probabilities too well, that maybe something fishy was going on. You can read about that; it's kind of an interesting little study.

Okay, so for the final slide on goodness of fit testing, let's just summarize. It tests whether or not observed counts are consistent with theoretical hypothesized probabilities. The test statistic remains the same chi-squared test statistic. It follows a chi-squared distribution, and the degrees of freedom are the number of cells minus 1. This is especially useful for testing random number generators.

This is all for discrete data. We had to discretize the uniform distribution to use the chi-squared goodness of fit test.

There's a test called the Kolmogorov-Smirnov test, an alternative that doesn't require discretization. It's often accused of having low power, but that depends on what alternatives you're searching for, and that sort of thing. It is another very well-known test for goodness of fit.
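
As a rough illustration of how the Kolmogorov-Smirnov idea avoids discretization, the sketch below computes the largest gap between the empirical distribution function of a sample and the Uniform(0, 1) distribution function. Everything here is an assumption made for illustration: the function names are hypothetical, Python's `random` module stands in for R's generator, and the P value uses the standard large-sample series approximation rather than an exact calculation.

```python
import math
import random


def ks_statistic_uniform(sample):
    """Largest gap between the empirical CDF and the Uniform(0, 1) CDF F(x) = x."""
    xs = sorted(sample)
    n = len(xs)
    # The empirical CDF jumps from i/n to (i+1)/n at xs[i]; check both sides of each jump.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


def ks_asymptotic_p_value(d, n, terms=100):
    """Large-sample approximation: 2 * sum over k >= 1 of (-1)^(k-1) * exp(-2 k^2 n d^2).

    The series is unreliable when sqrt(n) * d is very close to zero.
    """
    lam_sq = n * d * d
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam_sq)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


rng = random.Random(3)                     # arbitrary seed, for reproducibility
sample = [rng.random() for _ in range(1000)]

d = ks_statistic_uniform(sample)
p_value = ks_asymptotic_p_value(d, len(sample))

print(f"KS statistic = {d:.4f}, approximate p-value = {p_value:.2f}")
```

Unlike the chi-squared version, nothing here depends on how the interval is cut into cells; the statistic works directly with the sorted values.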

So today's lecture covered a lot of chi-squared testing in a nutshell. If you want to remember just a couple of things, make sure you remember that the sum of observed minus expected, squared, over expected is the statistic. Remember the way in which you got the expected cell counts using independence; that applies across all the settings that we considered today. Remember that the degrees of freedom for goodness of fit testing were a little bit different, and the way in which you set up a goodness of fit test was a little bit different, but it still uses the same observed minus expected, squared, over expected statistic.

Okay? We'll see you next time, for what I think is the last lecture in this module.