[MUSIC]

In the previous section,

we derived our first sampling distribution of the sample mean x bar.

Now keep in mind that the sampling distribution is simply a probability

distribution of some descriptive statistic,

in our case the sample mean x bar.

So let's relate this back to our work in week 2,

whereby we derived some simple probability distributions, i.e.,

we have the sample space of some random variable x, and

to each value we attribute its probability of occurrence.

From that point we were then in a position to work out the expectation

of that random variable.

And remember, the technique for doing so was to take a probability-weighted average,

i.e., take each value of x, multiply it by its probability of occurrence, and

then sum these across all values of the variable.

Well, for our sampling distribution of x bar, we don't deviate from this.

We have a random variable, the sample mean,

taking different values depending on which random sample is observed, and

we have the probabilities of occurrence for each.

So if we take, for example, 3.5, the smallest observed value of x

bar, multiply it by its probability of occurrence of 1 over 15, and

then proceed to do this across the entire distribution,

we will find that the expectation of x bar is equal to 6.
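As a quick sketch of this calculation in Python: the six population values below are an assumption, chosen only so that the population mean is 6 and the smallest size-2 sample mean is 3.5, matching the example; the previous section's actual figures may differ.

```python
from itertools import combinations

# Hypothetical population of six incomes (in thousands of pounds).
# These values are an assumption, chosen so the population mean is 6
# and the smallest size-2 sample mean is 3.5.
population = [3, 4, 5, 6, 7, 11]

# All 15 equally likely simple random samples of size 2
samples = list(combinations(population, 2))
sample_means = [sum(s) / 2 for s in samples]

# Probability-weighted average: each sample mean occurs with
# probability 1/15; multiply each by its probability and sum
expectation = sum(m * (1 / 15) for m in sample_means)

print(len(samples))       # 15 possible samples
print(min(sample_means))  # 3.5, the smallest observed x bar
print(expectation)        # the population mean, 6
```

The probability-weighted average of all fifteen possible sample means recovers the population mean exactly, which is the result described above.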

Now you may recall 6, in terms of thousands of pounds,

represented the population mean.

Now this is a fascinating and very useful result,

which says that the expected value of the sample mean is equal to the true

population mean, i.e., the true value of the parameter of interest.

This will always be the case when we take a simple random sample

from some wider population.

Now remember the correct interpretation of an expected value.

We thought of it as a long run average.

So in this world of sampling distributions, we should think of it as follows.

Suppose we took repeated samples of the same size from the same population,

then inevitably from one sample to another,

we would tend to get different members forming that random sample.

And hence, when we calculate a descriptive statistic,

such as the sample mean we are considering here,

we clearly see that different samples lead to different values of that sample mean.

So there is variation in these sample means which could be observed.

But what this result, that the expectation of x bar is equal to mu,

means is that, on average, our sample mean is equal to

the true population mean, i.e., the very parameter we're trying to estimate.

Of course, we should stress the on average.

This doesn't mean for

any specific observed sample, our sample mean is equal to the true value.

In the example from the previous section, that would only have occurred in

one special case: if we observed individuals A and D.

So clearly on average is different from saying we get this result

every single time.

So we know that any point estimate we get, any observed sample mean, may or

may not be equal to the truth.

On average we are right.

But, in any specific sampling situation, there is a risk of sampling error,

whereby, even though we may have a simple random sample free of selection bias,

by chance our sample doesn't fully represent the population,

and hence the characteristic of the sample, here the sample mean x bar,

deviates from the true population mean.

But nonetheless the fact that the expectation of x bar is equal to mu is

a very powerful result for us.

So now let's consider the concept of sampling distributions more generally

because, of course, in that previous example,

we had a very simplified case where the population only consisted of six members.

So you may recall, we've introduced the normal distribution.

And at the time I said it was arguably the most important distribution in statistics

for a variety of reasons, one of which was that the normal distribution

can represent many naturally occurring phenomena.

So let's imagine we are interested, let's say, in the heights of human beings.

And I think a normal distribution would adequately capture, and hence adequately

model, the true distribution of heights which we observe in the real world.

Now of course humanity has a very large population size indeed.

We don't even know exactly

how many people are alive on the face of the earth.

We have an approximation of around about 7 billion, but it's extraordinary to

think that we don't actually know how many people there are precisely.

But nonetheless, we have a very large population.

Imagine we would like to know the average height of all human beings.

Clearly, it would not be feasible to measure the heights of

every human being on earth.

I don't have the time, patience, or money to undertake that exercise.

So inevitably, we would have to take, let's say,

a random sample drawn from our population and use the characteristics of

the sample to estimate the corresponding characteristics in the population.

But we know different random samples will lead to different constituent

members of those samples, and hence, the value of,

let's say, our sample mean would vary from one sample to another.

Well, the good news is, when we are sampling from a normal distribution,

one can actually show (which means we won't actually derive it here, so

just take my word for it) that the true,

theoretical probability distribution of x bar also follows a normal distribution.

Now we know a normal distribution has two parameters.

Its mean and its variance.

So let's distinguish between X, let's say the height of human beings, which we will

model as following a normal distribution with an unknown mean, mu,

and also an unknown variance, sigma squared.

From this, let's suppose we take a random sample of size n.

So what would be the theoretical distribution of the sample mean x bar?

Well, it can be shown that x bar would also follow a normal distribution,

whose mean is mu, i.e., the same as the mean of that wider population.

And a variance of sigma squared divided by n.

So, here, the variance of this sampling distribution is sensitive to,

i.e., it depends on, the value of our sample size.

So we see that the sample size is in the denominator of the variance of x bar,

which means as the sample size gets bigger, as we get an increasingly larger

sample, the variance of x bar becomes smaller, as we might expect.

Because if you have a larger sample size, that means you have more observations,

more information, and hence one would expect that this will lead to more

precise, more accurate estimates of the population mean.
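We can check this shrinking-variance behaviour with a small simulation sketch; the values of mu, sigma, the sample size, and the number of repetitions below are all illustrative assumptions, not figures from the lecture.

```python
import random
import statistics

random.seed(1)

# Illustrative assumptions: population N(mu, sigma^2), sample size n,
# and number of repeated samples. Draw many samples of size n and
# compare the spread of the sample means with sigma^2 / n.
mu, sigma, n, reps = 6.0, 2.0, 25, 20000

means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(reps)]

print(statistics.fmean(means))      # close to mu = 6
print(statistics.pvariance(means))  # close to sigma**2 / n = 0.16
```

The simulated variance of the sample means sits near sigma squared over n, and rerunning with a larger n would push it smaller still, in line with the result above.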

So let's visualize the sampling distribution of x bar,

whereby we simply vary the value of the sample size n.

So let's consider an example:

suppose the true population has a normal distribution with a mean, mu,

equal to 5, and a variance, sigma squared, equal to 1.

So as we vary the sample size n,

what would the sampling distribution of x bar look like?
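A numerical sketch of the answer: with mu equal to 5 and sigma squared equal to 1, x bar follows N(5, 1/n), so the density peak rises and the probability of landing near mu grows with n. The particular n values and the 0.2 tolerance below are our own illustrative choices.

```python
import math

# Population: X ~ N(mu = 5, sigma^2 = 1), so x bar ~ N(5, 1/n).
mu, sigma_sq = 5.0, 1.0

def xbar_peak(n):
    """Height of the density of x bar at its centre, mu."""
    return 1 / math.sqrt(2 * math.pi * sigma_sq / n)

def prob_within(eps, n):
    """P(|x bar - mu| < eps), via the standard normal cdf."""
    z = eps / math.sqrt(sigma_sq / n)
    return math.erf(z / math.sqrt(2))

# As n grows, the peak rises and more probability falls near mu
for n in [1, 5, 25, 100]:
    print(n, round(xbar_peak(n), 3), round(prob_within(0.2, n), 3))
```

So each larger sample size gives a taller, narrower bell curve centred on 5, which is exactly the picture the question is pointing towards.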