0:04

Now, we've talked about how to simulate random numbers from

simple probability distributions.

But the question now is, what if we want to simulate data

from a model?

So for example, a linear model.

So I've got a fairly simple linear model here.

It has a single predictor, x, and it's going to have random noise, what I

call epsilon, that has a normal distribution with standard deviation two.

The outcome is going to be generated using two

regression coefficients: an intercept, beta naught, and a slope, beta one.

And I'm going to assume that beta naught is equal to 0.5 and

beta one is equal to 2.
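
Written out, the model just described looks like this. It uses only the symbols from the lecture; calling the outcome y is consistent with how it is named a little later on.

```latex
\[
  y = \beta_0 + \beta_1 x + \varepsilon, \qquad
  \varepsilon \sim \mathcal{N}(0,\, 2^2), \qquad
  \beta_0 = 0.5,\ \beta_1 = 2
\]
```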

So the question is,

how do I simulate from this model now that I've specified what it is?

So here, first I set the seed.

It's always very important to set the seed.

So I set it to 20.

I generate x, the predictor, which has a standard normal distribution.

I generate epsilon, which is going to have

a normal distribution with mean zero and standard deviation two.

And then I'm going to add them all together,

after multiplying by the regression coefficients, to generate my y.
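
A minimal sketch of those steps in R. The seed, the coefficients, and the two distributions come from the description above; the sample size of 100 and the variable names `n`, `beta0`, `beta1`, and `e` are assumptions made for illustration.

```r
# Simulate from y = beta0 + beta1 * x + e, with e ~ Normal(0, sd = 2).
set.seed(20)                       # seed stated in the lecture

n     <- 100                       # ASSUMPTION: sample size is not stated at this point
beta0 <- 0.5                       # intercept
beta1 <- 2                         # slope

x <- rnorm(n)                      # predictor: standard normal
e <- rnorm(n, mean = 0, sd = 2)    # noise: mean zero, standard deviation two
y <- beta0 + beta1 * x + e         # outcome from the linear model

summary(y)                         # five-number summary plus the mean
plot(x, y)                         # scatterplot of the simulated data
```

Because the seed is fixed, running this again reproduces exactly the same x, e, and y.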

And so, from the summary here, you see that y has a mean of roughly 0.68.

And it ranges from about minus 6 to plus 6.

And then I can plot the data to see what they look like.

And here they are on the next slide.

1:20

So this is the plot of the x that I simulated.

And the y that I simulated from the linear model.

And you can see that they very clearly have a linear relationship that

follows the model that we specified.

1:35

So just a slight variation of the previous example.

What if, instead of x being a normal random variable,

x is a binary random variable? Maybe it represents gender, or

maybe it's some treatment versus control, or something like that.

So here, and it's very simple, I can generate binary data

using the binomial distribution and the rbinom function.

So, I set the seed again.

And I generate 100 binomial random variables, and

these come from

the binomial distribution with n equal to 1 and p equal to one half.

So the probability of a one is going to be equal to 0.5.

So I generate a hundred of those.

And then I generate my normal random variables,

my normal error term, which is going to have mean zero and standard deviation two.

And then I add them all together, which should produce my y.
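
Here is a sketch of the binary-predictor version in R. The `rbinom` call with n equal to 1 and p equal to 0.5, the 100 draws, and the error term follow the description above. The seed value is an assumption (the lecture only says the seed is set again), and the coefficients 0.5 and 2 are assumed to carry over from the previous example.

```r
# Same linear model, but with a binary (0/1) predictor.
set.seed(10)                          # ASSUMPTION: seed value is not stated in the lecture

n <- 100
x <- rbinom(n, size = 1, prob = 0.5)  # 100 draws, each 0 or 1 with probability 0.5
e <- rnorm(n, mean = 0, sd = 2)       # normal error term, mean zero, sd two
y <- 0.5 + 2 * x + e                  # ASSUMPTION: coefficients reused from the previous example

summary(y)
plot(x, y)                            # two vertical strips of points, at x = 0 and x = 1
```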

So now I look at the summary of y.

I see the mean is about 1.4, and the range is from about minus 3 to six or seven.

So now when I plot the data,

of course they'll look very different, because the x variable is binary.

But the y variable is still continuous, it's normal.

So here you can see that there appears to be, again, a pretty clear

linear trend going from x equal to 0 to x equal to 1.

2:50

Now suppose you want to simulate from a slightly more complicated model,

a generalized linear model, perhaps with a Poisson distribution.

And so, for example, you might want to simulate some outcome data that are

count variables, instead of continuous variables.

So we have to use a slightly more complicated approach to do

this, in particular because the error distribution is not going to be normal.

It's going to be a Poisson distribution.

And so, let's assume that the outcome y has a Poisson distribution with mean mu.

And that the log of mu follows a linear model with an intercept, beta naught, and

a slope, beta one.

So x is going to be one of our predictors.

So let's assume that beta naught is 0.5.

And beta one is 0.3.
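
In symbols, the Poisson model just described can be written as follows, using only the quantities named in the lecture.

```latex
\[
  Y \sim \mathrm{Poisson}(\mu), \qquad
  \log \mu = \beta_0 + \beta_1 x, \qquad
  \beta_0 = 0.5,\ \beta_1 = 0.3
\]
```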

So how do we simulate from this model to get our Poisson data?

So we need to use the rpois function for this.

And so we first set the seed, as always, and we generate our predictor variable, x,

which is going to have a standard normal distribution.

Then we're going to generate our linear predictor, log of mu,

which is just adding the intercept and

the slope coefficient times x.

So that's our linear predictor, the log of the mean.

But in order to get the mean for

our Poisson random variable, we need to exponentiate that.

So we simulate 100 of these Poisson random variables using the rpois function,

and we give it the exponential of our log mean.
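
A sketch of the Poisson simulation in R. The coefficients, the 100 draws, and the use of `rpois` with the exponentiated linear predictor come from the description above. The seed value and the name `log_mu` are assumptions for illustration.

```r
# Simulate counts from a Poisson model with log(mu) = 0.5 + 0.3 * x.
set.seed(1)                     # ASSUMPTION: seed value is not stated in the lecture

n <- 100
x <- rnorm(n)                   # predictor: standard normal
log_mu <- 0.5 + 0.3 * x         # linear predictor on the log scale
y <- rpois(n, exp(log_mu))      # exponentiate to get the Poisson mean for each draw

summary(y)
plot(x, y)                      # counts fall on horizontal bands at integer values
```

Note that no separate error term is added here: the randomness comes from the Poisson draw itself, unlike in the normal linear model.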

4:08

So when we summarize this,

you'll see that the mean is about 1.5 and our range is between zero and six.

When I plot these data, you'll see that they look like Poisson data, and

that there's clearly a linear relationship between x and

y: as x increases, the count for y generally gets larger.

But the data are still count variables here.