So welcome back, recruits, to lecture two of Mathematical Biostatistics Boot Camp,

and now we're going to talk about random variables.

So a random variable, as the slide says, is simply a numerical outcome of an

experiment. Random variables are just

variables like you see in calculus, but they have probability distributions

associated with them. We're only gonna talk about two kinds of

random variables, discrete or continuous random variables.

Discrete random variables are any random variables that can take just a countable

number of possibilities. So even if it's infinite, if you can

enumerate the collection of values that a random

variable can take, it's going to be discrete.

So, what I mean by "even if it's infinite": suppose you have a random variable that is the

number of people that show up at a bus stop.

Well, I suppose there is a theoretical limit to that but it might be useful

mathematically to model that as if that count can go all the way up to infinity.

But the point is that we can count them. We can count one person, two people, three

people, four people and so on. If the random variable can take any

possible value on the real line or a subset of it, such as an interval,

then we'll call the random variable continuous.

And we'll have to have a slightly different treatment of continuous versus

discrete random variables. But I think when we go through it we'll

try to draw the similarities between how they're treated.

This list of random variables, either discrete or continuous, is non-exhaustive.

You can actually have a random variable that is both discrete

and continuous. To give you an example, let's think of a

good way to generate a continuous random variable, or something at least that could

conceptually be viewed as a continuous random variable.

Suppose you were to draw a line on a piece of paper that goes

from zero to two. You label one end of the line zero, and

another end of the line two, and then you drop your pencil, and it hits a random

point in between zero and two. And you were to measure the distance

from zero to that point, how far along the line it is.

Well, maybe you can argue you can only measure to so fine a scale, but

let's forget about issues like that. Honestly you would model this as a sort of

continuous random variable, that distance you can measure to several decimal places

and so we will think of it as a continuous one.

So let's think about how you could possibly generate a random variable that's both

discrete and continuous. Well, we have our little

experiment where we can generate something that's continuous, or continuous enough to

think of it as continuous, and then we have another example,

say, just rolling a die, which generates one, two,

three, four, five, six and clearly generates a discrete random variable.

Suppose then you were to flip a coin: if the coin comes up heads, you use your

pencil to get this continuous random variable, and if the coin comes up tails,

you use this die to get your discrete random variable.
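The coin, pencil, and die experiment just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lecture does not say the coin or die is fair, or that the pencil lands uniformly on the line from zero to two, so those choices, along with the function names `pencil_drop`, `die_roll`, and `mixed_draw`, are made up here for the example.

```python
import random


def pencil_drop(rng):
    """Continuous part: a point on the line from 0 to 2 (assumed uniform)."""
    return rng.uniform(0.0, 2.0)


def die_roll(rng):
    """Discrete part: one of 1, 2, 3, 4, 5, 6 (assumed fair die)."""
    return rng.randint(1, 6)


def mixed_draw(rng):
    """Flip a coin (assumed fair): heads -> pencil drop, tails -> die roll."""
    heads = rng.random() < 0.5
    if heads:
        return ("pencil", pencil_drop(rng))
    return ("die", die_roll(rng))


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [mixed_draw(rng) for _ in range(10)]
for source, value in draws:
    print(source, value)
```

Each draw is either a whole number from the die or a decimal from the pencil, and you can't tell which kind you'll get before the coin is flipped.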

Well, the resulting random variable could possibly have been continuous, or it could

have been discrete. So that random variable, if we were to

describe its behavior, we would have to describe it as potentially being both

discrete and continuous. So we won't deal too much with random

variables like that. I would add that they're not entirely

useless. Let me give you an example of a

random variable that's kind of both discrete and continuous but is used

in practice. So imagine you're looking at

expenditures of some sort. Let's say you were an insurance company

and you were looking at how much you had to pay out, in terms of insurance.

Well for some people that never got sick you paid out zero.

It's a discrete number, exactly zero.

For everyone else, you maybe paid out a certain amount and that remainder would

probably be best modeled by a continuous random variable because you have to

account for it down to the fraction of a penny or something like that.

So in that case, if you were the insurance company and were evaluating the

distributional behavior of expenditures,

you might want to model that with a random variable that can both take the discrete

value zero and can take the continuous values for all expenditures beyond zero.
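To make the insurance example concrete, here is a minimal Python sketch of a payout that is exactly zero for some people and a positive continuous amount for everyone else. The numbers are invented for illustration: the 70% chance of a zero payout, the average positive payout of 1000, and the choice of an exponential distribution for the positive part are all assumptions, not anything stated in the lecture, and `payout` is a hypothetical helper name.

```python
import random


def payout(rng, p_zero=0.7, mean_positive=1000.0):
    """One person's payout.

    With probability p_zero (assumed) the person never got sick and the
    payout is exactly 0. Otherwise the payout is a positive continuous
    amount, drawn here from an exponential distribution (an assumption).
    """
    if rng.random() < p_zero:
        return 0.0
    return rng.expovariate(1.0 / mean_positive)


rng = random.Random(1)  # fixed seed so the run is repeatable
payouts = [payout(rng) for _ in range(10_000)]

share_zero = sum(1 for x in payouts if x == 0.0) / len(payouts)
positive = [x for x in payouts if x > 0.0]

print("share of exact zeros:", share_zero)
print("average positive payout:", sum(positive) / len(positive))
```

The point of the sketch is just that a sizable share of the values pile up on the single number zero, while the rest spread out over a continuous range.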

So, at any rate, this is a long-winded discussion

of random variables that we're not gonna consider in this class, ones that are

both discrete and continuous. But I just wanted to raise the point that

the kinds of random variables that we're gonna

describe are non-exhaustive. Let's go through some simple examples of

variables that can be thought of as random variables.

So if you flip a coin, the heads or tails, zero or one outcome of a coin flip is

clearly a random variable. If you roll a die,

the one, two, three, four, five, six outcome from rolling a die could be modelled as a

random variable. And again I should say could be modelled

as a random variable. If you say it is a random variable then

you can end up in this discussion of, well, is a coin flip really random?

Maybe if you knew exactly how much pressure the person applied to the coin,

you know. We're not gonna worry about that kind of

extremely conceptual thinking in this class.

We are going to say practically we would like to model a coin as random,

practically we would like to model a die as if it were random.

But practically we would like to model lots of other things as if they were

random too. So for example, if we have a random selection

of subjects and we take their body mass index at baseline and then take it four

years later, we might want to model that change in BMI or the BMI after some amount of

time as being a random variable. Same thing with hypertension.

We might want to model their hypertension status whether they have hypertension or

not as a random variable. This latter point also reminds us of why

coin flipping is very important. Coin flipping seems like a trivial random

variable but it forms the basis for a lot of analysis.

We think of a lot of things as if they were coin flips, so we might model, for

example, the prevalence of hypertension. We might think of the data going into that

modelling as if they were a bunch of coin flips, and the idea of a coin flip will help

conceptualize the model that we are formulating.
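As a rough illustration of treating hypertension status as a bunch of coin flips, the Python sketch below gives each subject a one (has hypertension) or a zero (does not), as if flipping a biased coin. The 30% prevalence, the sample size of 5000, and the name `simulate_hypertension` are assumptions picked for the example, not values from the lecture.

```python
import random


def simulate_hypertension(n_subjects, prevalence, rng):
    """Return a list of 0/1 statuses, one biased coin flip per subject."""
    return [1 if rng.random() < prevalence else 0 for _ in range(n_subjects)]


rng = random.Random(2)     # fixed seed so the run is repeatable
assumed_prevalence = 0.3   # hypothetical value, for illustration only

statuses = simulate_hypertension(5000, assumed_prevalence, rng)

# The sample proportion of ones is the natural estimate of the prevalence.
estimate = sum(statuses) / len(statuses)
print("estimated prevalence:", estimate)
```

With a few thousand flips the sample proportion should land close to the prevalence that was put in, which is the sense in which the coin flip model helps conceptualize what a prevalence estimate is estimating.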