Hi, and welcome back. In this video, we're going to study binomial and negative binomial random variables. Let's start with three examples. Suppose you toss a fair coin 12 times. What's the probability that you'll get exactly five heads? Second example. Suppose you pick a random sample of 25 circuit boards. You know that the long-run percentage of defective boards is five percent. What is the probability that three or more boards in your set of 25 are defective? Third example. Suppose you work for an online company. 40 percent of online purchasers of a particular book want a new copy, and 60 percent want a used copy. What is the probability that among 100 random purchasers, 50 or more used books are sold? These three situations, and many more like them, are modeled by a binomial random variable. Let's look at the key elements of each of these three situations. The first is that we start with n Bernoulli trials. In the first example n is 12, in the second example n is 25, and in the third example n is 100. Each of these trials is Bernoulli. Remember, that's a success with probability p or a failure with probability 1 minus p. The second key element is that we have the same probability of success on each trial. In the first example, p was one half. In the second example, we're actually counting the number of defectives; the defectives are the successes, and each occurs with probability 0.05. In the third example, the probability of buying a used copy is 0.6, so p is 0.6. The third key element is that we have independent Bernoulli trials. Remember, independent means the outcome of one trial does not affect the outcome of another. Let's summarize the properties of a binomial random variable: we have n trials, where n is fixed in advance; the trials are identical and result in a success or failure with probability p or 1 minus p; and the trials are independent of each other.
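The three properties above can be illustrated with a short simulation. Here's a minimal sketch in Python (the function name `binomial_draw` is my own choice, not from the lecture): we run n independent, identical Bernoulli(p) trials and count the successes.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible

def binomial_draw(n, p):
    """One binomial observation: n independent, identical Bernoulli(p)
    trials, counting how many come up as successes."""
    return sum(1 for _ in range(n) if rng.random() < p)

# Example 1 setup: 12 fair-coin tosses, count the heads.
heads = binomial_draw(12, 0.5)
```

Averaging many such draws should land near n times p, which previews the expectation formula derived below.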
The notation we're going to use is X has the distribution of a binomial random variable with parameters n, the number of trials, and p, the probability of success. Let's work on finding the probability mass function, the expectation, and the variance for a binomial random variable. Let's start with the sample space; that'll help us understand how to find the probability mass function. The sample space consists of n-tuples (x_1, x_2, ..., x_n), where x_i is 1 if there's a success on the ith trial and 0 if a failure. Notice that the cardinality of S is 2 to the n: each x_i has two possible outcomes, so we have 2 to the n elements in S. But each outcome is not equally likely, so we have to do a little more analysis to figure out the probability mass function. What does the event X equals 0 mean? It means we have n failures in a row. That probability is 1 minus p to the nth power; it's to the nth power because the trials are all independent. When X equals 1, what happens? We have one 1 and n minus 1 zeros. How many elements like that do we have? We have n of them, one for each spot the single 1 can occupy. Each has probability p for the 1, and 1 minus p to the n minus 1 for the n minus 1 zeros. What happens when X equals 2? Here we have two 1's and n minus 2 zeros. If we think about this, we have n spots in the n-tuple; we choose two places to put the 1's, and all the rest are zeros. That gives n choose 2 such outcomes, and the probability of getting those two 1's is p squared, with 1 minus p to the n minus 2 for the zeros. We can generalize this to X equals k. Each such element of the sample space has k 1's and n minus k 0's. Think about how we can count the elements like that: there are n choose k. We have n spots, we put down the k 1's, and all the rest of the places are zeros. Then we have p to the k for the k 1's, and 1 minus p to the n minus k for the zeros.
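The counting argument above translates directly into code. A minimal sketch of the pmf, using it to answer the three opening examples (the function name is mine; `math.comb` supplies the n choose k count):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): n choose k ways to place the k
    successes, times p^k for the 1's and (1-p)^(n-k) for the 0's."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Example 1: exactly five heads in 12 fair-coin tosses.
p_five_heads = binom_pmf(5, 12, 0.5)

# Example 2: three or more defectives among 25 boards (p = 0.05),
# computed via the complement P(X <= 2).
p_three_or_more = 1 - sum(binom_pmf(k, 25, 0.05) for k in range(3))

# Example 3: 50 or more used copies among 100 purchasers (p = 0.6).
p_fifty_or_more = sum(binom_pmf(k, 100, 0.6) for k in range(50, 101))
```

Summing `binom_pmf` over k from 0 to n returns 1, which is the binomial-theorem check mentioned next.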
This holds for k equals 0, 1, up to n. Observe that if we sum from k equals 0 up to n of all these probabilities, n choose k, p to the k, 1 minus p to the n minus k, we actually do get 1. That's the binomial theorem, which you may have seen in pre-calculus or calculus. What about the expected value? Recall the definition of the expected value: we sum over all possible values of k, k times the probability that X equals k. For a binomial random variable, the expected value of X is the sum from k equals 0 to n of k times n choose k, p to the k, 1 minus p to the n minus k. There's some series manipulation involved that we don't need to concern ourselves with at this point. The answer turns out to be np. I want you to recall that a Bernoulli random variable with probability p has expected value p. It turns out, and we'll see this in one of the later modules, that if we think of X_1, X_2, up to X_n as independent Bernoulli random variables, then the binomial random variable X is actually the sum from 1 to n of those X_i's. We'll also see in the future that the expected value of X is the sum of the expected values of the X_i's, and that's going to be np. You should start to see the relationship between a binomial random variable and its n independent Bernoulli trials. The expectation of the binomial random variable, as we saw here, is np; that's n times the expected value of one individual Bernoulli random variable. The same idea works for the variance. We won't go into details at this point, but to compute the variance when X is binomial with parameters n and p, we could go through the formula: the sum from k equals 0 to n of k minus the expected value of X, squared, times n choose k, p to the k, 1 minus p to the n minus k. Through a bunch of series manipulations, we get n times p times 1 minus p. I'll just point out that p times 1 minus p is the variance of a single Bernoulli random variable.
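We can check np and np(1-p) against the defining sums numerically, without doing the series manipulations by hand. A quick sketch for the circuit-board example (names are mine):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 25, 0.05  # the circuit-board example

# Expected value: sum of k * P(X = k) over k = 0, ..., n.
mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))

# Variance: sum of (k - E[X])^2 * P(X = k) over k = 0, ..., n.
var = sum((k - mean)**2 * binom_pmf(k, n, p) for k in range(n + 1))

# mean matches n*p and var matches n*p*(1-p), up to floating point.
```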
It all starts to fit together. Another random variable I want to talk about is a negative binomial random variable. Let's revisit our three examples. Suppose now you toss a fair coin until you obtain five heads. How many tails do you get before the 5th head? Suppose you randomly choose circuit boards again, but now you continue to choose circuit boards until you get three defectives. Same with book purchasing, but now: how many books do we have to sell before the 50th used book? These three situations can be modeled by what's called a negative binomial random variable. Let's look at the key elements of this situation. Here, we repeat independent Bernoulli trials until we obtain r successes. In the first example r is five, in the second example r is three, and in the third example r is 50; we repeat our Bernoulli trials until we get that many successes. Also, with a negative binomial, we count the number of failures until the rth success. The biggest difference between a binomial random variable and a negative binomial is the following: with a binomial random variable, you fix the number of Bernoulli trials n ahead of time and then count the number of successes you obtain; with a negative binomial random variable, you fix r, the number of successes you need, ahead of time and count how many failures you get before that. Everything else is the same: the trials are identical and result in a success or failure with probability p or 1 minus p, and the trials are independent of each other. The notation we're going to use is NB for negative binomial, where r is the number of successes you wait for and p is the probability of success.
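The fix-r-then-count-failures process is easy to simulate directly. A minimal sketch, with names of my choosing, assuming we just loop over Bernoulli trials until the rth success arrives:

```python
import random

rng = random.Random(7)  # fixed seed for reproducibility

def neg_binom_draw(r, p):
    """Run Bernoulli(p) trials until the r-th success; return the
    number of failures seen along the way (the NB(r, p) outcome)."""
    successes = failures = 0
    while successes < r:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return failures

# First example: tails observed before the 5th head of a fair coin.
tails = neg_binom_draw(5, 0.5)
```

Note the contrast with the binomial simulation: the loop bound here is the success count r, not a fixed number of trials n.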
Let's do a little example, and as we work through it, we'll derive the probability mass function. Suppose a physician wants to recruit five people to participate in a particular medical study. Let p equal 0.2 be the probability that a randomly selected person agrees to participate. What is the probability that 15 people must be asked before five are found who will do the study? We're going to let Y be the number of failures before the five people are found. Let's look at the sample space. We have tuples (x_1, x_2, and so on), where x_i is 1 if there's a success on the ith trial and 0 if a failure on the ith trial, and here's the important part: the sum over all of the x_i's has to be five, accounting for our five successes. If Y equals 0, that means there were no failures. That's the event of five ones in a row, with probability 0.2 to the 5th power. Now look at the probability that Y equals 1. That means there's one failure: we could have a failure and then five ones, or a one, then a failure, then four ones, and so on. Notice there can't be a failure on the 6th trial, because we're counting the number of failures until the 5th success; the sequence must end with a success. So with one failure, we can only put it in the first five spots. For the probability: we have five spots, we choose four of them for our ones, and the remaining spot is our zero. We get 0.2 to the 5th power times 0.8 to the 1st power. Think about the probability that Y equals 2. Now we have six spots for two zeros and four ones, and then in the seventh spot we've got to have a one. How many such events are in the sample space? We have six choose four: six spots, and we choose four of them for our ones; the other two are zeros. Altogether we have five successes, so that's the 0.2 to the 5th, and we have 0.8 squared for our failures. In general, consider Y equals k.
Here, we're going to have k zeros and four ones, and then the last space right here is a one. So we have k plus 4, which I'll write as k plus 5 minus 1, choose 4 for our ones. We have 0.2 to the 5th power; those are the five successes. And we have 0.8 to the k; those are the k failures. For our problem, with k equaling 15, we just put 15 in for k to calculate the requested probability. Notice there is no upper end: k can be 0, 1, 2, and so on up to infinity. Let's summarize. Y has a negative binomial distribution, where r is the number of successes we wait for and p is the probability of success. The probability mass function is: the probability that Y equals k is k plus r minus 1 choose r minus 1, choosing r minus 1 spaces for the ones, times p to the r, times 1 minus p to the k power. The expected value of Y is r times 1 minus p over p, and the variance of Y is r times 1 minus p over p squared. I want to talk a little bit about the relationship between a geometric random variable and a negative binomial random variable. Recall that a geometric random variable X with probability p repeats independent, identical Bernoulli trials until the first success. Now look at Y, negative binomial with parameters 1 and p. This counts the number of failures until the first success. If you think about that, hopefully you'll see that Y counts only the number of failures; it doesn't count that last success. So Y is the same as X minus 1. Then the expected value of Y is the expected value of X minus 1, which is 1 over p minus 1, which is 1 minus p over p. With a negative binomial with parameters r and p, what we're really doing is putting together a bunch of geometrics: we get some failures, and then a success; then some more failures, and a success; more failures and a success, up until the rth success.
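The pmf we just wrote down can be checked numerically on the physician example. A sketch (the function name is mine), which also verifies the summary formulas by truncating the infinite sum far out in the tail:

```python
from math import comb

def neg_binom_pmf(k, r, p):
    """P(Y = k) for Y ~ NB(r, p): k failures before the r-th success,
    counted as (k + r - 1 choose r - 1) * p^r * (1-p)^k."""
    return comb(k + r - 1, r - 1) * p**r * (1 - p)**k

# Physician example: r = 5 volunteers, p = 0.2, exactly k = 15 refusals.
prob = neg_binom_pmf(15, 5, 0.2)

# Sanity checks: the pmf sums to 1, and the mean matches r*(1-p)/p = 20.
total = sum(neg_binom_pmf(k, 5, 0.2) for k in range(2000))
mean = sum(k * neg_binom_pmf(k, 5, 0.2) for k in range(2000))
```

Truncating at k = 2000 is safe here because the tail probabilities decay like 0.8 to the k.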
When we add all of these geometric random variables, stacking one on top of another r times, the expected value we get is r times the 1 minus p over p. The same is true for the variance. The variance of a geometric is 1 minus p over p squared, and when we have r geometrics stacked one after another, the variance becomes r times 1 minus p over p squared. That concludes the discussion of binomial and negative binomial random variables. In the following module, we'll begin our study of continuous random variables. See you then.
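The stacking argument can be verified by simulation: draw r independent geometric failure counts, add them, and compare the sample mean and variance with r(1-p)/p and r(1-p)/p squared. A sketch, with names of my choosing:

```python
import random

rng = random.Random(42)  # fixed seed for reproducibility

def geometric_failures(p):
    """Failures before the first success in Bernoulli(p) trials."""
    count = 0
    while rng.random() >= p:
        count += 1
    return count

def neg_binom_by_stacking(r, p):
    """Stack r independent geometrics to get one NB(r, p) draw."""
    return sum(geometric_failures(p) for _ in range(r))

r, p = 3, 0.4
samples = [neg_binom_by_stacking(r, p) for _ in range(50000)]
mean = sum(samples) / len(samples)
var = sum((s - mean)**2 for s in samples) / len(samples)
# mean should be near r*(1-p)/p = 4.5, var near r*(1-p)/p**2 = 11.25
```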