[MUSIC] Let's consider the notion of expectation in the general case. Suppose we have a random variable, and let's start with a simple example. Say this is a random variable on a probability space of size 4, so there are 4 possible outcomes. The probabilities of the outcomes are denoted by p1, p2, p3, and p4, and the values of the random variable on these outcomes are a1, a2, a3, and a4.

Okay, let's repeat the random experiment many times. So we have these outcomes, we run our experiment, and, for example, the outcome was the third one. Then, in the next experiment, the outcome was the first one. Then again the third one, then the second one, then the second, the first one, the third one again, the fourth one, and so on: we can run the experiment many times, and in the end we obtain some results. Say we have repeated the experiment n times, where n is some large number.

Okay, how many outcomes do we have in each of these piles, in each of these columns? We cannot say for sure, but we can say approximately, and this approximation will hold with high probability. Since each experiment ends up in the first outcome with probability p1, and we run n experiments, we have approximately p1 times n outcomes in the first column, approximately p2 times n in the second column, and so on.

Okay, what is the average value of our variable over all of these outcomes? We have n experiments, and outcome number i happens about pi times n times. So pi times n is approximately the number of times that the value ai occurs among the results of our experiments. To compute an average, we add up all the results and divide by the number of values. We have n experiments, so we have n in the denominator. In the numerator, we have a1 taken approximately p1 times n times, a2 taken approximately p2 times n times, and so on. Adding all this up, we obtain the expression (a1·p1·n + a2·p2·n + a3·p3·n + a4·p4·n) / n, and note that n cancels out. As a result, we have the following expression: a1p1 + a2p2 + a3p3 + a4p4.
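We can check this argument with a small simulation. The sketch below uses made-up values ai and probabilities pi (they are not from the lecture, just an illustration): it repeats the experiment n times and compares the empirical average with a1p1 + a2p2 + a3p3 + a4p4.

```python
import random

# Hypothetical values and probabilities of a random variable
# on a 4-outcome probability space (numbers chosen for illustration).
a = [1.0, 3.0, 5.0, 7.0]
p = [0.1, 0.2, 0.3, 0.4]

random.seed(0)
n = 100_000  # number of repeated experiments, some large number

# Each experiment ends in outcome i with probability p[i].
results = random.choices(a, weights=p, k=n)

# Average: add up all the results and divide by the number of experiments.
average = sum(results) / n

# The expression from the lecture: a1*p1 + a2*p2 + a3*p3 + a4*p4.
exact = sum(ai * pi for ai, pi in zip(a, p))

print(average)  # close to `exact` for large n
print(exact)    # 1*0.1 + 3*0.2 + 5*0.3 + 7*0.4 = 5
```

For large n the two printed numbers agree to about two decimal places, which is exactly the "approximation with high probability" described above.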
Okay, this value is called the expectation of f, and it is denoted by Ef. Note that this value doesn't depend on n. During the whole argument we had n, the number of experiments, but in the end it cancelled out. So this number doesn't depend on n; it depends only on the random variable. And what is this number? It is an approximation to what we would expect as an average outcome if we repeated our experiment many, many times.

The same construction works in the general case. If our random variable has k values a1, ..., ak with probabilities p1, ..., pk, then to compute the expectation we multiply ai by pi for each i and add up the results from 1 to k: Ef = a1p1 + ... + akpk. This is the expectation of our random variable.

Why are expectations important? First of all, an expectation is a number, so it is a numerical characteristic of a random variable, and this is important and convenient. We can compare numbers, we can add up numbers, we can do a lot of things with expectations. And on the other hand, this is an important characteristic of a random variable: it is the average value, it reflects important properties of the random variable, and it is very useful.

To get more intuition about expectations, let's also discuss a geometric interpretation of the expectation of a random variable. Suppose we have a random variable f with four values a1, a2, a3, and a4, with probabilities p1, p2, p3, and p4, and its expectation is equal to a1p1 + a2p2 + a3p3 + a4p4. Okay, let's do the following. Consider a system of coordinates, and mark the points 0 and 1 on the horizontal axis, and consider the interval between 0 and 1. Now, recall that the probabilities p1, p2, p3, and p4 add up to 1; probabilities should always add up to 1. So we can do the following: we have an interval from 0 to 1 of length 1, and we break it into four intervals of lengths p1, p2, p3, and p4, respectively. Now, let's consider the graph of the following function.
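The general formula Ef = a1p1 + ... + akpk is short enough to write as a small helper function. This is just a sketch (the function name and sample numbers are made up for illustration):

```python
def expectation(values, probs):
    # Ef = a1*p1 + ... + ak*pk, the general formula from the lecture.
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must add up to 1"
    return sum(a * p for a, p in zip(values, probs))

# A random variable with k = 3 values:
print(expectation([1, 2, 3], [0.5, 0.25, 0.25]))  # 1*0.5 + 2*0.25 + 3*0.25 = 1.75
```

Note that n appears nowhere here: the expectation depends only on the values and their probabilities.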
On the interval of length p1, the function is equal to a1. On the interval of length p2, the function is equal to a2; here a2 is the corresponding value on the vertical axis. On the interval of length p3, the function is equal to a3, and on the interval of length p4, the function is equal to a4.

There is an intuitive correspondence between this function and our random variable f. Intuitively, let's say that we throw a random point into the interval from 0 to 1. This is not a formal explanation, just an intuition, because formally we have not defined what it means to throw a point into an interval. But intuitively, we throw a point into the interval from 0 to 1, and all positions of the point are equally likely, so we do it uniformly.

Then what is the probability of getting into the first interval? This probability is the fraction that the first interval takes of the large interval from 0 to 1. Since the interval from 0 to 1 has length 1, this fraction is just the length of the first interval, so it is p1. So with probability p1 our point gets into the first interval, with probability p2 it gets into the second interval, with probability p3 it gets into the third interval, and with probability p4 it gets into the last interval.

Now, let's look at the value of this function at this random point. On the first interval, the function is equal to a1, so with probability p1 the function is equal to a1. With probability p2, the function is equal to a2. With probability p3, the function is equal to a3. And on the last interval, with probability p4, the function is equal to a4. So the value of this function at the random point behaves in exactly the same way as our random variable f. Of course, this is just an intuition, not a formal argument, but we can think of this picture as a graph of our random variable.
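The "throw a uniform point" intuition can also be simulated. In this sketch (values, probabilities, and names are made up for illustration), we build the step function on [0, 1), throw many uniform random points, and check that the average value of the function matches a1p1 + a2p2 + a3p3 + a4p4:

```python
import random

# Hypothetical values a_i and interval lengths p_i (sum to 1).
a = [2.0, -1.0, 4.0, 0.5]
p = [0.25, 0.25, 0.3, 0.2]

# Right endpoints of the four subintervals: cumulative sums of p.
cuts = [sum(p[:i + 1]) for i in range(len(p))]

def step_function(x):
    # The function from the picture: equal to a_i on the i-th subinterval.
    for ai, c in zip(a, cuts):
        if x < c:
            return ai
    return a[-1]

random.seed(1)
n = 200_000
# Throw n uniform points into [0, 1) and average the function's values.
average = sum(step_function(random.random()) for _ in range(n)) / n
exact = sum(ai * pi for ai, pi in zip(a, p))

print(average)  # close to `exact`
print(exact)
```

A uniform point lands in the i-th subinterval with probability equal to its length pi, so the function's value at that point is distributed exactly like the random variable f, and its long-run average approaches Ef.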
Now, where is the expectation on this graph? It turns out that it has a very specific meaning here: the expectation is the area below the graph, that is, the sum of the areas of these four rectangles. And this is easy to see if you look at the definition of the expectation. What is the area of the first rectangle? Its dimensions are p1 and a1, so its area is the product a1 times p1, and this is the first summand in the expectation. For the second rectangle, its area is p2 times a2, and this is the second summand in the expectation. Proceeding in the same way, you see that the sum of the areas of all these rectangles is equal to the expectation of f. So if you think of f as the graph of this function, then the expectation is the area below the graph of this function.

Okay, we have discussed expectations, and they are actually used a lot in various fields. For example, they are everywhere in statistics and sociology. If you hear something about the average age in some country, or about life expectancy, these are actually just expectations. They are expected values: you pick a random person and look at their age. This is a random variable, and the average age is the expectation of this random variable. The same goes for average grades or average evaluations of students. [MUSIC]
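As a final numerical check of the geometric picture above, we can approximate the area below the step function with a fine Riemann sum and compare it with the expectation. The numbers here are again made up for illustration:

```python
# Hypothetical heights (values a_i) and widths (probabilities p_i).
a = [2.0, 3.0, 1.0, 5.0]
p = [0.4, 0.3, 0.2, 0.1]

# Right endpoints of the four subintervals of [0, 1].
cuts = [sum(p[:i + 1]) for i in range(len(p))]

def f(x):
    # Step function: height a_i over the i-th subinterval.
    for ai, c in zip(a, cuts):
        if x < c:
            return ai
    return a[-1]

# Midpoint Riemann sum: approximate area below the graph on [0, 1].
m = 100_000
area = sum(f((i + 0.5) / m) for i in range(m)) / m

# Sum of rectangle areas = a1*p1 + a2*p2 + a3*p3 + a4*p4 = Ef.
expectation = sum(ai * pi for ai, pi in zip(a, p))

print(area)         # approximately equal to `expectation`
print(expectation)
```

Each rectangle contributes width times height, pi times ai, so the total area reproduces the defining sum of the expectation term by term.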