Well, now that we have been able to show that the expectation of our randomized estimate for f_i is exactly f_i, we would like to bound the variance and show that our randomized estimate is actually close to f_i with high probability. One way to show this is to use Chebyshev's inequality, which is a quantitative version of the law of large numbers. Chebyshev's inequality says the following: if we have a random variable X with mean mu and variance sigma squared, then the probability that X deviates from its mean by more than t times its standard deviation sigma is at most 1/t^2. So our plan is as follows. We would like to apply Chebyshev's inequality to our randomized estimator, so we let the random variable X equal the counter times the sign of i, that is, C * s(i), where i is a fixed data item. To do that, we need to bound the variance, because we already know that the mean is f_i.

So let's move on to the variance. For that, recall the expression we have for our estimator: the counter times s(i) equals f_i plus the contributions from the other data items j distinct from i, namely f_j * s(j) * s(i). We also know that the expectation of the counter times s(i) is exactly f_i. So what we want to bound is the variance of our estimator, which is nothing but the expectation of the square of the sum over j different from i, the contribution from the other elements in the universe, of f_j * s(j) * s(i). Now it turns out that this is not too hard to compute, which we will now do. Before we apply the expectation, I want to derive a somewhat simpler form that will be easier to take the expectation of. The quantity C * s(i) - f_i, the whole thing squared, can be written as a double summation over j different from i and j' different from i of the product of the frequencies f_j and f_j' times s(j) * s(j') * s(i)^2. Now of course s(i)^2 is nothing but 1, so what we get is a double summation over pairs of items j and j' distinct from i of the product of their frequencies and the product of their signs. Now, if we take the expectation of this quantity, we'll see that the following happens. If we look at a term in this double summation with j different from j', then the expectation of the product of the signs factors into the product of the individual expectations, each of which is zero, by the same argument as in our mean analysis. So all terms with j distinct from j' are zero in expectation, while for j equal to j' the product of signs is s(j)^2, which is 1. And so the variance is exactly the sum over all data items j in the universe other than i of f_j squared.

Okay, so we have been able to construct a randomized estimate for the frequency of any fixed item i, and we have been able to bound its variance, so we would now like to conclude by Chebyshev's inequality that our estimate is good, meaning that it is close to its mean, the frequency of i, with high probability. But in order to do that, we need to check that the square root of the sum over the other data items j of their frequencies squared is actually small compared to the frequency of our item i. Now, this is not generally true. It depends on the actual stream, and on the actual item that we're trying to estimate. So let's look at some examples, because these examples will show when our estimate works well, and when it does not.
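To make the calculation above concrete, here is a minimal Python sketch (mine, not from the lecture) of the basic single-counter estimator: one signed counter C equal to the sum of f_j * s(j), with the estimate of f_i taken as C * s(i). It draws fully random signs for simplicity, whereas pairwise independence suffices for the analysis; the example stream, function name, and trial count are all illustrative. Empirically, the mean of the estimates lands near f_i and the variance near the sum of the other squared frequencies, matching the derivation.

```python
import random
from collections import Counter

def basic_estimate_stats(stream, i, trials=20_000, seed=0):
    """Simulate the single-counter estimator: C = sum_j f_j * s(j), estimate = C * s(i).

    Fully random +/-1 signs are drawn per trial for simplicity; the lecture's
    analysis only needs pairwise-independent signs.
    """
    rng = random.Random(seed)
    freq = Counter(stream)
    estimates = []
    for _ in range(trials):
        s = {j: rng.choice((-1, 1)) for j in freq}      # random sign per item
        C = sum(f * s[j] for j, f in freq.items())      # equivalent to one pass over the stream
        estimates.append(C * s[i])
    mean = sum(estimates) / trials
    var = sum((x - mean) ** 2 for x in estimates) / trials
    return mean, var

stream = [1] * 50 + [2] * 10 + [3] * 10 + [4] * 5
mean, var = basic_estimate_stats(stream, i=1)
print(mean)   # close to f_1 = 50
print(var)    # close to sum of other f_j^2 = 10^2 + 10^2 + 5^2 = 225
```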
This will inform how we use the basic estimate that we just constructed as a building block for the full algorithm in the rest of the lecture. Okay, so this is our expression for the variance. And again, think of item i as fixed; this is the item whose frequency we're trying to estimate. Now suppose for a second that our stream of data items has the frequency histogram shown on the slide. In this histogram, item number 1, which is the most frequent item in the stream, dominates the stream. Well, if that is the case, and if we're trying to estimate item 1, then our estimate should work fairly well. Indeed, our estimate works well exactly when the frequency of item i is larger than roughly the square root of the sum of squares of the other frequencies. On the other hand, if this is our stream and, instead of estimating item 1, we're trying to estimate item 6, our estimate is quite poor, because the error in estimating item 6 will include a contribution from item 1. And that item dominates the stream, so that is a problem. So, to summarize, this very simple basic estimate that we constructed works really well when we're trying to estimate elements that dominate the stream, and works quite poorly when we're trying to estimate elements that don't, especially in the presence of very large elements in the stream.

Okay, and so now our final algorithm. Our full version of ApproximatePointQuery and the final CountSketch algorithm will fix these deficiencies of our basic estimate. The way our algorithm works is as follows. In order to find the top k elements approximately, we first hash the data items in our universe into k buckets; given our data stream, this can be viewed as a process of constructing substreams. Then we run the basic estimate that we just designed on every substream obtained via this hashing process. Finally, to boost the confidence of our estimates, we repeat this procedure a logarithmic number of times. And this is our algorithm. The main intuition for why this works is exactly this: the hashing process allows us to ensure that when we're estimating a large item, most of the substreams that we estimate it from look like the one at the top of the slide. We'll show that our CountSketch algorithm estimates the large items mostly from substreams that are dominated by those large items. Now, if we want to estimate a small item, we should also be able to figure out that this item has a small frequency. The algorithm, as we will show, estimates the small items mostly from substreams that look like this one, without any of the large items contaminating the estimate. And in the next part of the lecture we will see the details of the algorithm and some of the main ideas behind the analysis.
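To make the outline above concrete, here is a compact Python sketch (mine, not the lecture's code) of the data structure it describes: d independent rows, each hashing items into k buckets with a random sign and keeping one signed counter per bucket, with the point-query estimate taken as the median of the per-row basic estimates, which is the standard way the repetitions are combined in CountSketch. For simplicity the bucket and sign hashes are simulated with seeded per-row randomness rather than the pairwise-independent hash families the analysis uses, and the class and method names are illustrative.

```python
import random
import statistics
from collections import defaultdict

class CountSketchDemo:
    """d independent rows; each row hashes items into k buckets with a random
    sign and keeps one signed counter per bucket (the basic estimate per substream)."""

    def __init__(self, k, d, seed=0):
        self.k, self.d = k, d
        self.rows = [defaultdict(int) for _ in range(d)]   # k signed counters per row
        self.rngs = [random.Random(seed + r) for r in range(d)]
        self.bucket = [dict() for _ in range(d)]           # simulated bucket hash per row
        self.sign = [dict() for _ in range(d)]             # simulated sign hash per row

    def _hash(self, r, x):
        # Lazily assign item x a random bucket and sign in row r (stands in for
        # pairwise-independent hash functions).
        if x not in self.bucket[r]:
            self.bucket[r][x] = self.rngs[r].randrange(self.k)
            self.sign[r][x] = self.rngs[r].choice((-1, 1))
        return self.bucket[r][x], self.sign[r][x]

    def update(self, x):
        # Stream update: add x's sign to its counter in every row.
        for r in range(self.d):
            b, s = self._hash(r, x)
            self.rows[r][b] += s

    def estimate(self, x):
        # ApproximatePointQuery: per-row basic estimate in x's bucket, then the median.
        ests = []
        for r in range(self.d):
            b, s = self._hash(r, x)
            ests.append(self.rows[r][b] * s)
        return statistics.median(ests)

stream = [1] * 50 + [2] * 10 + [3] * 10 + [4] * 5 + list(range(5, 25))
cs = CountSketchDemo(k=8, d=9)
for item in stream:
    cs.update(item)
print(cs.estimate(1))   # close to f_1 = 50 with high probability
```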