In the rest of the lecture, we'll put together the ideas that we developed for our basic version of the approximate point query data structure to get a full-fledged approximate point query, and finally the Count-Sketch algorithm, which gets good approximations to frequency counts with high probability and in small space. I will start by providing the intuition for the improvements that we have to make to the original version of ApproxPointQuery. Then I will define the algorithm itself, and finally give some intuition for the analysis of the parameters.

Okay, recall that our basic version of ApproxPointQuery suffered from two issues. First, the precision of our estimates depended on the variance of a certain estimator, and that variance was sometimes bad, for example when several dominant coefficients were present in the same stream. To fix this deficiency, in our final algorithm we run the basic estimator that we already designed on a subsampled, or hashed, version of the stream of data items that we receive as input. The reason this works is that, as we will show later, this operation reduces the variance of our estimates. The second problem with our basic version of ApproxPointQuery was that its success probability was fairly low: it would only succeed with high constant probability. Our final algorithm, on the other hand, has to provide good estimates for all data items appearing in the stream simultaneously. To fix this deficiency and make the algorithm succeed with high probability, we introduce the idea of repeating our estimates independently many times and taking the median of the answers they report to boost our confidence.

Okay, so the main idea of this algorithm, and the first idea that makes it work, is hashing the input stream into a number of buckets. Again, here are our data items; we associate them with integers between 1 and some large number, and for our examples we use 1 through 10. We choose a hash function that maps the data items to a number of buckets. In this particular case, on the slide you see a random mapping of data items 1 through 10 into 8 hash buckets, so b, the number of hash buckets, is equal to 8. The mapping is shown as the arrows: for example, element number 1 is mapped into bucket 3, and it happens to collide with element number 6 there. Element 10, on the other hand, resides alone in bucket 2; it doesn't collide with any other element.

Good. Now this idea of hashing lets us define the notion of hashed streams. In particular, for every item i, we will say that the hashed stream of item i contains all occurrences of i itself together with all occurrences of the items that collide with i under the hash function h. As an example, the subsampled or hashed stream of item 1 under the particular hash function shown on the slide consists of two items, item 1 itself and item 6, because they collide in bucket 3. The subsampled or hashed stream of item 5 consists of two items, 5 and 7, because 5 and 7 collide in bucket 5. It is important to note that we are hashing the universe of data items into a number of buckets, not positions in the stream. Let me give a specific example. This is the stream that we started with at the beginning of this lecture: it contains occurrences of items from 1 to 10, and here is its frequency histogram.
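Before we walk through that example, here is a minimal code sketch of the hashed-stream construction just described. This is my own illustration, not the lecture's notation: the names make_hash and hashed_stream are hypothetical, the stream is just a list of integer item ids, and the hash function is a plain random lookup table rather than the pairwise-independent family the full analysis would use.

```python
import random

def make_hash(universe, num_buckets, seed=0):
    # Toy hash function: a random lookup table from items to buckets.
    # (The actual algorithm would draw h from a pairwise-independent family.)
    rng = random.Random(seed)
    return {x: rng.randrange(num_buckets) for x in universe}

def hashed_stream(stream, i, h):
    # The hashed (subsampled) stream of item i: every occurrence in the
    # original stream of an item that lands in the same bucket as i,
    # including the occurrences of i itself.
    return [x for x in stream if h[x] == h[i]]

# Items 1..10 hashed into b = 8 buckets, as in the slide's setup.
h = make_hash(range(1, 11), num_buckets=8)
stream = [1, 6, 1, 5, 1, 7, 1, 5, 1, 1, 3, 2]  # a toy stream where item 1 dominates
print(hashed_stream(stream, 1, h))  # only occurrences of items colliding with 1 survive
```

With that picture in mind, let's go back to the stream from the slide.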
Under the hash function from the previous slide, the subsampled stream of item 1, which is the heavy hitter, the most frequent item in this stream, is shown at the bottom of the slide. It consists of all the occurrences of 1 in the stream together with all the occurrences of 6, and its frequency histogram is shown below it. The crucial observation is that this frequency histogram looks exactly like the good case for our basic estimator from the previous part of the lecture. Indeed, if we were to estimate the frequency of item 1 from this particular stream, the error would only be contributed by item number 6, which means that we would actually get a good estimate, because item number 1 dominates the subsampled or hashed stream.

Okay, but item number 1 was a heavy hitter, the most frequent item in the stream; see what happens for the other items. For example, if you look at the subsampled or hashed stream of item number 5, it consists of just two items, 5 and 7, that is, all the occurrences of these two items: 5 occurs twice and 7 occurs once. The frequency histogram is again shown at the bottom of the slide. Note that if we were to estimate item number 5 from this particular stream, we would actually get a very good estimate, because the estimate would not be contaminated by occurrences of any of the heavy hitters in the original stream. And this is the basic idea for why our new version of the algorithm will actually work well.
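To tie the two fixes together, here is a toy sketch of how hashing into buckets and taking a median over independent repetitions combine into a Count-Sketch-style estimator. This is a sketch under assumptions, not the exact algorithm from the lecture: I take the basic estimator to be the random-sign estimator (estimate the frequency of an item as its sign times the signed sum of its bucket), I use plain random tables for the hash and sign functions instead of pairwise-independent families, and the parameter values of 8 buckets and 9 repetitions are arbitrary.

```python
import random

def count_sketch_estimate(stream, query_item, universe,
                          num_buckets=8, num_repetitions=9, seed=0):
    # Toy Count-Sketch-style estimator: hash items into buckets, add random
    # +/-1 signs inside each bucket, and take the median over independent
    # repetitions. Hash and sign functions are random tables here, not the
    # pairwise-independent families used in the real analysis.
    rng = random.Random(seed)
    estimates = []
    for _ in range(num_repetitions):
        h = {x: rng.randrange(num_buckets) for x in universe}  # bucket assignment
        sign = {x: rng.choice((-1, 1)) for x in universe}      # random +/-1 signs
        buckets = [0] * num_buckets
        for x in stream:                      # one pass over the stream
            buckets[h[x]] += sign[x]
        # Basic estimator applied to the hashed stream of query_item:
        # only items that collide with query_item contribute to the error.
        estimates.append(sign[query_item] * buckets[h[query_item]])
    estimates.sort()
    return estimates[len(estimates) // 2]     # median over repetitions

stream = [1, 6, 1, 5, 1, 7, 1, 5, 1, 1]       # item 1 is the heavy hitter (frequency 6)
universe = range(1, 11)
print(count_sketch_estimate(stream, 1, universe))  # should be close to 6
print(count_sketch_estimate(stream, 5, universe))  # should be close to 2
```

A single repetition only succeeds with constant probability, but taking the median over an odd number of independent repetitions is what lets the final answer be accurate for all items simultaneously with high probability, which is exactly the second fix described above.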