[SOUND] So we talked about PageRank as a way to capture authority. Now, we also looked at some other examples where a hub might be interesting. So there is another algorithm called HITS, and that is going to compute scores for both authorities and hubs. The intuition is that pages that are widely cited are good authorities, whereas pages that cite many other pages are good hubs. I think the most interesting idea of the HITS algorithm is that it's going to use a reinforcement mechanism to help improve the scoring for hubs and authorities.
And so here's the idea: it assumes that good authorities are cited by good hubs. That means if you are cited by many pages with good hub scores, then that increases your authority score. And similarly, good hubs are those that point to good authorities. So if you point to a lot of good authority pages, then your hub score would be increased. So the two will iteratively reinforce each other: you can point to some good authorities to get a good hub score, whereas those authority scores would also be improved because they are pointed to by a good hub. And this algorithm is also general; it can have many applications in graph and network analysis.
So just briefly, here's how it works. We again first construct a matrix, but this time we're going to construct an adjacency matrix, and we're not going to normalize the values. So if there's a link there's a 1, and if there's no link there's a 0. Again, it's the same graph. And then we're going to define the hub score of a page as the sum of the authority scores of all the pages that it points to. So whether you are a hub really depends on whether you are pointing to a lot of good authority pages. That's what it says in the first equation. In the second equation, we define the authority score of a page as the sum of the hub scores of all those pages that point to you.
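Written out, the two definitions just described might look like the following. The symbols here are our own notation, not taken from the lecture slides: h(i) is the hub score of page i, a(i) is its authority score, and A is the unnormalized adjacency matrix, with A_{ij} = 1 when page i links to page j and 0 otherwise.

```latex
h(i) = \sum_{j} A_{ij}\, a(j)
\qquad\text{and}\qquad
a(i) = \sum_{j} A_{ji}\, h(j)
```

The first sum runs over the pages that i points to, and the second over the pages that point to i.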
So whether you are a good authority would depend on whether those pages that are pointing to you are good hubs. So you can see this forms an iterative reinforcement mechanism.
Now, these two equations can also be written in matrix form. So what we get here is that the hub vector is equal to the product of the adjacency matrix and the authority vector, and this is basically the first equation. And similarly, the second equation can be rewritten as the authority vector is equal to A transpose multiplied by the hub vector. Now, these are just different ways of expressing these equations. But what's interesting is that if you look at the matrix form, you can also plug the authority equation into the first one. So if you do that, you have actually eliminated the authority vector completely, and you get an equation with only hub scores. The hub score vector is equal to A multiplied by A transpose multiplied by the hub score vector again. Similarly, we can do a transformation to get an equation for just the authorities. So although we frame the problem as computing hubs and authorities, we can actually eliminate one of them to obtain an equation for just the other.
Now, the difference between this and PageRank is that now the matrix is actually the product of the adjacency matrix and its transpose. So this is different from PageRank. But mathematically, we would be solving the same kind of problem. So in HITS, we typically would initialize the values, let's say to 1 for all of them, and then we would iteratively apply these equations. And this is equivalent to multiplying by the matrix A times A transpose. So the idea of this is exactly the same as in PageRank. But here, because the adjacency matrix is not normalized, what we have to do is normalize after each iteration, and this would allow us to control the growth of the values. Otherwise they would grow larger and larger.
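The iteration just described can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the lecture's own implementation: the function names `hits` and `normalize`, the four-page `example_graph`, the fixed number of iterations, and the choice of the Euclidean (L2) norm for the normalization step are all ours, since the lecture only says that the scores are normalized after each iteration.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit Euclidean length (assumed choice of norm)."""
    norm = sqrt(sum(s * s for s in scores))
    # Leave an all-zero vector unchanged to avoid dividing by zero.
    return [s / norm for s in scores] if norm > 0 else scores


def hits(adjacency, iterations=50):
    """Return (hubs, authorities) for an unnormalized 0/1 adjacency matrix.

    adjacency[i][j] == 1 means page i links to page j.
    """
    n = len(adjacency)
    hubs = [1.0] * n          # initialize every score to 1
    authorities = [1.0] * n
    for _ in range(iterations):
        # a = A^T h: a page's authority is the sum of the hub scores
        # of the pages that point to it.
        authorities = [
            sum(adjacency[i][j] * hubs[i] for i in range(n)) for j in range(n)
        ]
        # h = A a: a page's hub score is the sum of the authority scores
        # of the pages it points to.
        hubs = [
            sum(adjacency[i][j] * authorities[j] for j in range(n))
            for i in range(n)
        ]
        # Normalize after each iteration so the values do not keep growing.
        authorities = normalize(authorities)
        hubs = normalize(hubs)
    return hubs, authorities


# Hypothetical four-page graph (not the graph from the slides):
# page 0 links to pages 1 and 2; pages 1 and 3 each link to page 2.
example_graph = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]

hubs, authorities = hits(example_graph)
print("hubs:", hubs)
print("authorities:", authorities)
```

In this toy graph, page 2 is cited by every other page and so ends up with the highest authority score, while page 0, which points to both cited pages, ends up with the highest hub score.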
And if we do that, that will basically give us the HITS algorithm, which computes the hub scores and authority scores for all the pages. And these scores can then be used in ranking, just like the PageRank scores.
So to summarize, in this lecture we have seen that link information is very useful. In particular, the anchor text is very useful for enriching the text representation of a page. And we also talked about PageRank and HITS as two major link analysis algorithms. Both can generate scores for web pages that can be used in the ranking function. Note that PageRank and HITS are also very general algorithms, so they have many applications in analyzing other graphs or networks. [MUSIC]