The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

Loading...

From the course by Stanford University

Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming

375 ratings

The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

From the lesson

Week 2

Kruskal's MST algorithm and applications to clustering; advanced union-find (optional).

- Tim RoughgardenProfessor

Computer Science

So just like in the Hopcroft Ullman analysis we're going to think about the

Â total amount of work done by our union fine data structure as constituting 2

Â pieces, work done visiting good nodes and work done visiting bad nodes.

Â The previous quiz says that we have a bound on visits to good nodes, even on a

Â per operation basis. At most inverse acronym for N, good nodes

Â are visited every single operation. What remains is to have a global bound on

Â the number of visits to bad nodes. The argument will be to show, that over

Â an arbitrary sequence of m find and union operations, the total number of visits to

Â bad nodes is bounded above by big O of n, times, alpha of n.

Â So here is the crux of the argument. Here is why, when you do a visit to a bad

Â node, the subsequent path compression massively increases the gap between that

Â object's rank and the rank of the parent. So, let's freeze the data structure at

Â the moment where we found the operation and makes a visit to a bad object,

Â call that Bad Object X. Let's think about what the world with the

Â data structure must look like in this scenario.

Â So we've got our bad object X, it may or may not have nodes pointing to

Â it. We're not going to care.

Â By virtue of it being bad and meeting the third criterion we know the rank of X is

Â at least 2. It's delta value could be anything.

Â Whatever it is, let's call it. Okay.

Â So because x is not a root, it has some parent, call that parent p.

Â And not only does x have ancestors, but because it meets the fourth criterion, we

Â actually know it has an ancestor y who has the same delta value as x.

Â That is it has an ancestor y With delta of y equal to k.

Â It is possible that p and y are the same object ir they could be different.

Â It's not going to matter in the analysis. So we're calling that the statistic delta

Â is only definied for non roots, we can conclude that y is also not the

Â root. It must then have some parent p prime.

Â P prime might be a root or it might not, we don't care.

Â [SOUND] So, now let's understand of the effect of the past compression, which is

Â going to happen subsequent to this find operation.

Â X's parent points is going to be rewired to the root of this tree.

Â The root of this tree is either at P prime, or even higher than that.

Â Given that fact let's understand how much bigger the rank of X's new parent P

Â primer higher is compared to the rank of it's old parent P.

Â So the rank of x's new parent is at least the rank of P prime.

Â So if p' is in fact the root, then the rank of x's new parent is just the rank

Â of p'. Otherwise, this new parnet is even higher

Â than P prime because its ranks only increase going up the tree, that means it

Â would only be higher than the rank of P prime.

Â How does the rank of P prime compare to its child, that of its child y? Well and

Â here is a key point, because delta of y is equal to k.

Â Remember what the definition of delta is, it quantifies the gap between the rank of

Â an object, and that of its parent. We are going to use that here for y, and

Â its parent P prime. It means the rank of P prime is so big

Â It's at least the function, a sub k applied to y's rank.

Â So our third and final inequality is the same as the first one.

Â So it could be the case that y actually is the same as p, it actually is x's old

Â parent. In that case the rank of x's old parent

Â is precisely the rank of y. Otherwise, y is even higher up X's old

Â parent P and therefore, by the monotonicity of rank, the rank of Y is

Â even bigger, than the rank of X's old parent.

Â So now in this chain of inequalities, I want you to focus on the left-most

Â quantity, and the right-most quantity. What does this say, at the end of the

Â day? It says when you apply path compression to X, it acquires a new

Â parent. And the rank of that new parent is at

Â least the A sub K function applied to the rank of it's old parent.

Â Again, path compression at the very least applies the A sub K function to the rank

Â of X's parent. So now let's suppose you visit some bad

Â object X, not just once, but the same object X, while it's bad over and over

Â and over again. This argument says that every such visit

Â to the object x while it was bad applies the function A sub k to the rank of its

Â parent. So in particular, let's use r to denote

Â the rank of this bad object x and, again, by virtue of it being bad, r has to be at

Â least 2. Imagine we do a sequence of R visits to

Â this object X while it's bad. Each of those visits will result in it

Â acquiring a new parent. And that new parents rank is much bigger

Â than the rank of the previous parent. It's at least A sub K, applied to the

Â rank of the previous parent. So, after R visits,

Â that's applying the function A sub K, R times to the rank of X's original parent

Â which forces rank at least that of X, at least R.

Â Well we have another name for applying the function A sub K, R times to R.

Â Remember this is just by definition of the acronym function A sub K plus 1.

Â Applied to r. So what does this mean? Well this means,

Â that after r visits, to a bad object x, that has rank r, the rank of x's parent

Â has to have grown, so much, that that growth is reflected in our statistic

Â delta, that measures the gap in the rank, between objects, and the objects parent.

Â So remember how we defined delta of x. It's the largest value of k so that the

Â rank of x's parent is even bigger than a sub k applied to the rank of x.

Â So what this inequality shows is that every r Parent pointer updates to x allow

Â you to take k to be even bigger than before.

Â You can bump up k by 1 and still have this inequality be satisfied.

Â That is, every r visits to a bad object of rank r, delta has to increase by at

Â least 1. But there's only so many distinct values

Â of the statistic delta of X can take on. It's always non negative, it's always an

Â integer, it can only increase, and it's never bigger than the inverse Ackermann

Â function alpha of N. So therefore, the total number of visits

Â to an object X, while it's bad, over an arbitrary sequence, can not be more than

Â R, the number of visits needed to increment delta of X,

Â times the number of distinct values, which is alpha of N, the inverse

Â function. So now we've done all the hard work.

Â We're almost there, we just need 1 final slide putting it all together.

Â All right, so to see how all of this comes together, let's first recall that

Â all that we need to do is bound the total number of visits to bad objects over this

Â entire sequence of union and find operation.

Â So in the previous slide, we showed that for a bad object x, with rank r, the

Â total number of times it's visited while it's bad, is bounded above by r times the

Â inverse Ackermann function of n. So lets sum that over all of the objects

Â to get our initial bound on the total number of visits to bad objects.

Â So now, we have to be a little careful here, because there are n objects x.

Â and ranks can be as large as log n. So a naive-bound would give us something

Â like n times LogN times alpha of n, and that's not what we want.

Â We really want n times alpha of n. So we need to use the fact that not all

Â big nodes can have big rank. And that's exactly what the rank Lemma

Â says. So to rewrite this, in a way that we can

Â apply the rank Lemma, bounding a number of objects with a given rank, lets bucket

Â the objects according to their rank. So rather than summing over the objects,

Â we're going to sum over possible ranks r, and then we're going to multiply times

Â the number of objects that have that rank R.

Â While I'm at it, let me pull the alpha of n factor, which doesn't depend on the

Â object out in front of the sum. For every value of R, the rank Lemma

Â tells us there are at most N/(2^R) objects with that rank.

Â So this factor of N is independent of the rank R, so let me pull that out in front

Â of the sum. And when the dust settles we get n times

Â the inverse acronym function of n times the sum of non-negative integer r of r

Â divided by 2 to the r. So, we've seen this kind of sum without

Â the r in the numerator, just a geometric sum 1/2^r.

Â We know that's bounded by a constant, and more generally throwing an r in the

Â numerator, well that's no match for this exponentially decreasing denominator.

Â So, this sum still evaluates to, a universal constant.

Â I'll let you check that in the privacy of your own home.

Â And so that's, give us a bound of O(n) * the inverse Ackermann function of n, on

Â the total number of visits to bad objects, since we also have a per

Â operation bound of alpha(n) on the number of visits to good nodes.

Â Combining the work of the two cases we get O(n) m+n times inverse Ackermann

Â function of n. And again, the interesting case here is

Â when m is omega of n. Otherwise, you can just do this analysis

Â separately for each 3 in the data structure.

Â And, there you have it. You now understand in full detail one of

Â the most amazing results in the history of algorithms and data structures.

Â So, Tarjans bound is unimagenably close to being linear without actually being

Â linear, it's off by this inverse Acromon function factor.

Â Now, from a practical perspective for any imagenable value of N, alpha of N is

Â almost 4, so this gives you a linear time bound for Imaginable values of n.

Â In fact, even the Hopcroft-Ullman log starbound, logs stars at most 5, for any

Â imaginable value of n. So that also is in, in essence linear

Â time bound for all practical purposes. But theorotically speaking, we have not

Â proven a linear time bound, on the amount of work done by the union find Data

Â structures. So the Pavlovian response, of any

Â algorithms researcher worth their salt, would be to ask, can we do better? And,

Â we could ask that question in a couple different senses.

Â The first sense would be; well maybe we could have a sharper analysis, of this

Â exact data structure. Maybe, union by rank, and path

Â compression is sufficient To gaurantee a linear amount of work.

Â After all, we didn't need to change the data structure, we only needed to change

Â the analysis, to sharpen the log* bound, to an alpha of n bound.

Â So this question, remarkably, Tarjan answered in the negative, in his original

Â paper. He showed that, if you use this exact

Â data structure, union by rank and path compression, then they are arbitrarily

Â large Sequences of unions and finds of arbitrarily large collections of objects,

Â so that this data structure actually performs asymptotically, M the number of

Â operations times the inverse Ackermann function, N the number of objects amount

Â of work. That is keeping the data structure fixed,

Â this analysis cannot be asymptotically improved.

Â This data structure fundamentally has Worst case performance, governed by this

Â insane inverse Ackermann function. So this is already a mind boggling fact

Â that indeed Tarjan in the conclusion of his paper, notes that it's remarkable to

Â have such a simple algorithm with such a complicated running time.

Â But you could also ask the question, could we do better perhaps by improving

Â or changing the data structure. After all, by adding path compression, we

Â got a qualitative improvement in the average operation time.

Â That dropped from log to alpha of n. Perhaps there's yet another optimization

Â out there waiting to be discovered that would drop the amount of work per

Â operation over a sequence down to constant time per operation, linear work

Â overall. Tarjan, in his paper made the bold

Â conjecture that there is no linear time method, no matter how clever you are.

Â And remarkably, that conjecture has since been proved.

Â It was proved for certain classes of data structures both in Tarjan in his original

Â paper and in And a follow-up paper in '79.

Â But the final version, of this conjecture, was proven by Friedman and

Â Sachs, back in 1989. No matter how clever you are, no matter

Â what union-find data structure you come up with, there will be arbitrarily long

Â sequences of operations, so that the average amount of work you do per

Â operation Is, the inverse Ackerman function of, the number of objects, in

Â the data structure. Pretty unbelivable,

Â so let me just wrap up with a historical comment.

Â So, it's a full disclosure, I wasn't quite alive when this result came out

Â but, in talking to, in reading about it, and talking to senior researchers about,

Â my sense is that it was really sort of a water-shed moment, for the field of data

Â structures and And algorithms. Specifically it confirmed the belief of,

Â people working in algorithms and data structures, that the field posessed

Â surprising depth, and beauty. There had, of course, been earlier

Â glimpses of this, we mentioned in the optional, material in part 1, Kanuth's

Â analysis of linear probing, back in the early 1960's.

Â But this was really something for the worst case analysis, of algorithms and

Â data structures. So the fact that such a practical and

Â naturally arising problem, in algorithms and data structures, requires,

Â necessarily, the understanding of the Ackermann function and it's inverse.

Â A function, mind you, which was first Proposed and defined back in 1920s in the

Â service of recursion theory, almost 10 years before Tarjan was doing his work on

Â models of computation. It was a real eye opener and it showed

Â that this field is something that will keep many generations of scientists quite

Â busy for many years to come.

Â Coursera provides universal access to the worldâ€™s best education,
partnering with top universities and organizations to offer courses online.