In this video, I'll prove to you that Dijkstra's algorithm does indeed compute to correct shortest paths in any directed graph where all edge links are non negative. So let me remind you about what is Dijkstra's algorithm, it's very much in the spirit of our graph search primitives, in particular breath first search. So there's going to be a subset x of vertices, which are the ones that have been processed so far. Initially x contains only the source vertex. Of course the distance from the source vertex to itself is 0, and the shortest path from s to itself is the empty path. So then we'll have a main while loop, that's going to be n-1 iteration, and each iteration will bring one vertex which is not currently in x into capital X. And a variant that we are going to maintain, is that all the vertices in x we will have computed estimates of the shortest path distance from x to that vertex and also we'll have computed the shortest path itself from x to that vertex. Remember our standing assumption stated in the previous video, we're always going to assume there's at least one path from the source vertex s to every other destination v. Our job is just to compute the shortest one. And also, we have to assume that the edge links are non negative as we've seen otherwise Dijkstra's algorithm might fail. Now, the key idea in Dijkstra's algorithm is a very careful choice of which vertex to bring from outside of x into capital X. So what we do is we scan the edges crossing the frontier. Meaning given the current edges vertices that we've already processed, we look at all of the edges whose tail has been processed and whose head has not been processed. So the tails in capital X, the head is outside of x that is, they cross the cut from left to right in the diagrams that we usually draw. Now, there may be many such edges. How do we decide amongst them? Well, we compute the Dijkstra's greedy score for each. The Dijkstra greedy score is defined as the shortest path distance we computed for the tail and that's been previously computed because the tail's in capital X. And then we add to that the length contributed by this edge itself by the edge vw which is crossing the cut from left to right. So amongst all edges crossing the cut from left to right we compute all those Dijkstra greedy scores we pick the edge with the smallest greedy score Calling that edge just v* w*. For the purposes of notation, w* is the one that gets added to x. So it's the head of the arc for the smallest three score, and then we compute the shortest path distance of that new vertex w* to be the shortest path distance to v* plus the length contributed by this edge v* w* and then was the shortest path. It's just the shortest path previously computed to v star plus this extra edge v* w* tacked onto the end. Here's the formal statement we're going to prove. For this video, we're not going to worry at all about running time, that'll be the discussion of the next video. We'll discuss both the running time of the basic algorithm and a super fast implementation that uses the heat data structure. For now, we're going to just focus on correctness. So the claim is that for every directed graph, not just the four node, five arc example we studied. As long as there's no negative edge links, Dijkstra's algorithm works perfectly. It computes all the correct shortest path distances. So just to remind you about the notation, what does it mean to correct all shortest path distances correctly? It means that what the algorithm actually computes, which is a(v), is exactly the correct shortest path distance, which we were denoting by L(v) in the previous video. Both the algorithm and the proof of correctness where established by Esther Dijkstra this was back in the late 1950s. Dijkstra was a Dutch computer scientist, and certainly one of the forefathers of the field as a science, as an intellectual discipline. He was awarded the ACM Turing award, so that is the Nobel Prize in computer science effectively. I believe it was 1972, and he worked a long time in the Netherlands, but then also spent a lot of his later career at UT Austin. So the way this proof is going to go is going to be by induction. And basically, what we're going to do is we're going to say every iteration, when we have to commit to shortest path distance to some new vertex, we do it correctly. And so then the form of the induction will be, well given that we made all of our previous decisions correctly, we computed all our earlier shortest paths in the correct way. That remains true for the current iteration. So formally, it's induction on the number of iterations of Dijkstra's algorithm. And as is more often than not the case in proofs by inductions the base case is trivial. So that just says before we start the y loop, what do we do? Well we commit to the shortest path distance from s to itself. We set it to 0, we set the shortest path to be the empty path, that is of course true. Of course, even here we're using the fact that there are no edges with negative edge length. That makes it obvious that sort of having a non empty path can get you negative edge length better than 0. So the first choice path computation we do s to s is trivially correct. The hard part of course is the inductive step justifying all of the future decisions done by the algorithm. And of course, mindful of that example that not example we had at the end of the previous video in the proof by induction, we'd better make use of the hypothesis that every edge has non negative length. Otherwise the theorem would be false. So we better somewhere in the proof use the fact that edges cannot be negative. So let's move on to the inductive step. Remember in the inductive step, the first thing to do is state the inductive hypothesis. You're assuming you haven't made any mistakes up to this point. Let's be a little bit more formal about that. So that is everything we computed in the past. What did we compute in the past? Well for each vertex which is in our set capital X for each vertex that we've already processed, we want to claim that our computed shortest path distance matches up exactly with the true correct shortest path distance. So in our running notation, for every already processed vertex, so for all vertices v, in our set capital X. What we computed as our estimate of the shortest path distance for v is in fact, the real shortest path distance. And also, the computed shortest path is in fact, a true shortest path from s to v. So again remember, this is a proof by induction. We are assuming this is true, and we're going to certainly make use of this assumption when we establish the correctness of the new iteration, the current iteration. What happens in an iteration? Well, we pick an edge which we've been calling v* w*. And we add the head of sets w* to the set X. So let's get our bearings and remember what Dijkstra's algorithm computes as the shortest path and shortest bath distance for this new vertex w*. So by the definition of the algorithm we assign as a shortest path from s to w*. The previously computed purportedly shortest path from s to v*, and then we tack on the end the direct arc, v* w*. So pictorially, we already had some path that started at s and ended up at v*, and then we tack on the ends this arc going to w* in one hop. And this whole shebang is what we're going to assign as B of w*. So let's use the inductive hypothesis. Inductive hypothesis says that all previous iterations are correct. So that is any shortest path we've computed in a previous iteration is in fact a bona fide shortest path from the source x to the vertex. Now v*, remember is in x. So that was previously computed. So by the inductive hypothesis, this path bv*, from s to v*, is in fact a true shortest path from s to v* in the graph. So therefore, it has length l of v*. Remember, l of v* is just by definition the true shortest path distance in the graph from s to v*. Now, given that the path that we've exhibited s to w*, is just the same one as we inherited the v* plus this extra edge tacked on. It's pretty obvious what the length of the left hand side is. It has length, just the length of the old path which we just argued is the shortest path distance from sw* plus the length of this arc that we tacked on. That's going to be lv* w*. So by the definition of the algorithm, what we compute for w* is just the Dijkstra's greedy score which is just the computer choice path distance to the tail. The v* plus the length of the direct edge. By the inductive hypothesis, we've correctly computed all previous choice path distances. V* is something we computed in the past by inductive hypothesis is correct. So this is equal to l of v* by the inductive hypothesis. So don't worry if you're feeling a little lost at this point. We've actually really done no content in this proof yet. We haven't done the interesting part of the argument. All we've been doing is setting up our dominoes, getting them ready to be knocked down. So what have we done in the current iteration? Well first of all, our estimate of the shortest path distance from the source to w*, to the new vertex that we're including in the set capital X, is the true shortest path distance to v* plus the length of the edge from v* to w*, that's the first thing. Secondly, the path that we have in the v array is a bona fide path from s to w* with exactly this distance. And the point is, now it's clear what has to be proven for us to complete the inductive step and therefore, the proof of correctness of Dijkstra's algorithm. So what do we need to proof? We need to proof that this isn't just any old path that we've exhibited from s to this vertex w*, but that it's the shortest path of them all. But differently we need to show that every other sw* pattern in this graph has length at least this circled value. So let's proceed let's show that no matter how you get from the source for test to this destination w*. The total length of the path you travel is going to be at least this circled value, at least L(v*) + lv*w*. Now on the one hand, we don't have a lot going for us, because this path P could be almost anything. It could be a crazy looking path. So how do we argue that it has to be long? Well, here's the one thing we've got going for us for any path P that starts in s and goes to w*. Any such path must cross the frontier. Remember, it starts on the left side of the frontier, it starts at the source vertex, which is initially and forever in the set capital X. And remember that we only choose edges that cross the frontier whose head is outside of x. And w* is exactly the head of the edge we chose in this iteration, so this is not an x. So any path that starts in x and goes outside of x at some point it crosses from one to the other. So let's think about the graph and it's two pieces, that it's the left of the front tier and not to the right. The stuff is already processed and the stuff which is not been processed. S of course, is on the left hand side, and at the beginning of this iteration of the while loop, w* was on the right hand side. Any path, no matter how wacky has to at some point, cross this frontier. Maybe it does it a bunch of times, who knows but it's gotta do it once. Let's focus on the first time it crosses the frontier, and let's say that it crosses the front here with the vertex y going to the vertex z. That is any path P has the form where there's an initial prefix,where are the vertices are in x. And then there's some first point at which it crosses the frontier and goes to a vertex which is not an x, we're calling the first such vertex outside of x, z. And then it can skip back and forth who knows, but certainly it ends up in this vertex w* which is not an x. So we're going to make use of just this minimal information about an arbitrary path P. And yet this will give us enough of a foothold to lower bound its length. And this lower bound to be strong enough, we conclude that our path that we computed is the best, smaller than any possible competitor. So let's just summarize where we left on the previous slide. We established that every directed path from s to w* p, no matter what it is has to have a prescribed form, where it ambles for a while inside x and then the portal through which it escapes x for the first time we're calling y. And then the first vertex it sees outside of x is z and there has to be one. And then it perhaps ambles further and eventually reaches w*. It could well be that z and w* are exactly the same, that's totally fine for this argument. So here's one of our competitors, this path p and I have to show it's at least as long as our path. So we need a lower bound on the length of this arbitrary path from s to w*. So let's get that lower bound by arguing about each piece separately, and then invoking Dijkstra's greedy criterion. So remember, we said we better use the hypothesis that edge links are non negative, otherwise we're toast, otherwise we know the algorithm is not correct. So here's where we use it. This final part of the path from z to w*, if it's not empty then it's gotta have non negative length right. Every edge as part of this subpath has non negative edge length, so the total length of this part of the path is non negative. So y to z by construction is direct arc. Remember, this is the first arc that path p uses to go from x to get outside of x. So that's how it escapes, the conquer territory x and this just has some length, l of yz. So that leaves the first part of this path, the prefix of this path that lies entirely in capital X. So how do we get a lower bound in the length of this path? Well, let's begin with something trivial. This is some path from s to y, so certainly it's as least as long as a shortest path from s to y. And now, we're going to use the inductive hypothesis again. So this vertex y, this is something we treated in a previous iteration. This belongs to the set capital X, we've already processed it, we've already computed our estimate of it's shortest path length, and the inductive hypothesis assures us that we did it correctly. So whatever value we have hanging out in our array capital A, that is indeed the length of the true shortest path. So the length of the shortest sy path is l(y) by definition, and it's A(y) by the inductive hypothesis, and now we're in business. So what does this mean we can say about the total length of this arbitrary path P? Well, we've broken it into three pieces and we have a lower bound on the length for each of the three pieces. Our lower bounds are, our computed shortest path distance to y, the length of the direct edge from y to z and 0. So adding those up, we get that the length of path P is at least our computed shortest path distance to y plus the length of the arc from y to z. So why is this useful? Well, we've got one remaining trick up our sleeve. There's a hypothesis which is presumably very important, which we have not yet invoked. And that is the choice of Dijkstra's greedy criterion at no point in the proof yet have we used the facts that we select which vertex to add next according to Dykstra's greedy score. So that is going to be the final nail in the coffin, that's what's going to complete the proof. How do we do that? Well we have taken an arbitrary path P, we have lower abounded it's length, in terms of the computed shortest path distance up to the last vertex of this prefix y plus the arc length to get from x to l set of X, zyz. So remember, this means y is on the left part of the frontier and z is not. And therefore in this iteration, the edge yz was totally a candidate for us to use to enlarge our frontier. Remember, we looked at all of the edges crossing from left to right. Yz is one such edge and amongst all of them, we chose the one with the smallest Dijkstra's greedy score. That was the Dijkstra's greedy criterion. So what have we shown? We've shown that the length of our path is no more than what's a lower bound on the length of this arbitrary other path P. So this completes the proof. So let me just remind you of all the ingredients, in case you got lost along the way. So what we started out with is we realized our algorithm or Dijkstra's algorithm it does compute some path from s to w*. It just takes the path it computed previously to v*, and it just depends this final hop at the end. So that gives us some path from s to w* moreover, it was easy to figure out exactly what the length of that path is. And the length of the path that we came up with is exactly the circled quantity at the bottom of the slot. It's the shortest path distance from s to v* plus the length of the direct arc from v* to w*. So that was how well we did. But we had to ask the question, is it possible to do better? We're trying to argue that our algorithm does the best possible, that no competing path could possibly be shorter than ours. So how did we do that? Well, we considered an arbitrary competing path P. The only thing we know about it is that it starts at s and it ends up at w*. And we observe that any path can be decomposed into three pieces. A prefix, a direct edge, and a suffix. Then we give a lower bound on this path P. The direct edge, the length is just whatever it is. The suffix, we just use the trivial lower bound that's at least 0. And that's where we use the hypothesis that every edge has non negative edge length. And for the prefix, because that's all in the stuff we already computed, we can vote the inductive hypothesis and say, well whatever this path is, it goes from s to come vertex and y. So at least the shortest path distance from s to y which is something we computed in a previous iteration. We lower bounded the length of any other path in terms of the Dijkstra's greedy score for that path. Since we choose the path with the best greedy score, that's why we wind up with the shortest path of them all, from s to w*. This of course, is embedded in an outer proof by induction on the number of iterations, but this is the inductive step, which justifies a single iteration. Since we can justify every iteration giving correctness to the previous ones. That means by induction, all of them are correct. So all of the shortest paths are correct. And that is why Dijkstra's algorithm correctly computes shortest paths and any directed graph with non negative edge lengths.