0:00

Let's explore our second strategy for graph search, namely depth-first search.

And again, like with breadth-first search, I'll open by just reminding you what

depth-first search is good for and we'll trace through it in a particular example,

and then we'll tell you what the actual code is.

So if breadth-first search is the cautious and tentative exploration strategy,

then depth-first search or DFS for short is its more aggressive cousin.

So the plan is to explore aggressively and only back track when necessary.

And this is very much the strategy one often uses when trying to solve a maze.

To explain what I mean let me show you how this would

work in the same running example we used when we discussed breadth-first search.

So here if we invoke depth-first search from the node number S,

here's what's going to happen.

So obviously we start at S and obviously there's two places where we can go next.

We can go to A or to B.

And depth-first search is under determine like breadth-first search we can

pick either one.

Select with a breadth-first search example let's go to A first.

So A will be the second one that we explore.

1:03

But now, unlike breadth first search where we automatically went to node B next,

since that was the other layer one node.

Here, the only rule is that we have to go next to one of A's immediate neighbors.

We might go to B, but we're not going to B because it is one of the neighbor's of S,

we go because it is one of the neighbors of A.

And actually to make sure the difference is clear let's assume that we aggressively

pursue deeper when we go from A to C and now the depth first search strategy is

again to just pursue deeper, so you go to one of C's immediate neighbors, so

maybe we go to E next, so E is going to be the fourth one visited.

Now from E there's only one neighbor not counting the one that we came in on so

from E we go to D.

And D is the fifth one we see.

Now from D we have a choice, we can either go to B or we could go to C.

So let's suppose we go to C from D.

Well then we get to a node number three where we've been before.

And as usual we're going to keep track of where we've already been.

So at this point we have to back track from C back to D.

We retreat to D.

Now, there's still another outgoing edge from D to explore and

then they'll be the one to B.

And so what happens is we actually wind up wrapping all the way around this outer

cycle and we could B sixth.

And now, of course,

anywhere we try to explore we see somewhere we've already been.

So, from B we try to go to S, but we've been there so we retreat to B.

We can try to go to A but we've been there so we retreat to B.

Now we've explored all of the options out of B.

So we have to retreat from B, we have to go back to D.

Now from D we've explored both B and C so we have to retreat back to E.

We've explored the only outgoing arc D, so we have to retreat to C.

We retreat to A.

From A we actually haven't yet looked along this arc, but

that just sends us to B where we have been before.

So then we retreat back to A.

Finally we retreat back to S and S,

even at S there's still an extra edge to explore.

At S we say, we haven't tried this as B-edge yet.

But of course, when we look across we get the B where we've been before and

then we backtrack to S.

Then we've looked at every edge once, and so we stop.

So that's how depth-first search works.

You just pursue your path, you go to an immediate

neighbor as long as you can until you hit somewhere you've been before.

And then you retreat.

So you might be wondering why bother with another graph search strategy?

After all we have breadth-first search, which seemed pretty awesome, right.

It runs in linear time.

It's guaranteed to find everything you might want to find, it computes shortest

paths, it computes connected components if you embed it in a foreloop.

It kind of seems like, what else would you want?

Well, it turns out,

depth-first search is going to have its own impressive catalogue of applications,

which you can't necessarily replicate with breadth-first search.

And I'm going to focus on applications in directed graphs.

So there's going to be a simple one that we discuss in this video, and then there's

going to be a more complicated one that has a separate video devoted to it.

So in this video we're going to be discussing,

computing topological orderings of directed acyclic graphs.

That is, directed graphs that have no directed cycle.

The more complicated application is computing strongly connected components in

directed graphs.

The run time will be, essentially, the same as it was for

breadth-first search, and the best we could hope for, which is linear time.

And again, we're not assuming that there's necessarily that many edges.

There may be much fewer edges than vertices.

So linear time and these connectivity applications means O of M plus n.

4:17

So let's not talk about the actual code of depth for search.

There's a couple of ways to do it.

One way to do it is to just make some minor modifications to the code for

breadth for a search.

The primary difference being instead of using a queue in its

first in first out behavior, you swap in a stack with its last in first out behavior.

Again, if you don't know what a stack is you should read about that in the program

textbook or on the web.

It's something that supports constant time insertions to the front and

constant time deletions from the front,

unlike a queue which is meant to support constant time deletions to the back.

Okay so stack that operates just like those cafeteria trays that you know where

you put in a tray and the last one that got pushed in when

you take the first one out that's the last one that got put in.

So these are called push and pop, in a stack context both are constant time.

So if you swap out the queue you swap in the stack,

make a couple other minor modifications.

Breadth-first search turns into depth-first search.

For the sake of both variety and elegance,

I'm instead going to show you a recursive version.

So depth-first search is very naturally phrased as a recursive algorithm,

and that's what we'll discuss here.

So, depth-first search, of course,

takes as input a graph g and again it could be undirected or directed.

It doesn't matter, just with a directed graph be sure that you only follow arcs

in the appropriate direction, which should be automatically handled

in the adjacency lists of your graph data structure anyways.

So as always we keep a Boolean local to each vertex of the graph,

remembering whether we've been there before or not.

And of course, as soon as we start exploring from S we better make a note

that now we have been there.

We better plant a flag, as it were.

And remember, depth-first search is an aggressive search, so we immediately

try to recursively search from any of S's neighbors that we haven't already been to.

And now, if we find such a vertex, if we find somewhere we've never been,

we recursively call depth-first search from that node.

6:06

The basic guarantees of depth-first search are exactly the same as they were for

breath-first search.

We find everything we could possibly hope to find and we do it in linear time.

And once again, the reason is this is simply a special case of the generic

stretch procedure that we started this sequence of videos about.

So it just corresponds to a particular way of choosing amongst multiple crossing

edges between the region of explored nodes, and

between the region of unexplored nodes.

Essentially always being biased toward the most recently discovered explored nodes.

And just like, breadth for search, the running time is going to be proportional

to the size of the component that you're discovering.

And the basic reason is that each node is looked at only once, right?

This boolean makes sure that we don't ever explore a node more than once, and

then for each edge, we look at it at most twice, once from each endpoint.

6:53

And given that these exact same two claims hold for depth-first search as for

breadth-first search, that means if we wanted to compute connected components in

an undirected graph, we could equally well use an outer for

loop with depth-first search as our workhorse in the inner loop.

It wouldn't matter.

Either of those for undirected graphs, depth-first search, breadth-first search,

is going to find all the connected components in O of n plus m time,

in linear time.

So instead, I want to focus on an application in particular to depth-first

search, and

this is about finding a topological ordering of a directed acyclic graph.