0:00

We now turn to considering various approaches for

solving the traveling salesman problem.

As usual, we start with probably the most naive possible approach,

namely an approach that just

enumerates all possible candidate solutions

and among all of them selects the best one.

This is known as a brute force approach.

So first of all,

let's try to understand what is our set of all candidate solutions.

So what I claim is that what we are looking for is more or

less a permutation of our n given nodes.

So why am I saying more or less, well, let's see.

Assume that we have just four nodes.

Let me draw these nodes, and

at the same time all possible edges between them.

This gives us a rectangle, for example.

So what we need to find is the order of visiting these nodes.

One such order is for example bdca.

This corresponds to the following.

So this is a permutation, right, and

of course it defines a cycle.

So from b we go to d, from d we go to c, from c we go to a, and

then we return back to b,

because b is the first node of the permutation.

At the same time, this correspondence is not one to one,

namely in our permutation there is a first element,

of course b while in the cycle there is no first element.

So this actually means that for one cycle there are n

different permutations, where n is the number of nodes,

meaning there are n different permutations which give the same cycle.

So for example,

the permutation dcab also gives rise to the cycle shown here on the slide.

And also cabd.

And of course abdc.

2:13

So they all give the same cycle,

because they basically correspond to the same cycle; for example,

dcab is the same cycle where we start from the node d, okay?

But still, roughly, when we're looking for a cycle we're looking for

a permutation, okay, or for a permutation with the first node fixed.

This basically tells us that the number of different cycles is n minus one factorial.

So n factorial is the number of all permutations of n objects.

But we can assume that the first node is fixed, right?

So if the first node is fixed,

then for the second node, there are n - 1 choices.

For the next one, there are n - 2 choices.

For the next one there are n - 3 choices, and so on.

So the total number of different cycles is (n - 1) factorial.
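To make this concrete, here is a sketch of the brute force approach in Python. It assumes the instance is given as a distance matrix `dist` (a hypothetical representation, not fixed by the lecture), fixes node 0 as the starting point, and tries all (n - 1)! remaining orderings:

```python
from itertools import permutations

def brute_force_tsp(dist):
    """Try all (n - 1)! cycles with node 0 fixed as the start;
    dist[u][v] is the weight of the edge from u to v."""
    n = len(dist)
    best_weight, best_cycle = float("inf"), None
    # Fixing node 0 as the first node avoids counting each cycle
    # n times, leaving exactly (n - 1)! candidate permutations.
    for perm in permutations(range(1, n)):
        cycle = (0,) + perm
        weight = sum(dist[cycle[i]][cycle[(i + 1) % n]] for i in range(n))
        if weight < best_weight:
            best_weight, best_cycle = weight, cycle
    return best_weight, best_cycle
```

On the four-node rectangle from the slide, this enumerates just 3! = 6 permutations and returns the perimeter cycle.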

3:15

Which is almost the same as n factorial, okay?

And unfortunately, the n factorial function grows extremely fast.

Okay, so just to show you how it grows, note that for

example n factorial when n is equal to 5 is roughly 100.

So this is, let me just remind you, this is just 2 times 3 which is 6,

times 4 which is 24 and times 5 which finally gives you 120.

Okay, this is a reasonable number.

We can definitely implement a program that goes through 120 possible solutions and

does this very quickly.

However, when n grows, n factorial gets larger and larger and larger.

For example, when n is equal to 20, n factorial

definitely has more than ten decimal digits.

And when n is equal to 30, n factorial is a huge number.

So definitely our modern laptops are not able to perform so

many operations, not in a second,

not in an hour, and not even in one year.
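A few lines of Python are enough to see this growth for yourself:

```python
from math import factorial

# 5! = 120 is tiny, but 20! already has 19 decimal digits
# and 30! has 33 -- far beyond what any laptop can enumerate.
for n in (5, 20, 30):
    print(n, factorial(n), "digits:", len(str(factorial(n))))
```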

4:32

And this is the reason why an algorithm performing n factorial iterations

is not going to be practical even for n equal to thirty, unfortunately.

So this also tells us that the search space

in case of the traveling salesman problem is huge.

It is just impossible to explore the whole search

space to find the best solution, okay?

And this tells us that we definitely need some smarter approaches.

But before going into such approaches, let's explore also the following idea.

We've just realized that it is not affordable to go through

all possible permutations, but what about just selecting a random permutation?

Let's just pick a random permutation out of all possible permutations and

hope that it is going to give a not so long cycle.
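This idea is trivial to implement; here is a sketch, again assuming a hypothetical distance matrix `dist`:

```python
import random

def random_cycle_weight(dist):
    """Pick a uniformly random cycle and return its weight;
    dist[u][v] is the weight of the edge from u to v."""
    n = len(dist)
    order = list(range(n))
    random.shuffle(order)  # a uniformly random permutation of the nodes
    # Close the permutation into a cycle and sum up its n edges.
    return sum(dist[order[i]][order[(i + 1) % n]] for i in range(n))
```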

5:31

So unfortunately, it turns out that the length of a random permutation

may be much worse than the length of the best permutation,

even if we consider the special case of Euclidean TSP.

As I mentioned already, in the case of Euclidean TSP,

solving such instances is sometimes easier,

in particular because they satisfy the triangle inequality, okay?

And it is not actually too difficult to understand why it

can be that bad, and it can also be shown formally.

So the lemma that you see here on the slide actually tells us how

to compute the weight of a random permutation.

When I say the weight, I mean, of course, the expected weight, okay?

So the expected weight of a random permutation is equal just to

the sum of the weights of all edges divided by n - 1.

And here we assume that we have a complete directed graph.

Okay, and informally, this can be proved as follows: in

a cycle, we have exactly n edges.

And in a complete directed graph, we have n times (n - 1) edges.

And since everything is symmetric, each edge has probability

1/(n - 1) to be included into a random cycle, okay?

Which means that, for each edge, its contribution

to the expected length is its weight divided by n - 1.

Okay, so since expectation is linear,

we can compute the total expected weight as the sum of the contributions of all the edges,

which gives us the sum of the weights of all edges divided by n - 1.
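For a small instance we can even verify the lemma by exhaustive enumeration. The sketch below averages the cycle weight over all (n - 1)! cycles of a complete directed graph (given, by assumption, as a matrix `dist`) and compares it against the formula:

```python
from itertools import permutations

def expected_cycle_weight(dist):
    """Exact expected weight of a uniformly random cycle, computed
    by averaging over all (n - 1)! cycles with the first node fixed."""
    n = len(dist)
    total, count = 0.0, 0
    for perm in permutations(range(1, n)):
        cycle = (0,) + perm
        total += sum(dist[cycle[i]][cycle[(i + 1) % n]] for i in range(n))
        count += 1
    return total / count
```

By the lemma, the returned value should equal the sum of all edge weights divided by n - 1.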

If you would like a more formal proof, it goes as follows.

The total number of different cycles, as

we've discussed, is (n - 1) factorial.

Now, let's do the following: let's

fix a particular edge, for example, (u, v).

7:54

And let's count the number of cycles that contain this particular edge, okay?

As you remember, since we have a cycle,

we are free to select its starting point.

So let's just assume that we have a cycle, and we are counting

the number of cycles that contain the edge (u, v).

Let's assume that u is the starting point of the cycle, and

since the cycle contains the edge, we know that v goes right after u.

And for all the remaining nodes, we are free to select any particular permutation.

So any of the n - 2 remaining nodes can go here, any of n - 3 nodes here, and so on.

So the total number of cycles that contain this

particular edge is (n - 2) factorial.

8:52

So the fraction of all cycles containing a particular edge is actually 1 over (n - 1),

and this is also the probability of an edge being included in a random cycle, okay?
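This count can also be checked directly for a small n: the sketch below enumerates all cycles with node 0 fixed first and counts those that use a fixed directed edge (u, v), which should come out to exactly (n - 2)! of the (n - 1)! cycles:

```python
from itertools import permutations

def cycles_containing_edge(n, u, v):
    """Count the cycles (first node fixed at 0) that
    use the directed edge (u, v)."""
    count = 0
    for perm in permutations(range(1, n)):
        cycle = (0,) + perm
        # Collect the n directed edges of this cycle.
        edges = {(cycle[i], cycle[(i + 1) % n]) for i in range(n)}
        if (u, v) in edges:
            count += 1
    return count
```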

Now, let me erase everything and let's discuss this.

This means that even if in our graph we have some very heavy edge,

it will still give a contribution to a random permutation.

So its weight will give a contribution to the weight of

a random permutation, and this is something that we would like to avoid.

Because probably there are just a few heavy edges in our graph, but

what we're looking for is some cycle that avoids all the heavy edges, right?

9:44

Probably, a more interesting example is the following.

Consider this simple dataset.

So we have a bunch of points here in the bottom

left corner and another bunch of points here in the top right corner.

So of course, for this particular example it is not so

difficult to convince yourself that the best cycle looks like this, right?

So we just need to somehow traverse all the nodes here in this corner.

Then we go to the opposite corner and traverse all the nodes here.

Then we go back.

Namely, there are two groups of points in different corners,

and a random permutation will typically jump between these corners many times.

10:44

So it will contain too many long edges, and that's exactly why

a random permutation may be much worse than the optimum one, okay?

So in the next videos, we are going to consider smarter approaches for

solving the traveling salesman problem.

In particular,

we will start with an approach that can easily solve this particular data set.

And you have probably guessed yourself that

there is a very reasonable heuristic to solve this particular one.

From each node, let's go to the nearest

node which hasn't been visited yet.

So this will solve this particular data set immediately.

Right, because starting from this node we will go to that one, to

that one, to that one, and to that one.

The next node is this one, the closest one, and

then it will lead us here, here, here, and finally here.

So in this sense, this dataset is not difficult to solve.
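Here is a sketch of this nearest neighbor heuristic, again over an assumed distance matrix `dist`:

```python
def nearest_neighbor_tsp(dist, start=0):
    """Greedy heuristic: from each node, move to the
    nearest node that hasn't been visited yet."""
    n = len(dist)
    visited = [False] * n
    visited[start] = True
    cycle = [start]
    for _ in range(n - 1):
        cur = cycle[-1]
        # Pick the closest unvisited node from the current one.
        nxt = min((v for v in range(n) if not visited[v]),
                  key=lambda v: dist[cur][v])
        visited[nxt] = True
        cycle.append(nxt)
    weight = sum(dist[cycle[i]][cycle[(i + 1) % n]] for i in range(n))
    return weight, cycle
```

On the two-cluster dataset from the slide, this greedy rule finishes one cluster before crossing to the other, so the resulting cycle uses only two long edges.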

So in the next video, we're going to explore this approach further.