It's important that research in algorithms does not grind to a halt just

because we have this open, and presumably extremely difficult, P versus NP problem

that we don't yet understand. We still want guidance about which

problems are easy, and which problems are hard.

So a pretty great idea for giving us such guidance, while at the same time

evading actually answering the P versus NP problem, is to

show that problems are difficult in a relative sense.

We amass evidence of intractability by showing that a problem is at least as

hard as many other computational problems.

To make this idea of relative difficulty precise, we need to say what we mean

when we say that one problem is as hard as another. The language for that is a reduction from

one problem to another. We say that a computational problem, say,

pi 1, reduces to another one, pi 2, if all you need to solve pi 1 efficiently is

an efficient algorithm for pi 2. That is, if I gave you, on a silver

platter a polynomial-time algorithm for the problem pi 2,

then using that as a subroutine, you could build a polynomial-time algorithm

for pi 1. So, we'll see some concrete examples in

just a sec, but let me just mention, I'm being slightly informal with this

definition. You'll see something a little bit more

formal if you look at a textbook or take a class specifically on this topic, but

this is how you should think about reductions.

All you need to solve pi 1 is someone handing you an efficient subroutine for

pi 2. That is the algorithmic

way of thinking about reductions. Indeed, by virtue of being immersed in

algorithms over the past many weeks, we have examples of reductions.

The next quiz is going to ask you to recall some of them.

So the correct answer is all of the above.

A, B, and C are all legitimate reductions from one computational

problem to another. So for example, answer A, this is

something we discussed back in part 1 of the course, when we first motivated

linear time median algorithms. At that point we were arguing that for a

median algorithm to be interesting, it had better be faster than n log n time.

That's why we were shooting for linear time.

The reason being that one perfectly correct algorithm for computing a median

is you take the input array, you sort it, for example using merge sort, in n log n

time, and you return the middle element. That's going to be the median.
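
That reduction is short enough to write out in full. A minimal sketch in Python (the course doesn't fix a language; this is just for illustration):

```python
def median(xs):
    # Reduction: median reduces to sorting. Any O(n log n) comparison
    # sort (Python's built-in sorted() qualifies) plus returning the
    # middle element gives a correct median algorithm.
    ys = sorted(xs)
    return ys[(len(ys) - 1) // 2]  # lower median for even-length input

print(median([9, 1, 5, 3, 7]))  # -> 5
```

Given a polynomial-time sorting subroutine, this is a polynomial-time median algorithm, which is exactly what the reduction promises.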

The second answer, B: this is something we mentioned in passing, in the

naive implementation of Kruskal's algorithm, where we wanted to check for

cycles when thinking about adding a new edge.

And we observed that one way to check if a graph has a cycle is to just run

depth-first search on that graph. If there's a cycle, you'll detect it by

exploring an edge that points backward to a vertex that you're still in the

middle of exploring. The third example is maybe more

interesting, in the sense that we invoke the provided subroutine more than once.
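
Before moving on to the third example, answer B's cycle check is worth sketching. A minimal version in Python (the adjacency-list dictionary is just an illustrative format); it's written for a directed graph, and the undirected variant relevant to Kruskal's algorithm would additionally ignore the edge leading back to a vertex's DFS parent:

```python
def has_cycle(adj):
    # Depth-first search: a cycle exists exactly when some edge points
    # backward to a vertex we're still in the middle of exploring.
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited / being explored / finished
    color = {v: WHITE for v in adj}

    def dfs(u):
        color[u] = GRAY
        for v in adj[u]:
            if color[v] == GRAY:               # back edge: cycle found
                return True
            if color[v] == WHITE and dfs(v):
                return True
        color[u] = BLACK
        return False

    return any(color[v] == WHITE and dfs(v) for v in adj)

print(has_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))  # -> True
```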

When we first introduced the all-pairs shortest path problem, we observed that

one solution, and in certain cases actually a very good solution, is to just

invoke a single-source shortest path subroutine, like Dijkstra's algorithm or

the Bellman-Ford algorithm, n times, once for each choice of the source.
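
To make this third reduction concrete, here's a minimal sketch in Python, assuming non-negative edge weights so that Dijkstra's algorithm can serve as the single-source subroutine (the adjacency-list format is just for illustration):

```python
import heapq

def dijkstra(adj, source):
    # Single-source shortest paths, non-negative weights.
    # adj maps each vertex to a list of (neighbor, weight) pairs.
    dist = {v: float("inf") for v in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def all_pairs_shortest_paths(adj):
    # The reduction: invoke the single-source subroutine n times,
    # once for each choice of the source vertex.
    return {s: dijkstra(adj, s) for s in adj}

graph = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
print(all_pairs_shortest_paths(graph)["a"]["c"])  # -> 3
```

With n calls to a polynomial-time subroutine, the whole thing runs in polynomial time, which is all the reduction needs.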

But again, it's still a reduction: given an efficient algorithm, a polynomial-time

algorithm, for the single-source shortest path

problem, you just run it n times, and that gives you a polynomial-time

algorithm for the all-pairs shortest path

problem. In fact, seasoned algorithm designers,

and seasoned programmers, are always looking for opportunities to employ

reductions. So when someone describes you a

computational problem, the very first thing you should try to think through is,

is this a problem I already know how to solve, just merely in disguise? So maybe

if you have to make a sequence of decisions, you can phrase it as a

shortest path problem, a problem that you already know excellent

solutions to. And if that fails, if the computational

problem you're confronted with isn't literally the same as one you already

know how to solve, maybe by invoking known primitives, known

algorithms, a small number of times, you can efficiently solve this new

computational problem that you're confronted with.

So all of these are what I think of as honorable uses of reduction.

This is, somehow, the light side of the force. It spreads the frontier of

tractability, the things we can do with algorithms, wider and wider.

Now, for NP-completeness, for establishing computational intractability,

we have to turn this world on its head. This relies on a perverse use of

reductions, which I'll explain next. So suppose that the problem pi 1 reduces

to pi 2, in the sense that all you need to solve pi 1 in polynomial time is a

polynomial-time algorithm for pi 2. That's the only missing piece.

Now, as algorithm designers, you're probably thinking about the happy case,

where we have a polynomial-time algorithm for pi 2 sitting on the shelf.

We have something like Dijkstra's algorithm, or Bellman-Ford algorithm, and

we take it off the shelf, and we use it, combined with this reduction, to solve pi

1. But, let's think about the contrapositive

of what it means when one problem reduces to another.

And now, let's think about the case where we don't actually believe that it's

possible to solve pi 1 in polynomial time.

Well, if we don't think it's possible to solve pi 1 efficiently, and solving pi 1

reduces to merely computing pi 2 efficiently, well then we certainly can't

believe that it's possible to compute pi 2 efficiently either.

That is, if pi 1 reduces to pi 2, and it just so happens that pi 1 cannot be

solved in polynomial time, then pi 2 cannot be solved in polynomial time

either. So this is the perverse use of a

reduction that I mentioned earlier. This is the dark side of the force.

Rather than using a reduction to enlarge the frontier of tractability from pi 2 to

include pi 1 as well, we're spreading the frontier of intractability from pi 1 to

pi 2. So this then is what we mean formally

when we say that one problem is at least as hard as another.

If pi 1 reduces to pi 2, what does that say? It says pi 2 allows you to solve pi 1,

and maybe lets you do other stuff as well, so therefore pi 2 is at least as hard

as pi 1. So we now have the vocabulary to say that one problem is at least as

hard as another one, and let's remember what our plan was.

Our plan was to amass evidence of the intractability of the traveling salesman

problem by saying it's at least as hard as lots of other stuff. To pass from being

as hard as a single problem to being as hard as a big set of problems,

we're going to talk about the notion of completeness.

Formally, consider some set C of problems. It's going to be tricky to get the

right set of problems; we'll discuss that in detail in the next video.

But for now, just let C be any old set of problems.

We call a computational problem pi complete for C,

or C-complete, if it's at least as hard as everything in C,

that is, if everything in C reduces to pi. Oh, and also, this problem pi? It should

be in this class itself. So, the second constraint here is the

more important one. It's the one that fits more directly into

our narrative. Remember, we wanted to have vocabulary

for saying one problem is as hard as a whole bunch of other stuff.

C is the bunch of other stuff, so pi is at least as hard as everything in C. That

is, everything in C reduces to pi. One nice reason to have the first

constraint as well, to assume that the problem pi is a member of C, is that it

gives the following nice interpretation. We can think of this problem pi as being

a hardest problem in the set C. It's in C, and it's at least as hard as

everything else in the class. So we're starting to see a confluence,

between our strategic plan, amassing evidence for the intractability of tsp

by, arguing it's as hard as a bunch of other stuff, and, the mathematical

formalism. Right? This notion of completeness gives

us a language to say that a problem is as least as hard as a bunch of other stuff.

So, how should we define all this other stuff? That is, what is a suitable choice

for the class C? Well, we'd like to present the strongest evidence, the most

compelling case possible, that the traveling salesman problem is intractable,

so that says we should strive for the biggest set C we can find.

The more stuff this problem is as hard as, the greater the evidence of its

intractability. So, we want to prove it complete for a

really big set C. Well, why not be ambitious and reach for

the stars? Why don't we just show that the traveling salesman problem is the

hardest problem in existence? That it's C-complete, even if we take the set C

to be all computational problems. Well, this is a good starting point, but

it turns out this is too ambitious. We're not going to be able to prove this

fact. There are, indeed, computational problems

out there that are certainly strictly harder, strictly more difficult to solve,

than the traveling salesman problem. One I hope you've heard of is the halting

problem. So in the halting problem, its very

simple. You're given as input some code, say a

program written in C, and you're given an input, and the responsibility of the

algorithm is to decide whether or not, the provided program will halt,

eventually, when you run it with the provided.

Input. Perhaps the obvious approach to the

halting problem is to just simulate the provided program, in an interpreter, in

effect, on the supplied input. But here's the problem.

Suppose you've been simulating the provided program on the provided input,

and you've been running for say ten hours.

Maybe you suspect that it's in an infinite loop and is never going to halt,

but how do you know? Maybe if you ran it for one more hour, it would halt.

What if you've been running it for 10 years?

Maybe now you're pretty sure it's in an infinite loop and never going to halt,

but how do you know? Maybe it's going to halt tomorrow.

This problem of not knowing when you're done is an issue not just with the naive

simulation approach to the problem, but with any possible approach to this

computational problem. Indeed, Turing proved, in his landmark

1936 paper, that there is no algorithm, however slow, guaranteed to always

correctly solve the halting problem. So this is one of those theorems which is

simultaneously, at least in my opinion, very deep, but also the proof is just like

four lines. You prove it by contradiction: you assume

that an algorithm for the halting problem does exist, and then, using the code for

that program, you construct a program that, on the one hand, can't

halt, and, on the other hand, can't not halt, obviously a contradiction.

So the purported algorithm for solving the halting problem can't exist.
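
The shape of that contradiction can be sketched in code. The decider `halts` below is purely hypothetical, since Turing's theorem says no correct one exists; the constructed program simply does the opposite of whatever the decider predicts about it running on its own source:

```python
def diagonal(halts, source):
    # Given a purported decider halts(program, input) -> bool,
    # do the opposite of its prediction on (source, source).
    if halts(source, source):
        while True:          # predicted to halt -> loop forever
            pass
    else:
        return "halted"      # predicted to loop -> halt immediately

# Any concrete decider is refuted on the diagonal program itself.
# For example, a (wrong) decider that always predicts "loops forever":
always_loops = lambda program, inp: False
print(diagonal(always_loops, "diagonal's own source"))  # -> halted
```

Run `diagonal` on its own source: if the decider says it halts, it loops; if the decider says it loops, it halts. Either way the decider is wrong, so no decider can exist.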

This is the kind of argument I find just totally baffling.

I like to recall von Neumann's quote that there are some things in mathematics

that you don't ever understand, you just get used to.

And this kind of diagonalization argument, inspired by Cantor's original

argument that there's an uncountable number of real numbers, I

think falls squarely in that category. Anyway, the point is that the traveling

salesman problem no longer looks quite that bad.

And we now realize that our original thought to prove that the TSP problem is

at least as hard as every other problem is a little misguided, a little too

ambitious. So, it's certainly not at least as hard

as the halting problem. The halting problem is what is called

undecidable. There's no algorithm for it whatsoever,

whatever time bound I give you.

The TSP problem, by contrast: if I give you a big enough time bound, roughly

n factorial for n vertices, you can solve the problem, namely using our favorite

punching bag, brute-force search. You can just try every one of the finite

possibilities and pick the best one. But the high-level idea at the top of

this slide, that is, proving the traveling salesman problem complete for

a big class of problems, is still a good one.

We just have to refine our understanding of a suitable choice of the class C.
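
To make "brute-force solvable" concrete for TSP, here is the roughly n-factorial baseline just mentioned, a minimal sketch in Python (the distance matrix is a made-up illustrative input):

```python
from itertools import permutations

def tsp_brute_force(dist):
    # Try every tour of vertices 0..n-1 and return the cheapest cost.
    # Fixing vertex 0 as the start leaves (n-1)! candidate tours,
    # so this is finite but exponential: our favorite punching bag.
    n = len(dist)
    best = float("inf")
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        cost = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
        best = min(best, cost)
    return best

print(tsp_brute_force([[0, 1, 2], [1, 0, 3], [2, 3, 0]]))  # -> 6
```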

In particular, by virtue of being solvable by brute-force search, there are certain

problems that we cannot prove that TSP is at least as hard as.

But maybe we can prove that, amongst all problems which can be solved using

brute-force search, TSP is the hardest one.

Now, to execute this idea, we'll of course have to define what it means for a

problem to be brute-force solvable. That motivates the complexity class

NP, the subject of the next video.