Okay, so let's move on, and actually discuss the pseudo-code for the

merge sort algorithm. First, let me just tell you the pseudo-code, leaving aside

exactly how the merging subroutine is implemented. And this, at a high level, should

be very simple and clear at this point. So there's gonna be two recursive calls, and

then there's gonna be a merging step. Now, I owe you a few comments, 'cause I'm being

a little sloppy. Again, as I promised, this isn't something you would directly

translate into code, although it's pretty close. But so what are a couple of the

ways that I'm being sloppy? Well, first of all, you know, in

any recursive algorithm, you gotta have some base cases. You gotta have this idea

that when the input's sufficiently small you don't do any recursion, you just

return some trivial answer. So in the sorting problem the base case would be if

you're handed an array that has either zero or one elements, well, it's already sorted,

there's nothing to do, so you just return it without any recursion. Okay, so to be

clear, I haven't written down the base cases. Although of course you would if you were

actually implementing merge sort. So make a note of that. A couple of

other things I'm ignoring. I'm ignoring what to do if the array has odd

length, so if it has, say, nine elements, obviously you have to somehow break that

into five and four or four and five, so you would do that in either way and

that would be fine. And then secondly, I'm ignoring the details of what it really

means to sort of recursively sort, so for example, I'm not discussing exactly how

you would pass these subarrays onto the recursive calls. That's something that

would really depend somewhat on the programming language, and that's

exactly what I want to avoid. I really want to talk about the concepts which

transcend any particular programming language implementation. So that's why I'm

going to describe algorithms at this level, okay. Alright, so the hard part, relatively

speaking, that is: how do you implement the merge step? The recursive calls have

done their work. We have these two sorted, separated halves of the numbers, the left half

and the right half. How do we combine them into one? And in English, I already told

you on the last slide. The idea is you just populate the output array in a sorted

order by traversing pointers, or just traversing through the two sorted

sub-arrays in parallel. So let's look at that in some more detail. Okay, so here is

the pseudo-code for the merge step. So let me begin by introducing

some names for the characters in what we're about to discuss. So let's use

C to denote the output array. So this is what we're supposed to spit out with the

numbers in sorted order. And then, I'm gonna use a and b to denote the results of

the two recursive calls, okay? So, the first recursive call has given us array a,

which contains the left half of the input array in sorted order. Similarly, b

contains the right half of the input array, again, in sorted order. So, as I

said, we're gonna need to traverse the two, sorted sub-arrays, a and b, in

parallel. So, I'm gonna introduce a counter, i, to traverse through a, j to

traverse through b. i and j will both be initialized to one, to be at the beginning

of their respective arrays. And now we're going to do a single pass

over the output array, populating it in increasing order. Always taking the

smallest from the union of the two sorted sub-arrays. And if there's one

idea in this merge step, it's just the realization that the minimum element that

you haven't yet looked at in A and B has to be at the front of one of the two lists

right? So for example, at the very beginning of the algorithm, where is the minimum

element overall? Well, whichever of the two arrays it lands in, A or B, it has to be

the smallest one there, okay? So the smallest element overall is either the

smallest element in A or it's the smallest element in B. So you just check both places,

the smaller one is the smallest, you copy it over, and you repeat. That's it. So the

purpose of k is just to traverse the output array from left to right. That's

the order we're gonna populate it in. We're currently looking at position i in the

first array and position j in the second array. So that's how far we've gotten, how

deeply we've probed into both of those two arrays. We look at which one has the

current smallest, and we copy the smallest one over. Okay? So if the entry

in the i-th position of A is smaller, we copy that one over. Of course, we have to

increment i. We probe one deeper into the list A, and symmetrically for the case

where the current position in B has the smaller element. Now again, I'm being a

little bit sloppy, so that we can focus on the forest and not get

bogged down with the trees. I'm ignoring some end cases, so if you really wanted to

implement this, you'd have to add a little bit to keep track of when you fall off

either A or B. So you'd have additional checks for when i or j reaches

the end of the array, at which point you copy over all the remaining elements into

C. Alright, so I'm gonna give you a cleaned-up version of that pseudo-code

so that you don't have to tolerate my questionable handwriting any longer than

is absolutely necessary. This again, is just the same thing that we wrote on the

last slide, okay? The pseudo-code for the merge step. Now, so that's the Merge Sort

algorithm. Now let's get to the meaty part of this lecture, which is, okay, so merge

sort produces a sorted array. What makes it, if anything, better than much simpler

non-divide-and-conquer algorithms, like, say, insertion sort? In other words, what is

the running time of the merge sort algorithm? Now I'm not gonna give you a

completely precise definition of what I mean by running time, and there's

good reason for that, as we'll discuss shortly. But intuitively, you should think

of the running time of an algorithm like this: imagine that you're just running

the algorithm in a debugger. Then, every time you press enter, you advance by one

line of the program through the debugger. And then basically, the running time is

just the number of operations executed, the number of lines of code executed. So the

question is, how many times do you have to hit enter in the debugger before the

program finally terminates? So we're interested in how many such lines of code

get executed for Merge Sort when an input array has n numbers. Okay, so

that's a fairly complicated question. So let's start with a more modest goal.
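
Before we count anything, here's a minimal Python sketch of the algorithm as described so far, with the details I skipped filled in one possible way. The function names `merge` and `merge_sort` are just illustrative choices, the counters start at zero rather than one because that's how Python lists are indexed, and passing the subarrays as list slices is an assumption of this sketch, not something the pseudo-code commits to.

```python
def merge(a, b):
    """Merge two already-sorted lists a and b into one sorted list c."""
    c = []
    i, j = 0, 0  # 0-based here; the lecture's pseudo-code starts at 1
    for _ in range(len(a) + len(b)):
        # End cases: if b is used up, take from a; if a is used up, the
        # first test fails and we fall through to take from b.
        if j == len(b) or (i < len(a) and a[i] <= b[j]):
            c.append(a[i])
            i += 1
        else:
            c.append(b[j])
            j += 1
    return c


def merge_sort(xs):
    """Return a sorted copy of xs."""
    # Base case: zero or one elements is already sorted.
    if len(xs) <= 1:
        return list(xs)
    # Odd lengths: the left half simply gets the smaller piece.
    mid = len(xs) // 2
    return merge(merge_sort(xs[:mid]), merge_sort(xs[mid:]))


print(merge_sort([5, 4, 1, 8, 7, 2, 6, 3, 9]))
```

Ties go to the left array in this sketch, which is one reasonable choice; either way the output comes out sorted.
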

Rather than thinking about the number of operations executed by Merge Sort, which

is this crazy recursive algorithm, which is calling itself over and over and over

again. Let's just think about how many operations are gonna get executed when we

do a single merge of two sorted sub arrays. That seems like it should be an

easier place to start. So let me remind you of the pseudo-code of the merge

subroutine, here it is. So let's just go and count up how many operations

are gonna get used. So there's the initialization step. So let's say that

I'm gonna charge us one operation for each of these two initializations. So let's

call this two operations, just set i equal to one and j equal to one. Then we have this for

loop, which executes a total of N times. So in each of these N iterations of this

for loop, how many instructions get executed? Well, we have one here, we have a

comparison, so we compare A(i) to B(j), and either way the comparison comes out, we then

do two more operations. We do an assignment, here or here. And then we do

an increment of the relevant variable, either here or here. So that's gonna be

three operations per iteration. And then maybe I'll also say that in order to

increment k we're gonna call it a fourth operation. Okay? So for each of these N

iterations of the for loop we're gonna do four operations. All right? So putting it

all together, what do we have for the running time of merge? So let's see the

upshot. So the upshot is that the running time of the merge subroutine, given an

array of M numbers, is at most four M plus two. So a couple of comments. First of

all, I've changed a letter on you so don't get confused. In the previous slide we

were thinking about an input size of N. Here, see, I've just changed

the name of the variable to M. That's gonna be convenient once we think about

merge sort, which is recursing on smaller sub-problems. But it's exactly the same

thing. So merging an array of M entries takes at most 4M plus two

lines of code. The second thing is, there's some ambiguity in exactly how we

counted lines of code on the previous slide. So maybe you might argue that, you

know, really, each loop iteration should count as two operations, not just

one. 'Cause you don't just have to increment k, but you also have to compare

it to the upper bound of N. Eh, maybe. That would have been 5M+2 instead of 4M+2. So

it turns out these small differences in how you count up the number of lines of

code executed are not gonna matter, and we'll see why shortly. So, amongst

friends, let's just agree, let's call it 4M plus two operations for merge to

execute on an array of exactly M entries. So, let me abuse our friendship now a little

bit further with an inequality which is true, but extremely sloppy. But I promise

it'll make our lives just easier in some future calculations. So rather than 4M+2,

'cause the 2's sorta getting on my nerves, let's just call this at most 6M,

because M is at least one. Okay, you have to admit it's true, 6M is at

least 4M plus two. It's very sloppy, these numbers are not anything close to each

other for M large but, let's just go ahead and be sloppy in the interest of future

simplicity. Okay. Now I don't expect anyone to be impressed with this rather

crude upper bound on the number of lines of code that the merge subroutine needs to

finish, to execute. The key question you recall was how many lines of code does

merge sort require to correctly sort the input array, not just this subroutine. And

in fact, analyzing Merge Sort seems a lot more intimidating, because it keeps

spawning off these recursive versions of itself. So the number of recursive calls,

the number of things we have to analyze, is blowing up exponentially as we think

about various levels of the recursion. Now, if there's one thing we have going

for us, it's that every time we make a recursive call, it's on quite a bit

smaller input than what we started with, it's on an array only half the size of the

input array. So there's some kind of tension between, on the one hand, an explosion

of sub problems, a proliferation of sub problems, and the fact that successive

recursive calls only have to solve smaller and smaller subproblems. And

resolving these two forces is what's going to drive our analysis of Merge Sort. So,

the good news is I'll be able to show you a complete analysis of exactly how

many lines of code Merge Sort takes. And I'll be able to give you, in fact, a

very precise upper bound. And so here's gonna be the claim that we're gonna prove

in the remainder of this lecture. So the claim is that Merge Sort never needs

more than six times N times the logarithm of N, log base two if you're keeping track,

plus an extra six N operations to correctly sort an input array of N

numbers. Okay, so let's discuss for a second: is this good, is this a win, knowing that

this is an upper bound on the number of lines of code that Merge Sort takes? Well, yes, it

is, and it shows the benefits of the divide and conquer paradigm. Recall, in the

simpler sorting methods that we briefly discussed like insertion sort, selection

sort, and bubble sort, I claimed that their performance was governed by the

quadratic function of the input size. That is, they need a constant times N

squared number of operations to sort an input array of length N. Merge sort, by

contrast needs at most a constant times N times log N, not N squared but N times

log N lines of code to correctly sort an input array. So to get a feel for what

kind of win this is let me just remind you for those of you who are rusty, or for

whatever reason have lived in fear of a logarithm, just exactly what the logarithm

is. Okay? So. The way to think about the logarithm is as follows. So you have the X

axis, where you have N, which is going from one up to infinity. And for

comparison let's think about just the identity function, okay? So, the function

which is just F(n)=n. Okay, and let's contrast this with a logarithm. So

what is the logarithm? Well, for our purposes, we can just think of a logarithm

as follows, okay? So the log of n, log base 2 of n is, you type the number N

into your calculator, okay? Then you hit divide by two. And then you keep repeating

dividing by two and you count how many times you divide by two until you get

down to one or below, okay. So if you plug in 32, you've got to divide five

times by two to get down to one. Log base two of 32 is five. You put in 1024, you have to

divide by two, ten times till you get down to one. So log base two of 1024 is ten and

so on, okay. So the point is, you already see this: if the log of 1000 is roughly

something like ten, then the logarithm is much, much smaller than the input.
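
If you'd rather see that on a computer than on a calculator, here's a small Python sketch of the repeated-halving idea. The name `halvings` is made up for illustration, and it counts divisions until the number is one or less, which matches the examples above.

```python
def halvings(n):
    """Count how many times n is divided by two before it is one or less."""
    count = 0
    while n > 1:
        n /= 2
        count += 1
    return count


for n in (32, 1024, 1000):
    print(n, halvings(n))
```

For exact powers of two this count is exactly log base two of n; for other inputs, like 1000, it rounds up to the next whole number.
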

So graphically, what the logarithm is going to look like is a

curve that becomes very flat very quickly as N grows large, okay? So F(n) being log

base 2 of n. And I encourage you to do this, perhaps a little bit more

precisely on the computer or a graphing calculator, at home. But log is

growing much, much, much slower than the identity function. And as a result, a

sorting algorithm which runs in time proportional to n times log n is much,

much faster, especially as n grows large, than a sorting algorithm with a

running time that's a constant times n squared.
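
To get a rough feel for the numbers, here's a Python sketch that tallies operations the way we charged them for merge earlier, two for initialization plus four per loop iteration, so 4m + 2 for a merge producing m entries. It then puts that total next to the claimed bound of 6n log n + 6n and next to n squared. The function name `merge_sort_ops` is hypothetical, and charging nothing for the base cases is a simplifying assumption of this sketch, not part of the analysis.

```python
import math


def merge_sort_ops(n):
    """Total operations charged to all merges when sorting n numbers."""
    if n <= 1:
        return 0  # base case: nothing to merge (assumed free here)
    left = n // 2
    right = n - left
    # Two recursive calls, then one merge producing n entries.
    return merge_sort_ops(left) + merge_sort_ops(right) + (4 * n + 2)


for n in (8, 1024, 1_000_000):
    bound = 6 * n * math.log2(n) + 6 * n
    print(n, merge_sort_ops(n), round(bound), n * n)
```

The last column, n squared without any constant in front, pulls away from the other two very quickly as n grows, which is the whole point of the comparison.
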