[MUSIC] In the last few lessons, we looked at two common algorithms: finding the largest item in a list, and searching for an item in a collection using what's known as linear search. Both of these algorithms have linear complexity. To find the largest item in a list, you have to look at every item. And for search, in the worst case, the item is not in the list and you need to look at every item to determine that it is not there. A key point in the search algorithm that we discussed is that we assumed the items were unordered. But what if the items were ordered? Here's the list of scheduled meeting times that we saw in a previous lesson, where we were trying to figure out whether it contained a particular time slot. Say, Monday at 11am. Notice that the time slots are not ordered, so we have to look at every item in the list to determine that Monday at 11am is not there. Now suppose that we order these time slots, as shown here, and that we are again looking to see if Monday at 11am is available. The target is bigger than Monday at 10am but smaller than Monday at 12pm, so we can stop, knowing that the target cannot be in the rest of the list, since those time slots all come after Monday at 12pm. Using order, we're able to stop early when the target is not in the list. Remember that before, we had to look at every item in the list to know that the target was not there. So it seems like, using order, we're able to do a lot better. But recall that we're interested in the complexity of the algorithm, which is defined in terms of the worst-case behavior. What is the worst case here? It's when what we are looking for is greater than or equal to the largest item in the list, which, here, is Friday at 3pm. So if our target is Friday at 4pm, we have to compare against every item in the list to determine that it is not there. So the complexity of the algorithm is still linear. Can we do better? 
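The early-stopping idea can be sketched in a few lines of Python. This is a minimal illustration, not code from the lesson: the meeting times are replaced with sorted numbers, and the function name is my own.

```python
def linear_search_sorted(items, target):
    """Linear search over a list sorted in ascending order.
    Returns (found, comparisons), stopping early when possible."""
    comparisons = 0
    for item in items:
        comparisons += 1
        if item == target:
            return True, comparisons   # found the target
        if item > target:
            # The list is sorted, so the target cannot appear later.
            return False, comparisons
    # Worst case: the target is >= every item, so we looked at all of them.
    return False, comparisons
```

For example, searching `[10, 20, 30, 40, 50]` for 25 stops after three comparisons, but searching for 60, which is larger than everything, still takes all five. That is why the worst case remains linear.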
Keep in mind that we can only tell the computer to look at one item at a time, and that we're trying to search using as few comparisons as possible. Any algorithm that we use must determine where to look first and then where to look next. We can tell the computer to look at the first item, or the last item, or any other item, using its position in the list. In the case of linear search, we told it to start at the first item and then to go to the next. Suppose that we were to start elsewhere. Say, the middle of the list. Since the scheduling problem is really that of searching for a value in a list, I'm going to shift the example to a list of numbers, as shown on this slide. It's more compact, and it's simpler to understand what something like "4 is less than 7" means. Note that this list of numbers is again sorted, from smallest to largest, and that there are 15 numbers. Now suppose that we're searching for 42, and that instead of starting with the first item, we start in the middle of the list. The middle item, shown in green, is 31, which is not what we want. However, the target, 42, is bigger than 31, and the list is sorted. So what do we know? We know that the target cannot be in the first half of the list. We can eliminate the first half of the list, shown here by graying the numbers out, and we have just saved a lot of comparisons. Let's repeat this process. The middle item of the remaining list is 49. Now the target, 42, is smaller than 49. What do we know now? Again, since the list is sorted, we know that 42 cannot be in the last half of the list, and we can eliminate it. The middle of the remaining list is now 38. Since the target is bigger, we eliminate the first half of the remaining list. The middle item of the list is now 42. The list has a single element, which is what we were looking for. Success! Note that this only works because the list is sorted. 
We call this algorithm binary search, since it eliminates half the items in the remaining list at each iteration. It is a little more complicated than the algorithms we've seen so far. We start by comparing the value in the middle of the list to the target. There are three options: the middle value is either equal to, greater than, or less than the target. We consider each case. If they are equal, then we have found the target and can stop looking. If the middle is greater than the target, then we can remove the middle item and all items that are larger than it, that is, the second half of the list. If the middle item is less than the target, then we can remove the middle item and all items that are smaller than it, that is, the first half of the list. Now we have a list that is smaller by half, and we repeat with the reduced list. If we remove all the items in the list, then we know that the target is not there. We can also describe binary search using a flowchart. We start by taking the input: the list and the target element. If the list is empty, then we can stop and output false; the element is not in the list. Otherwise, we look at the middle item in the list and compare it against the target. If it is the target, then we can stop and output true; we have found the element. Otherwise, if mid is less than the target, then we eliminate mid and the first half of the list. If mid is greater than the target, then we eliminate mid and the second half of the list. We repeat this process until the list is empty. So what is the algorithmic complexity of binary search? To understand it, let's count the number of comparisons using our first example, where the target is 42. Initially, the number of comparisons is 0. We will go through the steps just described, incrementing the number of comparisons as we go. Compare and increment, compare and increment, compare and increment, compare and increment. 
Now we have found the target and can stop. The total number of comparisons is 4, and we seem to have saved a lot of comparisons. Observe that if we had started at the beginning of the list and done a linear search, the number of comparisons would have been 11. But for complexity, we are interested in the worst-case behavior, which is when the item is not in the list. How many comparisons would binary search take in this case? For example, suppose we were searching for 28; you can see by eyeing the list that 28 is not there. We'll step through the example, incrementing the number of comparisons as we go. The middle item, 31, is larger than our target, so we eliminate the last half of the list. The middle item of the remaining list, 12, is smaller than our target, so we eliminate the first half of the list. The middle item, 22, is smaller than our target, so we eliminate the first half of the list. The middle item, 25, is smaller than our target, so we eliminate it and are left with an empty list. So 28 is not in the list. The total number of comparisons is, again, 4. Observe that, using linear search, in the worst case we would compare against every item in the list, which would mean 15 comparisons. So what's the complexity of binary search? We saw that in the worst case it is better than linear search, because we don't need to compare our target to every item in the list. But how can we describe its complexity? The way I like to think about it is to consider what would happen, in the worst case, if we doubled the number of items in the list. So here, we have a list with twice as many items: there are 30. For clarity, we've kept the original list in the first gray row; in the second gray row is the new list. The worst case is searching for an item that's not there, for example, 28. In this new list, the middle item is now 67. We compare it with our target of 28. 
And since the target is less than 67, we eliminate the second half of the list, i.e., the entire second gray row disappears in one comparison, and we are left to search for the target, 28, in the original list. Now remember that it took four comparisons to determine that the target was not in the original list. So the total number of comparisons is 1 plus 4, or 5. In this graph, we show the difference in behavior between linear search, shown here in red, and binary search, shown here in orange. The x-axis is the input size, and the y-axis is the number of comparisons. In linear search, if you double the size of the input, say from 2 to 4, the number of comparisons doubles. But with binary search, after each comparison you eliminate half the list, so if you double the size of the input, say from 2 to 4, the number of comparisons only grows by 1. This is what is known as logarithmic complexity: the number of steps grows as the logarithm of the input size. Now, if you don't remember logarithms, that's okay; just look at the relative growth of these lines and see that the logarithm grows much more slowly than the linear function. In this lesson, we talked about another search algorithm, called binary search. It is an improvement over linear search, but it can only be used when the elements are ordered. Note that it's possible that in some cases linear search will perform fewer comparisons than binary search, for instance, if the element you're looking for happens to be the first in the list. But binary search is better, since we care about worst-case performance. Binary search has logarithmic complexity, whereas linear search has linear complexity. Thus, in the worst case, adding more elements doesn't increase the number of comparisons as much in binary search as in linear search.
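The doubling argument can be checked with a short sketch. For a sorted list of n items, binary search needs at most floor(log2 n) + 1 looks in the worst case (one look per halving), which in Python is conveniently `n.bit_length()`; the function name below is my own.

```python
def worst_case_comparisons(n):
    """Worst-case comparison counts when searching n sorted items."""
    linear = n                  # linear search may look at every item
    binary = n.bit_length()     # equals floor(log2 n) + 1 for n >= 1
    return linear, binary

# Each doubling of n doubles the linear count but adds only 1 to binary.
for n in (15, 30, 60, 120):
    linear, binary = worst_case_comparisons(n)
    print(f"n={n}: linear={linear}, binary={binary}")
```

This reproduces the lesson's numbers: 4 comparisons for the 15-item list, and 5 for the 30-item list.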