0:16

Well, one thing we could do is maintain a linked list.

We could keep it in order or keep it unordered.

This version keeps it unordered.

So we're going to have nodes in the linked list that have key value pairs.

They have every key in the symbol table and a value associated with that key.

For search, we have to, since it's unordered,

scan through the whole list to find a match, a key that's there.

For insert, I would also have to scan through all keys to find

the place to update a value, if it's a value that's already there.

And if there's no match, we could add it to the front.

So here's our simple client for traces,

so if we associate S with zero, we just had it.

That's our one node linked list that's got that information.

Associate E with 1, that's not there, so we just add it to the beginning

of list A with 2 R, with 3C, with 4 H, with 5 and so forth.

So now when we associate E with 6,

we have to search through the list to see if there's an E.

1:35

It's possible to implement symbol tables that allow multiple values with

the same key and so forth.

And that leads to different types of clients,

different types of implementations.

We're going to stick to this associative array abstraction and

no duplicate keys in the symbol table,

because it both simplifies implementations and leads to simpler client code.

1:58

Okay, X7 is a new value.

A8, we found A in there and update the value 8.

And then, M9 and P10, L11 are all not there and they go at the beginning.

And then the last one changes the value at E, again, 12.

So implementing this is a simple example

of linked list processing, a slight modification of our stack and queue code.

2:49

And if everything's random,

then on average you only have to look halfway through for a successful search.

Well, you still have to insert.

Another issue is, for many clients, if the keys are ordered,

it's nice to be able to iterate through the symbol table and order.

And this one, by definition, doesn't provide that.

And this one just uses equals, so the keys don't have to be comparable for this.

It just uses equals.

So our challenge is to look for methods that

give us more efficient implementations for these search and insert operations.

And we've already looked at an algorithm that can do this, and

that's binary search.

3:58

And then find the index associated with the key that we're searching for

using binary search.

And then use that index to get the value that's associated with that key,

that's stored in a parallel array.

And we looked at the binary search algorithm earlier in the course.

And so, for example, if these are the keys in our symbol table and

we're doing a search for the index where P is stored, we look at the middle.

P is bigger than L, so we look to the right.

Look in the middle of the right half.

P is less than R, so we look to the left.

Continue until we find P.

When we find P, we return its index and

we use that index to get us the value that we need.

Or another way to look at this is it implements the function,

how many keys are there that are less than K.

So, for example, for Q, that's an unsuccessful search.

You can figure out from the last index

when you don't find your element that you're seeking.

You can figure out the return value which is the number of

keys that are less than it.

So that's a trace of implementing binary search to find the rank of a key in

ordered array.

And again, for successful, you can use that rank to return the value.

And if it's unsuccessful,

you can use that rank to figure out where to insert the new key.

[COUGH] All right, so this is the code for

the get operation and this rank which is binary search.

So this is precisely the binary search code that we looked at before.

So let's look at the gap.

So if the whole table is empty, return null.

Otherwise, we call rank and

that gives us a number of keys less than the current key.

And so, that is where we look to check to see if that key is there.

If it's there, then we return the value with the same index and parallel array.

6:20

Now, the problem with binary search is, well, not necessarily the problem,

but the situation is that when it's time to insert a new element,

we have to move everything larger over one position, just like an insertion sort.

So if the table has A, E, R, and S, and we have to insert the value C,

then we have to move the E, R, and S over one position to put the C.

And then put the value associated with C.

You'll have to do the same thing in the vals array.

Move all the values associated with those keys over in one position and

put the associated value in.

So this is a trace of what would happen for our trace.

And again, every insertion involves

making a new position by moving all the larger keys over one position.

Do the same thing in the vals array.

[COUGH] And if it's changing the value

associated with a key that's already there, then it's just a matter of

finding where the key is and changing the value at that index.

So from that trace,

it's pretty easy to see what's involved for the code.

And we'll skip that code and

just take a look at the comparison between this elementary implementation for

symbol tables with the sequential search in an unordered list.

So, one thing is we're using a different key interface.

We're taking advantage of the fact that the keys are comparable

to give us an efficient search.

We can do search in worst case, in average case, in time proportional to log N.

That's what binary search provides for us.

And this is a fine data structure for symbol tables where there is,

[COUGH] that are relatively static,

where the values don't change much, and most of the operations are search.

It's hard to beat binary search.

On the other hand, in a dynamic situation where there are a lot of inserts,

this method is going to be problematic, because the cost of its insert is linear.

It's proportional to N over 2.

If you have a huge number of operations and

every one of them is proportional to the symbol table size,

then you're just not going to be able to support huge numbers of keys.

What we want is sufficient implementations of both search and insert.