Now that parent's a legal and there's nothing more to be done.
We, have a legal three tree, 2-3 tree. Insert X,
It's bigger than R, Goes to the right.
There's a 2-node, there's room for the X. Insert P,
That goes between E and R. The 2-node containing H,
Right link is null, So we convert that 2-node into a 3-node
and now we have a legal 2-3 tree. Now, you, you can see this next insertion
is going to cause some splitting wherever it is.
So insert L, That's between E and R.
So it goes in the, 3-node containing H and P and we convert that into a temporary
4-node. Split that 4-node, moving L to the parent.
Now that parents of 4-node and that has to be split,
And we create a new root node. And then the height of the tree grows by
one. And that's a legal 2-3 tree, so we stop.
So, those local transformations, converting a 2-node to a 3-node or
converting a three to a four, and then splitting and passing a node up.
Those are the, only operations we need to consider to get balance in our search
trees. Alright.
So as I've mentioned and this diagram shows, the splitting of 4-node and a 2-3
tree is a local transformation. It only involves changing a constant
number of links. So, in this example, it shows the general
situation, when the 4-node to be split is the middle length,
But the same is true if it's a left or right.
And those six subtrees drawn could be huge.
They could contain millions of keys, But it doesn't matter what they contain.
We don't touch them at all, Nor do we touch anything above this node
in the tree until the split happens. So the transformation that splits that b,
c, d, node and inserts the C into the 3-node at the root, just involves, making
that 3-node into a temporary 4-node. And making that, 4-node into two 2-nodes
and adjusting the lengths appropriately. Just a constant number of operations and
that's why, this operation, is, in general, efficient. so let's look at the
just the global properties that these manipulations preserve.
The two things that are critical is that the, in a, in a 2-3 tree, we always have
symmetric order. That is the word that we defined, for
2-nodes and 3-nodes, and we also have the perfect balance.
The distance from the root to the bottom is always the same.
And to prove that, we just need to show that each transformation maintains
symmetric order and perfect balance, and these are all the possible transformations
that we could do. If we split the root, then, that's the,
what happens at the root, and if there was perfect balance before, there's perfect
balance after, with the height one bigger. If the parent was a 2-node then the
transformation is a local transformation and if you look at where the links are,
then it's easy to see by induction that if there was perfect balance before there's
perfect balance afterward, Because we didn't change anything about
the perfect balance in any of those subtrees.
And that's true in every case. If the 3-nodes at the right and this one
is one higher and those four are one lower and afterwards it's the same.
If there was perfect balance before there's perfect balance afterwards,
because we didn't change the height of any nodes.
We just moved things around locally within nodes.
And this is when this parent is a 3-node, then there's the tree cases,
If we split up the last split at the middle and split at the right,
And again, changing the four node to, to a 2-nodes and adding links.
If there was perfect balance before, there's perfect balance after, because we
didn't change the heights of anything else in the tree.
So our operations maintain symmetric order and perfect balance in a 2-3 tree.
So, that's going to give us, a very easy way to describe a performance.
The call, or operations have costs that's proportional to the path link from the,
height to the bottom, and every path from the root to a null link has the same
length. How long can those paths be?
Well, it's not hard to see that the, in the worst case, if they're all 2-nodes,
that's the longest they can be is log base two of N.
Now, and if they're all 3-nodes, it would be log base three of N, which is less,
It's about 0.63 log base two of N. So, all the paths in a 2-3 tree with N
nodes have to have length between those two bounds and those are pretty small
numbers. For a million nodes, that's between twelve
and twenty. And, if, if it's a billion nodes, that's
between eighteen and 30. Those are remarkably small numbers, so
we're going to have guaranteed performance, even for huge databases,
We're going to be able to guarantee that we can get search and insert them with
just eighteen to 30 operations and it's quite remarkable, really.
So, here's what our table, will look like, when we finish the implementation of 2-3
trees. Every operation is guaranteed to be a
constant times log n. Now, the constant depends on the
implementation, exactly what kind of manipulations we need to do to convert,