We just learned that Boyer-Moore preprocesses the pattern P in order to
build lookup tables that help it use the bad character and good suffix rules.
Let's focus on this idea of preprocessing in some more detail because
preprocessing is an idea that's going to send us down a somewhat
different path in the coming lectures.
So, the naive exact matching algorithm did not use preprocessing.
Right? If you give me a pattern P and a text T,
I can run the exact matching algorithm, but if you give me a pattern P today but
no text, you say I'm going to give you the text tomorrow, there's nothing for
me to do until you eventually do give me the text.
With Boyer-Moore on the other hand, we can preprocess the pattern P.
Boyer-Moore is an algorithm that preprocesses the pattern P,
so if I'm Boyer-Moore and
you give me the pattern P but not the text T, there's something I can do.
I can build the lookup tables for the bad character and
the good suffix rules, then later on, when you give me the text T,
I can use those lookup tables to execute Boyer-Moore.
In fact, if we're in a situation where we're going to be matching the same
pattern against many different texts, so the P is going to stay the same but
the T is going to change, then I can reuse my preprocessing for the pattern.
I can reuse the lookup tables for the bad character and
the good suffix rules over and over again for each new text that you give me.
I don't have to re-preprocess the pattern each time.
So the cost of preprocessing is something that can be amortized over many problems.
It might be costly to do once, but because we use it again and
again, we essentially make up for that cost over time.