Hello, and welcome to this lesson on Naive Bayes.

Naive Bayes is a classification algorithm that applies

Bayes' theorem to perform classification.

So this week, in this lesson,

I want you to understand what the Naive Bayes algorithm is.

You should be able to articulate how it can be used to perform classification and

understand and discuss some of the nuances of the algorithm, and we'll go through those.

And also, you need to be able to apply

this particular algorithm by using the scikit-learn library.

There are two main activities for this particular lesson.

First, you can read a nice discussion of the Naive Bayes algorithm by Jake VanderPlas.

You can read up to the Example: Classifying Text section, which itself is optional.

And then of course there is our course notebook, where I introduce the Naive Bayes algorithm,

provide a little bit of background and then demonstrate how it can be

used on a simple data set and a more complex data set.

So first, I just wanted to quickly show you the reading from Jake VanderPlas

on Naive Bayes classification.

This talks a little bit about this algorithm.

One of the reasons that people like to use it is it gives reasonable performance.

It's very fast, it's very simple, and it's easy to

understand what it's actually doing, and that makes it

a great algorithm to try out as one of the first things you

do when you're approaching a classification problem,

particularly for data sets that

demonstrate some sort of independence among the features.

So, it talks here about Bayes' theorem and classification,

how it can be applied,

different forms of the Naive Bayes algorithm, etc.

Now, I want to switch over to our notebook.

In our notebook, we introduce the Naive Bayes classifier.

The main things that we do are introduce some of

the basic concepts that underlie this algorithm and how it can be applied,

demonstrate how it can be used for classification, and

demonstrate it on a more complex data set, the German credit data.

So first, we look at the formalism,

and to understand the Naive Bayes algorithm,

you have to understand conditional probability and how we can calculate that.

So, one way to look at things is to ask, well,

what's the probability that we have the features we

observed for a particular instance, given its classification?

And in doing so, we can realize that, if we assume the features are independent,

that's actually the product of the individual feature probabilities given that classification.

Now, we can invert this with Bayes' theorem to ask, well,

what's the probability of a classification given

the features? That's expressed in terms of the probability of the classification,

which is the prior probability,

times the probability of the features given the classification,

which is the likelihood we want to understand,

all normalized by the probability of the features over all the data, the evidence.
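In symbols, the inversion just described is Bayes' theorem combined with the naive independence assumption; a sketch in LaTeX (with $C$ the class and $f_1,\dots,f_n$ the features):

```latex
% Posterior = prior * likelihood / evidence
P(C \mid f_1, \ldots, f_n) = \frac{P(C)\, P(f_1, \ldots, f_n \mid C)}{P(f_1, \ldots, f_n)}

% The "naive" assumption: features are independent given the class,
% so the likelihood factors into a product of per-feature terms.
P(f_1, \ldots, f_n \mid C) = \prod_{i=1}^{n} P(f_i \mid C)
```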

So, this section walks through calculating conditional probabilities, what they mean.

We use in this case the tips data set.

We create a pivot table so that we can

compute the ratio of a given cell to the total along a column or a row.

These are the conditional probabilities,

and so this example actually walks through doing that in specific cases.

So, what's the probability that it's lunch, given that it's Friday?

That relates the joint probability of it

being lunch and Friday to the probability of it being Friday.

And then we can compute that conditional probability.
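As a minimal sketch of that calculation, here is a pivot-table style computation of P(lunch | Friday) in pandas. The counts below are made up for illustration; they are a toy stand-in for the tips data set, not the notebook's actual data:

```python
import pandas as pd

# Toy stand-in for the tips data set: one row per meal,
# with its day and time (these counts are invented for illustration).
df = pd.DataFrame({
    "day":  ["Fri", "Fri", "Fri", "Fri", "Sat", "Sat", "Sun"],
    "time": ["Lunch", "Lunch", "Lunch", "Dinner", "Dinner", "Dinner", "Dinner"],
})

# Cross-tabulate day against time to get cell counts,
# analogous to the pivot table in the notebook.
counts = pd.crosstab(df["day"], df["time"])

# Conditional probability P(Lunch | Fri):
# the Friday/Lunch cell divided by the Friday row total.
p_lunch_given_fri = counts.loc["Fri", "Lunch"] / counts.loc["Fri"].sum()
print(p_lunch_given_fri)  # 3 of the 4 Friday meals are lunch -> 0.75
```

The same row or column normalization of the pivot table gives any of the other conditional probabilities.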

We can then use that in the Naive Bayes algorithm to actually

compute what's the classification given the features that we see.

So, we walk through that using

an example of conditional probabilities in a binomial process.

We talk about how this all works and then we

actually get into the Naive Bayes classification algorithm.

In this case, there are three different ways the algorithm can be implemented,

depending on how the features are actually distributed.

If we think the features follow a Bernoulli distribution, that is, they are binary,

we use Bernoulli Naive Bayes.

For a multinomial distribution, we would use

Multinomial Naive Bayes, and

a normal distribution would of course call for Gaussian Naive Bayes.
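The three variants correspond to three scikit-learn classes; a minimal sketch (the toy arrays here are made up purely to show which feature type fits which variant):

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

y = np.array([0, 1, 0, 1])  # toy class labels

# Binary features (e.g. word present / absent) -> Bernoulli Naive Bayes
X_bin = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]])
BernoulliNB().fit(X_bin, y)

# Count features (e.g. word counts) -> Multinomial Naive Bayes
X_cnt = np.array([[2, 0, 3], [0, 4, 1], [3, 1, 0], [0, 1, 2]])
MultinomialNB().fit(X_cnt, y)

# Continuous features (e.g. measurements) -> Gaussian Naive Bayes
X_real = np.array([[5.1, 3.5], [4.9, 3.0], [6.2, 2.8], [5.9, 3.2]])
GaussianNB().fit(X_real, y)
```

All three share the same fit/predict interface, so switching between them is a one-line change.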

The rest of this notebook just walks through this, demonstrating first on

the Iris data and then moving on to the German credit data.

You've seen many of these things before.

So I don't want to go through them in too much detail.

Basically, it just shows that it's very easy to apply Gaussian Naive Bayes, or any of

the Naive Bayes algorithms, and that they achieve

rather high classification accuracy very quickly and easily.
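A minimal sketch of that kind of application, fitting Gaussian Naive Bayes to the Iris data with scikit-learn (this is an illustrative sketch, not the notebook's exact code; the split parameters are my own choices):

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Load the Iris data and hold out a test set.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit Gaussian Naive Bayes and score it on the held-out data.
model = GaussianNB().fit(X_train, y_train)
acc = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {acc:.3f}")
```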

So with that, I'll go ahead and stop this particular lesson.

If you have any questions,

let us know in the course forums and as always, good luck.