Hi everyone. In this video, we are going to go deeper into supervised learning problems, building on what a supervised learning problem looks like from the previous videos. We will look at the types of supervised learning problems and see what difficulties you might face when you work on such tasks. Let me quickly remind you of the supervised learning problems: binary classification, multi-class classification, regression, and ranking.

Binary classification is one of the most popular problems in supervised learning. This classification problem might seem very simple: if you have a data set similar to the one you see in the example, the task indeed isn't that complicated. You have entries from two different classes, and you have to create rules that help you distinguish between objects from the different classes. This example is pretty intuitive: you can see that the classes are linearly separable, so the only thing you have to do is find an appropriate line to separate the objects. Such an algorithm family is called linear, because the decision depends linearly on your features. Once the right line is found, your decision rule is pretty simple: objects above the line belong to the first class, and objects below the line belong to the second one.

This is a nice example, but in most cases a classification problem looks much more complicated. For example, if you have to solve the following problem, it will not be enough just to draw a line. Intuitively, you can guess that a linear model just isn't suitable for this data set's structure, and we should consider another algorithm family. When you work with a two- or three-dimensional data set, that is, a data set where each object is represented by a two- or three-dimensional vector, it is usually pretty simple to visualize your data and decide which algorithm family suits you best. When you work with higher-dimensional data sets, the problem becomes more interesting.
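The linear decision rule described above can be sketched in a few lines of code. This is a minimal toy illustration, not the course's model: the line `x + y = 5` and the sample points are made-up assumptions, standing in for a line a real algorithm would learn from data.

```python
# Toy linear decision rule in 2-D: classify a point by which side of the
# line w[0]*x + w[1]*y + b = 0 it falls on. The weights and bias here are
# assumed, not learned.

def linear_classify(x, y, w=(1.0, 1.0), b=-5.0):
    """Return class 1 if the point lies above the line, class 2 otherwise."""
    score = w[0] * x + w[1] * y + b
    return 1 if score > 0 else 2

print(linear_classify(4.0, 4.0))  # above the line x + y = 5 -> class 1
print(linear_classify(1.0, 1.0))  # below the line -> class 2
```

In a real linear classifier the weights `w` and bias `b` are fitted to the training data; the decision rule itself stays exactly this simple.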
There are many nice algorithm families that allow you to solve complicated classification problems. You will deal with them in the second lesson; for now, we are moving on to the multi-class classification problem.

The multi-class classification problem is the kind of problem where you have more than two classes of objects in your data set. This problem seems more complicated than binary classification, but there is good news for you: a multi-class classification problem can be reduced to binary classification ones. There are different strategies for building a multi-class classification model out of binary classifiers; let's consider the one-versus-all and one-versus-one strategies.

According to the one-versus-all strategy, you create as many binary models as there are classes in your data set. Each model is trained to distinguish one specific class from the rest. A decision rule is then applied over the results of these binary models, so the final answer depends on the whole set of models.

According to the one-versus-one strategy, you train a model for each unique pair of classes. Each model is binary, and it tells you whether your object looks more like class A or more like class B. The final decision again depends on the full set of binary models.

The next type of problem is called regression. Here, as you remember, the answer is not a class but a real number: for example, an age, an exchange rate, or the probability of something. As we discussed earlier, the choice of algorithm family is very important, and in some cases it determines the quality of your model. Let me demonstrate this with the following example. Assume we work with this data set and we need to train a model to fit the data. If you assume that the dependency is linear, you will get the following picture. The model is not so bad, but the dependency is clearly a bit more complicated than linear.
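The one-versus-all strategy can be sketched as follows. This is a hypothetical toy: the three classes, their prototype points, and the nearest-prototype scoring function are all assumptions standing in for real trained binary classifiers, but the structure is the point: one scorer per class, each answering "this class versus the rest", and the most confident scorer wins.

```python
# Toy one-versus-all sketch: one binary scorer per class; the final label is
# the class whose scorer is most confident. The prototypes are assumed data,
# and the negative-squared-distance score stands in for a trained model.

PROTOTYPES = {"a": (0.0, 0.0), "b": (5.0, 0.0), "c": (0.0, 5.0)}

def score(label, point):
    """Binary model for `label` vs the rest: higher score = more like `label`."""
    px, py = PROTOTYPES[label]
    return -((point[0] - px) ** 2 + (point[1] - py) ** 2)

def one_vs_all(point):
    # Apply every binary model and keep the most confident one.
    return max(PROTOTYPES, key=lambda label: score(label, point))

print(one_vs_all((4.5, 0.5)))  # closest to prototype "b"
print(one_vs_all((0.2, 0.1)))  # closest to prototype "a"
```

The one-versus-one strategy would instead train a scorer per pair of classes and take a vote over all pairs; the "combine many binary decisions" shape is the same.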
So our model is too simple to recover this dependency; this is an example of underfitting. We will get back to the underfitting problem later on, in the second lesson. Okay, let's try a more complicated model, for example a quadratic one. Judging by the picture, it looks better now. We cannot draw firm conclusions without proper estimation, but still, we can see that this model recovers the dependency much better. Should we keep complicating the model to get a better result? Well, let's try. If we built an even more complicated model, we could get a picture like this. This model is clearly too wavy, and it doesn't really reflect our data; this is how overfitting might look. The model overfits the training set and loses its generalization power. Overfitting will be our topic of discussion later on.

The last type of supervised problem is called ranking. Here, you should find the optimal rank for your objects. And you should remember that, in contrast to the regression problem, here the exact rank is much less important than the order of the objects. For this problem, there are also several different approaches. The first approach is called pointwise. Here, each object in the training data set has an exact rank, say a numerical rank like 1, -2, or 15, or an ordinal rank like first, second, third, and the model is optimized to predict the exact rank of each object. So in the pointwise approach, the ranking problem is approximated by a regression problem.

The next approach is called pairwise. Here, we reduce the ranking problem to a classification one, according to the following scenario. For each pair of objects, you make a binary decision: is the first object better than the second, or vice versa? For this, you can use a binary classifier, which tells you which object from the selected pair has the higher rank. Based on such comparisons, you then build the final ranking.
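The underfitting and overfitting story above can be reproduced numerically. This is a sketch under assumed conditions: the toy data is generated from a quadratic dependency plus noise, and we fit polynomials of degree 1 (too simple), 2 (about right), and 10 (too flexible) and watch the training error.

```python
# Under- vs overfitting with polynomial regression on toy data.
# The quadratic ground truth and noise level are assumptions for this demo.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 30)
y = x ** 2 + rng.normal(scale=0.5, size=x.size)  # true dependency is quadratic

for degree in (1, 2, 10):
    coeffs = np.polyfit(x, y, degree)                       # least-squares fit
    mse = np.mean((np.polyval(coeffs, x) - y) ** 2)         # training error
    print(f"degree {degree:2d}: train MSE = {mse:.3f}")
```

The degree-1 model shows a large training error (underfitting). The degree-10 model drives the training error even below the degree-2 model's, yet between the data points it waves wildly; low training error with poor generalization is exactly the overfitting picture.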
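The pairwise reduction described above can be sketched as a comparator plus a sort. This is a hypothetical illustration: the documents, their relevance values, and the `prefers_first` comparator are made up, with the comparator standing in for a trained binary classifier that judges each pair.

```python
# Toy pairwise-ranking sketch: a binary comparator decides which of two
# objects should rank higher, and sorting by those comparisons yields the
# final order. The relevance scores here are assumed data.
from functools import cmp_to_key

documents = [{"id": "d1", "relevance": 0.2},
             {"id": "d2", "relevance": 0.9},
             {"id": "d3", "relevance": 0.5}]

def prefers_first(a, b):
    """Binary decision for the pair (a, b): is a better than b?"""
    return a["relevance"] > b["relevance"]

def compare(a, b):
    return -1 if prefers_first(a, b) else 1  # better objects sort first

ranking = [d["id"] for d in sorted(documents, key=cmp_to_key(compare))]
print(ranking)  # best-to-worst order
```

Note that only the resulting order matters; the comparator never needs to predict an exact rank value, which is precisely how pairwise differs from the pointwise approach.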
And the last approach is called listwise. Here, you directly optimize the value of the ranking quality measure itself, averaged over all queries in the training data.

In this video, you have learned the problem types of supervised learning. You have learned how to solve multi-class classification problems with the help of binary classification. You have got an introduction to the underfitting and overfitting problems. And you have learned the different approaches to working with the ranking problem. Hope you enjoyed it. See you in the next video, where we will discuss the approaches and difficulties of unsupervised learning.