In this video we will learn about another visualization tool, the Boxplot and how to create one using matplotlib. So, what is a boxplot? A boxplot is a way of statistically representing the distribution of given data through 5 main dimensions. The first dimension is minimum, which is the smallest number in the sorted data. Its value can be obtained by subtracting 1.5 times the IQR where IQR is interquartile range from the first quartile. The second dimension is first quartile which is 25% of the way through the sorted data. In other words, 1/4 of the data points are less than this value. The third dimension is median, which is the median of the sorted data. The fourth dimension is third quartile, which is 75% of the way through the sorted data. In other words, 3/4 of the data points are less than this value. And the final dimension is maximum, which is the highest number in the sorted data where maximum equals third quartile summed with 1.5 multiplied by IQR. Finally, boxplots also display outliers as individual dots that occur outside the upper and lower extremes. Now let's see how we can create a boxplot with Matplotlib. Before we go over the code to do that. Let's do a quick recap of our data set. Recall that each row represents a country and contains metadata about the country, such as where it is located geographically, and whether it is developing or developed. Each row contains numerical figures of annual immigration from that country to Canada from 1980 to 2013. Now let's process the data frame so that the country name becomes the index of each row. This should make retrieving rows pertaining to specific countries a lot easier. Also, let's add an extra column which represents the cumulative sum of annual immigration from each country from 1980 to 2013. So, for Afghanistan, for example, it is 58,639 total and for Albania it is 15,699. And so on. And let's name our data frame, DF_Canada. So now that we know how our data is stored in the Dataframe DF_Canada, say we're interested in creating a boxplot to visualize immigration from Japan to Canada. As with other tools that we learned so far, we start by importing Matplotlib as MPL and the pyplot interface as PLT. Then we create a new data frame of the data pertaining to Japan and we're excluding the column total using the years variable. Then we transpose the resulting data frame to make it in the correct format to create the Boxplot. Let's name this new data frame DF_Japan. Following that, we call the plot function on DF_Japan and we set kind equals box to generate a boxplot. Then, to complete the figure, we give it a title and we label the vertical axis. Finally, we call the show function to display the figure and there you have it, a boxplot that provides a pleasing distribution of Japanese immigration to Canada from 1980 to 2013. In the lab session we explore boxplots in more detail and learn how to create multiple boxplots as well as horizontal boxplots, so make sure to complete this module's lab session. And with this we conclude our video on boxplots, see you in the next video. (Music)