As you might have noticed, tattoos are increasingly popular among football players. The so-called tattoo sleeve in particular is on the rise on the football field. A tattoo sleeve is what the name suggests: a sleeve of tattoos. Suppose you are interested in the question to what extent football players have covered their bodies with tattoos. Imagine two football teams. What you see here are dot plots representing the distribution of the variable percentage of body covered with tattoos in these two teams. The horizontal line represents this variable, and the dots stand for the 11 individuals in each team. The players of team one have covered about 10 to 20% of their bodies with tattoos. In the second team, the players differ much more from each other in terms of their tattoo density. The percentage ranges from 0 to about 30%. Thus, the players in this team differ strongly from each other. However, mode, median and mean are the same. In both distributions the mode is 14.1, and the median and mean are 15. This indicates that in order to adequately describe a distribution we need more information than the measures of central tendency. In this video, I will show you that we also need to have information about the variability, or dispersion, of the data. I will discuss two measures of variability: the range and the interquartile range. I will also discuss the so-called box plot. It's a very useful graph that gives a good indication of how the values in the distribution are spread out. The simplest measure of variability is the range. It is the difference between the highest and the lowest value. Let's look at our two teams again. The player in team one with the largest tattoo density has covered 19.3% of his body with tattoos. The player with the smallest tattoo density has covered 10.8% of his body. The range is 19.3 - 10.8 = 8.5. In team two, the player with the largest tattoo density has covered 27.7% of his body with tattoos, and the player with the smallest density, 0%.
The range is therefore 27.7 - 0 = 27.7. The range shows you at a glance that there is much more variability in team 2 than in team 1. The range is a measure of variability that is easy to understand and simple to compute. However, in many cases it doesn't give a good impression of the variability of the data. The reason is that it only takes into account the extreme values. Look at these two distributions. They have the same range, but you can see immediately that the variability in the second distribution is very different from the variability in the first graph. Another measure of variability, the interquartile range, is a better measure of dispersion because it leaves out the extreme values. It basically divides your distribution into four equal parts. So, if your distribution looks like this, you divide the scores in such a way that the 25% of your lowest scores are below this value, and the 25% of your highest scores are above this value. We also have 25% of our scores here, and 25% of our scores here. The values that now divide the distribution are called quartiles. This is the first quartile. This is the second quartile. And this is the third quartile. As you can see, the second quartile divides the distribution into two equal parts. After all, 50% of the values lie below this value and 50% lie above it. Q2 is therefore the same as the median. The interquartile range is the distance between the third and the first quartile, or in other words, IQR = Q3 - Q1. Let me show you how to compute it by going back to the tattoo density example. This is what the distribution of team 2 looked like. First, you look for the median, or in other words, Q2. That's easy. It's the middle value. That's 15. You find Q1 by looking for the middle value of the values on the left side of the median. That's here: 8.7. You find Q3 by following the same strategy on the right side of the median. That's 19.3. Now, the interquartile range is Q3 - Q1, which equals 19.3 - 8.7 = 10.6.
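The range and IQR computation just described can be sketched in a few lines of Python. This is only an illustration, not part of the video. The function names are hypothetical, and the `team_two` list is an assumption: the video names only the minimum (0), Q1 (8.7), the median (15), Q3 (19.3) and the maximum (27.7) of team 2, so the remaining six values are invented placeholders chosen to be consistent with those five numbers.

```python
from statistics import median


def value_range(values):
    """Range: the highest value minus the lowest value."""
    return max(values) - min(values)


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves approach.

    Q2 is the median of all values. Q1 is the median of the values
    to the left of Q2 and Q3 is the median of the values to the right.
    With an odd number of values, the median itself is left out of
    both halves. Needs at least two values.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


def iqr(values):
    """Interquartile range: Q3 minus Q1."""
    q1, _, q3 = quartiles(values)
    return q3 - q1


# ASSUMPTION: only 0, 8.7, 15, 19.3 and 27.7 come from the video;
# the other six values are made-up placeholders.
team_two = [0.0, 8.0, 8.7, 14.1, 14.1, 15.0, 16.5, 18.0, 19.3, 23.6, 27.7]

q1, q2, q3 = quartiles(team_two)
print("range:", round(value_range(team_two), 1))
print("Q1, Q2, Q3:", q1, q2, q3)
print("IQR:", round(iqr(team_two), 1))
```

Note that statistical software often computes quartiles by interpolation instead, so library functions may return slightly different values for Q1 and Q3 than the middle-of-each-half method used here.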
The main advantage of the IQR is that it is not affected by outliers, because it doesn't take into account observations below Q1 or above Q3. Yet it might still be useful to look for possible outliers in your study. As a rule of thumb, observations can be qualified as outliers if they lie more than 1.5 times the IQR below the first quartile or more than 1.5 times the IQR above the third quartile. There is one specific type of graph that is very useful when it comes to describing center and variability, and detecting outliers. That graph is the so-called box plot. The box plot shows you at a glance Q1, Q2 and Q3, the minimum value that's not an outlier, the maximum value that's not an outlier, and the outliers. This is a box plot based on the previous example. The box itself stands for the central 50% of the distribution. In other words, it goes from Q1 to Q3. The length of the box represents the IQR. The horizontal line inside the box is the median, or in other words, Q2. These lines are called whiskers. They contain the other values, except for the outliers, which are displayed separately by means of dots. There are no dots here, so this box plot shows us that we don't have any outliers. How do you decide how long the whiskers should be? Well, let's go back to the values in our example. We have detected Q2, Q1, and Q3, and the IQR. We know that values more than 1.5 times the IQR below Q1 or more than 1.5 times the IQR above Q3 are outliers. Our IQR is 10.6. So 1.5 times 10.6 equals 15.9. Q1 is 8.7, so all values lower than 8.7 - 15.9 = -7.2 are outliers. Such values don't exist, so we have no outliers on this side. Our minimum value is 0. That's the end of the lower whisker. Q3 is 19.3. So, all values higher than 19.3 + 15.9 = 35.2 are outliers. We don't have values this high, so we don't have outliers on this side either. The end of the upper whisker is therefore equal to the maximum value, which is 27.7. So, let's also take a look at the box plot of team 1.
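The whisker and outlier rule worked through above can be written down as a short Python sketch. Again, this is an illustration under assumptions rather than anything shown in the video: the function names are hypothetical, and `named_values` holds only the five team 2 values the video actually mentions, not the full set of 11 players.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) cut-offs of the 1.5 x IQR rule of thumb."""
    step = k * (q3 - q1)
    return q1 - step, q3 + step


def find_outliers(values, q1, q3, k=1.5):
    """Values lying below the lower cut-off or above the upper cut-off."""
    low, high = outlier_fences(q1, q3, k)
    return [v for v in values if v < low or v > high]


def whisker_ends(values, q1, q3, k=1.5):
    """Smallest and largest values that are not outliers."""
    low, high = outlier_fences(q1, q3, k)
    inside = [v for v in values if low <= v <= high]
    return min(inside), max(inside)


# Quartiles of team 2 as stated in the video.
q1, q3 = 8.7, 19.3

# ASSUMPTION: only the five values named in the video, not all 11 players.
named_values = [0.0, 8.7, 15.0, 19.3, 27.7]

low, high = outlier_fences(q1, q3)
print("cut-offs:", round(low, 1), round(high, 1))
print("outliers:", find_outliers(named_values, q1, q3))
print("whiskers end at:", whisker_ends(named_values, q1, q3))
```

Because no value falls outside the cut-offs, the whiskers simply run to the minimum and the maximum. Adding a hypothetical value such as 40 to the list would make it show up as an outlier while the upper whisker stays at 27.7.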
If we compare the two box plots, we see immediately that the variability within the two distributions differs strongly. So remember, the center of a distribution only tells you one part of the story. For a more complete picture, also assess the variability of a distribution. A box plot shows important aspects of a distribution in a compact way, using the three quartiles, the outliers, and the range of the data after removing the outliers.