In addition to summarizing a distribution by means of graphs, it can also be useful to describe the center of your distribution. There are three main ways in which you can do that: by means of the mode, the median, and the mean. These three m's are often referred to as measures of central tendency.
Finding the mode is easy: it is the value that occurs most frequently. It is, in other words, the most common outcome. The mode is often used as a measure of central tendency if a variable is measured on a nominal or ordinal level. In this pie chart, you can see which continent players in the main Spanish football competition come from. The pie chart makes immediately clear what the mode is: it is Europe. 70% of the players were born in Europe. Note that the mode here is Europe, which is the name of the category that appears most often. The mode is not 70%; that's just the percentage of observations that fall in that specific category.
You can also have more than one mode. Imagine that there is a football player who strongly divides football fans. Some people find him very sympathetic, while others find him strongly unsympathetic. Let's name this player Franco Galton. Imagine you have asked a representative sample of 500 respondents from the Spanish population what they think of Franco Galton. Your respondents could indicate on a scale from 0 to 10 how sympathetic they think he is. 0 refers to very unsympathetic, and 10 refers to very sympathetic. Let's say that this is the shape of the histogram resulting from this study. You can see that the Spanish population is strongly divided. Some find Galton very unsympathetic, and some find him very sympathetic. As you can see, the distribution has two modes, 3 and 8. This is clearly a bimodal distribution.
The second measure of central tendency is the median. The median is nothing more than the middle value of your observations when they are ordered from the smallest to the largest.
Imagine you have also asked seven of your respondents what they think of another famous football player named Tomas Bayez. Let's assume that this is the data matrix of this study. The mode here is 8, the value that occurs most often. To compute the median, we first have to order all values from low to high. This is the result. Then we have to pick the middle value. So, the median is 8.
It is slightly more complicated if we have an even number of cases instead of an odd number of cases. Imagine we have asked not 7 but 8 people what they think of Tomas Bayez. This is the data matrix. And this is the order of the values from low to high. However, in this case, there is no single middle value. How do we solve that problem? Well, we just take the average of the two middle values. That's 7 plus 8, divided by 2, which equals 7.5. The median in this case is 7.5. Notice that the median divides the distribution into two equal parts: 50% of the values lie below the median and 50% lie above it.
The third measure of central tendency is the most often used one, and also the one you most probably already know quite well. It's the mean. The mean is the sum of all the values divided by the number of observations. This is the formula with which you can compute the mean. It looks more complicated than it is. The formula tells you that the mean of variable x, symbolized as x bar, equals the sum of all the values of x divided by the sample size, which is symbolized by n. To give an example, let's again use the study on Tomas Bayez. This was the data matrix. The formula tells us to first sum all the values. That's 6 plus 7, plus 7, plus 8, plus 8, plus 8, plus 9. That equals 53. We have now resolved this part of the formula. We now have to divide by n. The sample size in this study is 7, so 53 divided by 7 is approximately 7.6. The mean is 7.6.
You can think of the mean as the balance point of your data. Imagine we place weights on a balance, one for each observation.
Then the mean is the point on the balance where the weights on one side exactly counterbalance the weights on the other side.
You're quite familiar now with the three m's, and you can easily compute the middle of a group of scores in various ways. But when should you report which measure of central tendency? That partially depends on the measurement level of your variable. If it's nominal, it is impossible to compute the median or the mean. Think about it: you cannot apply numerical operations to nominal variables, nor can you order their categories. The only appropriate measure of central tendency when a variable is nominal is the mode.
But what should you do in the case of a quantitative variable? Imagine you're sitting in the canteen of a football club in your hometown and you would like to compute the mean and median income of all persons present. That's you, 5 other guests, and the bartender. This is the data matrix. The mean is around 35,000. The median is exactly 35,000. They're pretty close to each other, and it doesn't matter which one you use to describe the center of your distribution. But now imagine the famous football player Franco Galton walks into the canteen. Say he gets about 70 million per year. The median increases slightly, to 36,000. The mean, however, now becomes more than 8 million. We say that Franco Galton is an outlier in this distribution. He earns much more than all the other people present, and his income exerts a disproportionate effect on the mean income. In this case, it might be argued that it makes more sense to compute the median rather than the mean to describe the center of the distribution.
Let me briefly summarize what you've learned in this video. To describe the center of a distribution, you can use three measures of central tendency: the mode, the median, and the mean. If your variable is categorical, you use the mode, and if it's quantitative, you employ the median or the mean.
Go for the median if you have influential outliers or if the distribution is highly skewed. If that's not the case, go for the mean.
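
If you would like to check the computations from this video yourself, the following is a minimal sketch using Python's standard `statistics` module. The seven ratings are the Tomas Bayez scores read out in the video. The canteen incomes, however, are not given in the video: the values below are hypothetical and were chosen only so that the results line up with the figures mentioned (a mean and median of about 35,000 before Franco Galton walks in, and a median of 36,000 afterwards).

```python
from statistics import mean, median, multimode

# Seven ratings of Tomas Bayez, as summed in the video (6 + 7 + 7 + 8 + 8 + 8 + 9 = 53).
ratings = [6, 7, 7, 8, 8, 8, 9]

modes = multimode(ratings)   # all most frequent values; a bimodal sample would return two
middle = median(ratings)     # middle value after sorting
average = mean(ratings)      # sum of the values divided by n, here 53 / 7

print("modes:", modes)
print("median:", middle)
print(f"mean: {average:.1f}")

# HYPOTHETICAL yearly incomes for the 7 people in the canteen (not from the video).
incomes = [20_000, 30_000, 33_000, 35_000, 37_000, 40_000, 50_000]

# Franco Galton walks in with about 70 million per year (figure from the video).
with_outlier = incomes + [70_000_000]

print("before:", mean(incomes), median(incomes))
print("after: ", mean(with_outlier), median(with_outlier))
```

With an even number of observations, `median` averages the two middle values, which is the same rule used for the eight-respondent example. Comparing the two printed lines for the incomes shows how a single outlier pulls the mean far away while the median barely moves.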