When the explanatory variable represents more than two groups, a significant ANOVA does not tell us which groups differ from the others. To determine that, we need to perform a post hoc test, which conducts post hoc paired comparisons. Post hoc means after the fact. These paired comparisons must be conducted in a particular way in order to prevent excessive Type 1 error. Type 1 error, as you'll recall, occurs when you make an incorrect decision about the null hypothesis. That is, you reject the null hypothesis when the null hypothesis is true.

Why can't we just perform multiple ANOVAs? That is, why can't we just subset our observations and take two groups at a time? That is, compare white versus black, white versus American Indian/Alaskan Native, et cetera, until all the paired comparisons have been made. As you know, we accept significance and reject the null hypothesis at p less than or equal to 0.05, which means accepting a 5% chance that we're wrong and have committed a Type 1 error. There's actually a 5% chance of making a Type 1 error for each analysis of variance that we conduct on this question. Therefore, performing multiple tests means that our overall chance of committing a Type 1 error could be far greater than 5%.

Here's how it works out. Using the formula displayed under this table, you can see that while one test has a Type 1 error rate of 0.05, by the time we've conducted ten tests on this question, our chance of rejecting the null hypothesis when the null hypothesis is true is up to 40%. This inflated Type 1 error rate is called the family-wise error rate, and it is the error rate for the whole group of paired comparisons.

>> Post hoc tests are designed to evaluate the difference between pairs of means while protecting against inflation of Type 1 errors. And there are a lot of post hoc tests to choose from when it comes to analysis of variance. There's the Sidak and the Holm t-test.
And Fisher's Least Significant Difference test. Tukey's Honestly Significant Difference test. The Scheffé test. The Newman-Keuls test. Dunnett's multiple comparison test. The Duncan multiple range test, and the Bonferroni procedure. It's enough to make your head swim.

>> While there are certainly differences in how conservative each test is in terms of protecting against Type 1 error, in many cases it's far less important which post hoc test you conduct and far more important that you do conduct one.

>> In order to conduct post hoc paired comparisons in the context of my ANOVA examining the association between ethnicity and number of cigarettes smoked per month, I'm going to use the Tukey HSD, or Honestly Significant Difference, test. To do this, I will first add an import statement for the library statsmodels.stats.multicomp into my Python script as multi, the name that I will use to refer to the library later on in my program. Next, I will add the following code to the end of my program. I am calling the object that will store my multiple comparisons mc1, and I use the MultiComparison function from the statsmodels.stats.multicomp library, which I have imported as multi above. Next, I include in this statement the quantitative response variable and the categorical explanatory variable in parentheses. res1 is the name I am giving to the object that will store my post hoc results. I set that equal to my multiple comparisons object and request the Tukey HSD test with tukeyhsd. Finally, I ask Python to print these results with the summary function.

Here we see a table displaying the Tukey post hoc paired comparisons, that is, differences in smoking quantity for each ethnic group pair. In the first row of the table, we see the comparison between ethnic groups one and two, individuals endorsing white ethnicity versus those endorsing black ethnicity, as well as the mean difference in number of cigarettes smoked between these two groups.
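As a rough sketch of the steps just described, the snippet below runs the same sequence of calls on synthetic data. The data frame, its column names (cigs_per_month, group), and the group means are made up for illustration and are not the dataset used in the lecture; only the import alias multi and the object names follow the description above.

```python
import numpy as np
import pandas as pd
import statsmodels.stats.multicomp as multi

# Synthetic stand-in data (NOT the lecture's dataset): three hypothetical
# groups of 40 people each, with made-up mean cigarettes smoked per month.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "cigs_per_month": np.concatenate([
        rng.normal(400, 50, 40),  # hypothetical group 1
        rng.normal(300, 50, 40),  # hypothetical group 2
        rng.normal(390, 50, 40),  # hypothetical group 3
    ]),
    "group": np.repeat([1, 2, 3], 40),
})

# Quantitative response variable first, categorical explanatory variable second.
mc1 = multi.MultiComparison(data["cigs_per_month"], data["group"])

# Request the Tukey HSD test and print the table of paired comparisons.
res1 = mc1.tukeyhsd()
print(res1.summary())
```

Each row of the printed table is one pair of groups, with the mean difference in the meandiff column and the decision about the null hypothesis in the reject column.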
Python has calculated a p-value, though it is not displayed, that takes the multiple comparisons into consideration and protects us from inflating our Type 1 error and rejecting the null hypothesis when the null hypothesis is true. In the last column, we can determine which ethnic groups smoke a significantly different mean number of cigarettes than the others by identifying the comparisons in which we can reject the null hypothesis, that is, in which reject equals True. So we can see that ethnic group one is significantly different from ethnic groups two, four, and five. And when we again examine group means, we can say that individuals endorsing white ethnicity, group one, smoke significantly more cigarettes per month than individuals endorsing black, Asian, and Hispanic ethnicity, groups two, four, and five.
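The 40% figure quoted earlier for ten tests can be checked with a few lines of arithmetic. This sketch assumes the formula on the slide is the usual 1 - (1 - alpha)^c for c independent tests, each run at alpha = 0.05; the function name familywise_error_rate is only an illustrative choice, and real paired comparisons on the same data are not fully independent, so treat the numbers as an approximation.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type 1 error across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


# Print the family-wise error rate for a few numbers of tests at alpha = 0.05.
for n_tests in (1, 3, 5, 10):
    rate = familywise_error_rate(0.05, n_tests)
    print(f"{n_tests:>2} tests: {rate:.3f}")
```

With one test the rate stays at 0.05, and with ten tests it comes out at roughly 0.40, in line with the figure given in the lecture.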