>> So here we are continuing to work down our code. You can see that we have a new data file to read in, prefsABC.csv, so now there are three response categories, preferences for A, B, and C, and we'll follow a similar pattern to what we did before. So, having read in the file, we'll view it, and you can see that we have subject on the left and preferences on the right, and now there are not just As and Bs, but also Cs, so there's a new website alternative introduced. From a study design point of view, we'd have to think about whether that alternative was introduced after the subject saw the other two. Might that not cause an order effect, because it always came last? For our purposes, maybe we assume we're testing over a new set of 60 participants, and so they haven't seen the other two before. Go ahead and close that. We're going to recode the subject column, for good practice again, as a categorical factor. And we can take as well a summary of the data. And we can see now that our summary has eight preferences for website A, even fewer than before, 21 for B, and 31 for C. In a sense, we might think that C came in and sort of siphoned off preference from A and B and became, possibly, the new popular one. But the statistical question is whether 31 preferences for C is significantly different from 21 for B. Obviously, they're both quite a bit more than A, and that might be meaningful enough to us just on the face of it. So again, we create a cross tabulation, a table that tells us the preferences. These tables get more interesting when we have more than one sample. So we can see again A, B, and C are in our table there, and then we run a chi-squared test again. We can see here the result now has two degrees of freedom, not just one, because we have three response categories, and the degrees of freedom for a chi-squared test of this kind are the number of response categories minus one.
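The steps just described, recoding subject as a factor, summarizing, cross-tabulating, and running the one-sample chi-squared test, can be sketched in R. Since the data file itself isn't shown here, this sketch reconstructs the data frame inline from the counts mentioned in the narration (8, 21, and 31 preferences out of 60); the column names Subject and Pref are assumptions.

```r
# Hypothetical reconstruction of the prefsABC data: 60 subjects,
# with 8 preferring A, 21 preferring B, and 31 preferring C.
# Column names are assumptions based on the narration.
prefsABC <- data.frame(
  Subject = 1:60,
  Pref    = rep(c("A", "B", "C"), times = c(8, 21, 31))
)

prefsABC$Subject <- factor(prefsABC$Subject)  # recode subject as a nominal factor
prefsABC$Pref    <- factor(prefsABC$Pref)
summary(prefsABC)                             # shows A=8, B=21, C=31

prfs <- xtabs( ~ Pref, data = prefsABC)       # cross tabulation of preferences
prfs
chisq.test(prfs)                              # one-sample chi-squared test, df = 3 - 1 = 2
```

With equal expected counts of 20 per category, this gives the chi-squared statistic of 13.30 on 2 degrees of freedom reported next.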
We can also see our p-value is indeed significantly less than 0.05. Here I've added the chi-squared test result for our three-response-category data, prefsABC. We can see we have chi-squared with, as I mentioned, two degrees of freedom now. We have, still, 60 measures, 60 cases. The chi-squared statistic is 13.30, and it gives us a p-value that's less than 0.01. So that's how we'd report that second result, and you can extrapolate from here for reporting all the chi-squared tests that you may do. So that's our asymptotic test. The multinomial test that we mentioned is an exact test, and for that we're going to need to load in the XNomial library. Let's do that. And the xmulti function gives us a test over a set of probabilities. Here we're testing over the preferences we have. The c function just combines values into a vector, so we can pass in a list of values that are the probabilities if there were no preference, and that would be a third of respondents for each category A, B, and C. If there was no preference for any of the websites, we'd expect a third of the people to like each of them. And then this function has various ways to calculate the probability, and that comes through statName. Incidentally, if you'd like to look up information about any function, a very useful thing in RStudio is to type a question mark and then the function you're interested in. So let's say we didn't know what that statName parameter meant and we wanted to see more; we could type ?xmulti. And we can see here, I know that on your screen the font may be pretty small, but it's a help page that gives us all the information about this function and its parameters. So that's a great way to always learn more. You'll find that you're doing that often; I certainly am. So, let's go ahead and execute the xmulti function, the multinomial test, and we can see it gives us a p-value, an exact p-value, also quite a bit less than 0.05.
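The exact multinomial test described above looks something like this in R, using the counts from the narration (8, 21, 31) and a null hypothesis of one third for each category:

```r
library(XNomial)  # provides xmulti, the exact multinomial test

# Observed preference counts from the narration: A=8, B=21, C=31.
obs <- c(8, 21, 31)

# Under no preference, each website would be chosen with probability 1/3.
# statName selects which statistic's exact p-value is reported.
xmulti(obs, c(1/3, 1/3, 1/3), statName = "Prob")

# ?xmulti  # pull up the help page for the function and its parameters
```

As the narration says, the resulting exact p-value is well below 0.05.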
So, again, we'd expect a difference, given those proportions. Now, what that tells us is that there is a difference somewhere among the levels A, B, and C. It doesn't actually tell us what the difference is between A and B, B and C, or A and C. Those pairwise differences are what are called post hoc tests, or post hoc pairwise comparisons. They're post hoc in the sense that they follow a statistically significant overall, or what's called omnibus, test. We just did that omnibus test with the multinomial test, with xmulti. If we want to know about the separate differences, well, then we can go ahead and run post hoc binomial tests, binomial again in the sense that now we're just back to testing each level against a hypothesized probability. So, for example, we can see in this line a test of level A against a hypothesized probability of one third. What this test is doing is summing over the rows that have a preference for A, comparing that count against all of the rows in the table, and comparing that against what would be a chance probability of a third. And we're doing that for all of the A, B, and C levels, to see which ones are significantly different from chance. So, we're going to store all those results in AA, BB, and CC, and then we're going to go ahead and do what's called an adjustment, and report the results. Now, this adjustment needs a little explanation. When we run statistical tests and we say that if p is less than 0.05 there's a significant result, that means that there's a 1 in 20 chance that, just by chance, we might think there's a statistically significant outcome when in fact there isn't. That's what being at 0.05 means, a 1 in 20 chance for that. And so if we're doing multiple comparisons, if we were to do 20 of them, we'd expect 1 just by chance to come out significant. So we have to adjust for that, and we do that with a Bonferroni correction.
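The post hoc binomial tests and the adjustment just described can be sketched as follows; the counts (8, 21, and 31 of 60) come from the narration, and the variable names aa, bb, cc mirror the AA, BB, CC mentioned above:

```r
# Post hoc binomial tests: each preference level against chance (1/3).
aa <- binom.test(8,  60, p = 1/3)   # preference for A vs. chance
bb <- binom.test(21, 60, p = 1/3)   # preference for B vs. chance
cc <- binom.test(31, 60, p = 1/3)   # preference for C vs. chance

# Correct the three p-values for multiple comparisons using
# Holm's sequential Bonferroni procedure, then report them.
p.adjust(c(aa$p.value, bb$p.value, cc$p.value), method = "holm")
```

After correction, A and C remain significantly different from chance, while B does not, matching the interpretation that follows.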
This method here is called holm, which is really the preferred one. It subsumes Bonferroni, but it's a little bit less strict a test. It's a sequential test that adjusts each of the p-values according to just how low it is. The lowest p-value is, in this case with three tests, multiplied by three, so increased three times; the second lowest is doubled; and the highest one is left as is. If at any point one of those is not less than 0.05, then the sequence stops. This is called Holm's sequential Bonferroni procedure. So the bottom line is, any time we're doing multiple tests, we want to correct in this fashion. And we can see now that we have a significant result for A; it's significantly different from chance, from a third. People didn't like it; it had only eight preferences. The preference for B was 21, which was near chance, and then the preference for C was 31, which is more than half the participants. Of course, 21 is almost a third of 60. So that tells us whether they're significantly different from chance for those individual levels, and we see A was lower and C was higher. OK, let's go ahead and go back now to our analysis table to see what we've covered. In that second row, we've covered the one-sample test with greater than two response categories, in this case, three. We saw both a one-sample chi-squared test again and the multinomial test, with those pairwise contrasts after the fact. Now we can ask, what happens if we have more than one sample? What happens if we have a couple of ways in which our participants are sampled, maybe for their preferences but also, say, for their sex, male or female? We could ask, is there a difference between what males prefer and what females prefer with respect to these website preferences that we're testing?
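The multiply-by-three, multiply-by-two, leave-as-is sequence described above is exactly what p.adjust does with method = "holm". Here is a small illustration with hypothetical unadjusted p-values, not the ones from our data:

```r
# Three hypothetical unadjusted p-values, sorted lowest to highest.
p <- c(0.001, 0.020, 0.400)

# Holm's sequential Bonferroni: the lowest of the three is tripled,
# the next is doubled, and the highest is left as is (results are
# capped at 1 and kept monotone).
p.adjust(p, method = "holm")   # 0.003, 0.040, 0.400
```

Compare this with method = "bonferroni", which would multiply all three p-values by three; Holm rejects at least as many hypotheses, which is why it's preferred here.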