So it's not enough just to have a testing procedure; we'd also like to have some sort of confidence interval. So let pi i j hat be the sample proportions, and suppose we want to estimate d, the difference in the marginal proportions. In this case that's the difference in the marginal probability of an approve vote. Then d hat equals n 1 2 minus n 2 1, over n, and that estimates the difference in the marginal proportions. On the previous slide we talked about the variance of this estimator under the null hypothesis; now let's talk about its variance in general. It works out to have this form: pi 1 plus times 1 minus pi 1 plus, plus pi plus 1 times 1 minus pi plus 1, minus twice the quantity pi 1 1 pi 2 2 minus pi 1 2 pi 2 1, all divided by n. The first two terms, divided by n, are the kind of difference-in-binomials variance you would expect to see. But because the samples are correlated, we also have this covariance term, minus twice pi 1 1 pi 2 2 minus pi 1 2 pi 2 1, which subtracts out the correlation. So what happens? If there are a lot of counts in the off-diagonal cells, then pi 1 2 times pi 2 1 is a big number, the covariance term pi 1 1 pi 2 2 minus pi 1 2 pi 2 1 is small or even negative, and subtracting twice it results in a larger variance. If the off-diagonal cells are really small and most of the data lie on the main diagonal, then pi 1 1 times pi 2 2 is very large, we subtract twice that number, and we wind up with a much smaller variance than the standard difference-in-binomials variance. So we can take d hat minus the true difference in proportions, divided by the standard error estimate, and that follows an asymptotic normal distribution, which we can use again to create confidence intervals. I hope everyone at this point in the class could do something like that.
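As a rough sketch of the calculation described above, here's how the Wald interval for the difference in marginal proportions might be computed from a matched 2 by 2 table. The counts below are hypothetical, chosen only for illustration:

```python
import math

# Hypothetical matched 2x2 table: rows = response on occasion 1,
# columns = response on occasion 2 (approve / disapprove).
n11, n12, n21, n22 = 794, 150, 86, 570
n = n11 + n12 + n21 + n22

# Sample cell proportions pi_ij hat
p11, p12, p21, p22 = n11 / n, n12 / n, n21 / n, n22 / n

# d hat = pi_1+ hat - pi_+1 hat = (n12 - n21) / n
d_hat = (n12 - n21) / n

# Marginal proportions of an approve vote on each occasion
p1_plus = p11 + p12
p_plus1 = p11 + p21

# Variance of d hat (in general, not under the null):
# [pi_1+(1 - pi_1+) + pi_+1(1 - pi_+1) - 2(pi_11 pi_22 - pi_12 pi_21)] / n
var_d = (p1_plus * (1 - p1_plus) + p_plus1 * (1 - p_plus1)
         - 2 * (p11 * p22 - p12 * p21)) / n
se = math.sqrt(var_d)

# Approximate 95% interval from the asymptotic normal distribution
ci = (d_hat - 1.96 * se, d_hat + 1.96 * se)
```

With these counts, d hat is 0.04 and the interval is roughly (0.021, 0.059).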
So in this last bullet point here, I say to compare sigma d to what we would use if the proportions were independent. That is, instead of asking the same people on two occasions whether or not they approve, what if we asked a different set of people each time? Then this minus-twice part would go away. But what do we expect? We expect that people who approve on the first occasion are more likely to approve on the second occasion. If you're in the U.S. and you're a Democrat, and you approved of, say, President Obama at the first time point, you'd be more likely to approve at the second time point. And the same thing with the people who disapprove: if you're a Republican and you disapproved at the first time point, you'd be more likely to disapprove at the second time point. That's a very frequent form of correlation, where the measurements tend to be concordant; they tend to agree. And that is exactly the case where pi 1 1 times pi 2 2 is much larger than pi 1 2 times pi 2 1. In other words, observations tend to lie on the main diagonal of the matched 2 by 2 table, because people tend to agree. If that's the case, the covariance term is positive, we have minus twice this positive number, and we get a dramatic reduction in the variance. So failing to account for the fact that the same people were asked twice would, in this case, be a really unwise thing to do, because you'd lose precision and get a much wider confidence interval than necessary. More generally, even if accounting for the dependence resulted in a wider interval, you'd still want to do it, because that gives you the correct interval rather than one based on completely incorrect assumptions.
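The variance reduction from concordance can be seen directly by comparing the two formulas. This is a small sketch with hypothetical cell proportions placing most of the mass on the main diagonal:

```python
# Hypothetical cell proportions with strong concordance
# (most mass on the main diagonal of the matched 2x2 table)
p11, p12, p21, p22 = 0.45, 0.10, 0.05, 0.40
n = 1000

p1_plus = p11 + p12   # marginal approve proportion, occasion 1
p_plus1 = p11 + p21   # marginal approve proportion, occasion 2

# Variance we'd use if the two occasions were independent samples
var_indep = (p1_plus * (1 - p1_plus) + p_plus1 * (1 - p_plus1)) / n

# Matched-pairs variance subtracts twice the covariance term
cov_term = p11 * p22 - p12 * p21
var_matched = var_indep - 2 * cov_term / n
```

Here the covariance term is positive (0.45 times 0.40 far exceeds 0.10 times 0.05), so the matched-pairs variance comes out much smaller than the independent-samples variance, which is exactly the precision gain described above.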