Step four in an outbreak investigation is performing a risk factor study. We're going to talk about this in two parts. In part one, we'll talk about a risk factor study in a defined population.

The basic idea of a risk factor study is that we want to identify exposures that are associated with disease, because these exposures may indicate the source of disease or activities that aid in its transmission. A simple measure of association in a risk factor study is the relative risk, or RR as it's sometimes denoted. The relative risk is the attack rate in people who had some exposure divided by the attack rate in people who did not have that exposure. If this relative risk is much greater than one, then the exposure is strongly associated with the disease.

To calculate an attack rate, we divide the number of people in a group who developed the disease by the total number of people in that group. Groups can be defined by individual characteristics. So, we might define an attack rate as the number of people of a particular age who developed the disease divided by all of the people who are that age. When we're doing risk factor studies, we often define groups by exposure. So, we might divide the number of people who got sick after drinking milk at lunch by the total number of people who drank milk at lunch. Likewise, we might characterize the group at risk by location or other factors.

Here's an example from the monkeypox outbreak involving prairie dogs that we've been going back to throughout this module. If we look on the right, we have a table of prairie dog exposure and clinical monkeypox among childcare attendees. This table shows whether or not people were exposed to prairie dogs and, among people who were and weren't exposed, the number who got sick, the number who didn't get sick, and the total number. So, now we want to calculate some attack rates.
So, the attack rate in exposed children is the number of children exposed to prairie dogs who got sick, two, divided by the total number of children exposed to prairie dogs, six. So, the attack rate is around 33 percent. The attack rate in unexposed children is the number of children not exposed to prairie dogs who got sick, two, divided by the total number of children who were not exposed to prairie dogs, 12. So, that's 16.7 percent. We divide these attack rates to calculate the relative risk: 33.3 percent divided by 16.7 percent is two. So, children who were exposed to prairie dogs had a relative risk of two of developing monkeypox. That is, they were twice as likely to develop monkeypox as children who were not exposed to prairie dogs.

Here's another example of an outbreak in a defined population, this time involving drinking raw milk. Here we have a table of raw milk consumption and campylobacteriosis in participants in a third grade field trip to a farm. You'll notice that among those who had no milk, nobody got sick, so the attack rate was zero percent. Among children who had one or two glasses of milk, the attack rate was 35 percent and 60 percent respectively. In this case, the relative risk is effectively infinite, because we'd be dividing 60 percent by zero percent. That provides pretty strong evidence that drinking milk has something to do with getting sick.

We have another piece of evidence here that it's often useful to look at, and that's a dose-response relationship. What that means here is that we see an increasing attack rate with increasing consumption of milk. Among people who had zero glasses of milk, there were no cases. Among people who had one glass of milk, 35 percent were cases, and among people who had two glasses of milk, 60 percent were cases. This dose-response relationship gives us further evidence that milk was the cause of the outbreak.
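To make the arithmetic in these two examples concrete, here is a minimal Python sketch. The function and variable names are just illustrative choices, not something from the course. The monkeypox counts are the ones read out above. For the raw milk example only the attack rates were given, not the underlying counts, so the sketch uses those percentages directly. It also assumes we report the relative risk as infinite when the unexposed attack rate is zero, which is a convention rather than a true division.

```python
import math


def attack_rate(cases, group_size):
    """Proportion of a group that developed the disease."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return cases / group_size


def relative_risk(rate_exposed, rate_unexposed):
    """Attack rate in the exposed divided by attack rate in the unexposed.

    Assumption: when the unexposed attack rate is zero, report infinity
    (or NaN if both rates are zero) instead of raising an error.
    """
    if rate_unexposed == 0:
        return math.inf if rate_exposed > 0 else math.nan
    return rate_exposed / rate_unexposed


# Monkeypox example: 2 of 6 exposed children sick, 2 of 12 unexposed sick.
ar_exposed = attack_rate(2, 6)
ar_unexposed = attack_rate(2, 12)
rr_monkeypox = relative_risk(ar_exposed, ar_unexposed)

# Raw milk example: attack rates by glasses of milk, as quoted in the lecture.
milk_attack_rates = {0: 0.00, 1: 0.35, 2: 0.60}
rr_milk = relative_risk(milk_attack_rates[2], milk_attack_rates[0])

# Dose-response check: does the attack rate rise with each step up in dose?
rates_by_dose = [milk_attack_rates[d] for d in sorted(milk_attack_rates)]
dose_response = all(lo < hi for lo, hi in zip(rates_by_dose, rates_by_dose[1:]))

print(f"Attack rate, exposed:   {ar_exposed:.1%}")
print(f"Attack rate, unexposed: {ar_unexposed:.1%}")
print(f"Relative risk (monkeypox): {rr_monkeypox:.1f}")
print(f"Relative risk (milk, two glasses vs none): {rr_milk}")
print(f"Attack rate increases with dose: {dose_response}")
```

The dose-response check here is deliberately simple: it only asks whether the attack rate goes up at every step, and it says nothing about whether that trend is statistically significant.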
So, to go over some key points from this section. The attack rate is the percentage of a group that gets the disease. The relative risk provides a comparison of attack rates between two groups. A relative risk above one indicates that the group in the numerator has a higher attack rate than the group in the denominator. Importantly, this approach only works if the exposed and unexposed populations are identified without regard to their disease status. That is, we can't include people in our study based on whether or not they're cases. We'll talk about what to do when this isn't true in the next section. This condition is most often fulfilled when there's a clearly defined population at risk, for instance, students in a school.