[MUSIC] The question, are there any pre-existing differences between groups, can be very hard to answer. That's why it's useful to have a few follow-up questions. These questions can help elicit why groups may not be probabilistically equivalent. Here's the first one: is there a common driver of both group membership and outcome? Or, are group membership and outcomes both responding to a common factor? I know this sounds pretty abstract, but it'll become much clearer in a minute.

Okay, why don't we start with an example where we already know that there are pre-existing differences between groups. How about the search engine advertising example? Let's see whether this question can uncover why the groups along the x-axis are not probabilistically equivalent. Can you come up with some factor that drives both which ads people see and whether they end up buying a car? Pause the video, take a minute to think about this, and then come back.

I asked you to come up with some factor that drives both which ads people see and whether they end up buying a car. The answer, of course, is pre-existing interest in buying a car. Those who are more interested in buying a car are more likely to type in a car-related search term, which in turn makes them more likely to receive car ads, and it also makes them more likely to end up purchasing a vehicle. That is what we call a common driver of both group membership, namely which ads you see, and outcome, namely whether you buy a car.

Now let's look back at the ultrasound example. Can you come up with something that might drive both which ultrasound machine is used and how long exams take? Pause the video, take a minute to think about this, and then come back.

Let me propose one possible common driver, namely technician experience. Here are the ultrasound exam time data, split up by the experience level of the technician. Contrary to the initial exam time data we saw, it turns out that novice technicians are nine minutes faster using the 2015 machines. And they're not the only ones: experienced technicians are also faster using the 2015 machines, this time by six minutes. If I were looking at these data, I would think, hm, he hasn't really told me what fraction of technicians are novice versus experienced. Let's say it's 50/50. Then you can see how one would get minus 7.5 minutes, the average of the two gaps, if we reported the exam times for all technicians together. But how do we get from minus seven and a half to the plus one we saw in the first chart?

The reason is that not all technicians care about ease of use. The novice technicians love the new machines: 85% now use them. But the most experienced technicians really know the 2008 machines and don't see much value in switching, so 90% still use the old machines. When you ask the analytics dashboard to compare the efficiency of the ultrasound machines, you are effectively pulling the measurement of exam time for the 2015 machines from usage by novice technicians and the measurement for the 2008 machines from usage by experienced technicians. You are not really measuring how the devices compare; you are measuring how different technicians compare, and that is why you get the wrong answer.

Technician experience is a common driver: experience determines both which ultrasound machine a technician is likely to use and how long exams take. And because of that, the exams performed on the 2008 and 2015 ultrasound machines are not probabilistically equivalent, because they were performed by technicians with different levels of experience.
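It can help to see the arithmetic of that reversal in one place. Below is a minimal Python sketch; only the nine- and six-minute gaps, the assumed 50/50 split, and the 85% and 90% usage shares come from the discussion above, while the absolute exam times and the pool of 200 technicians are made up for illustration.

```python
# Illustrative numbers: only the 9- and 6-minute gaps, the 50/50 split, and
# the 85% / 90% usage shares come from the lecture; the absolute exam times
# and the count of 200 technicians are assumptions.

# Average exam time in minutes, by technician experience and machine year.
exam_time = {
    ("novice",      2008): 40, ("novice",      2015): 31,  # novices: 9 min faster on 2015
    ("experienced", 2008): 27, ("experienced", 2015): 21,  # experienced: 6 min faster on 2015
}

# How many of 100 novice and 100 experienced technicians use each machine:
# 85% of novices use the 2015 machine, 90% of experienced stay on the 2008 one.
usage = {
    ("novice",      2015): 85, ("novice",      2008): 15,
    ("experienced", 2015): 10, ("experienced", 2008): 90,
}

# Within each experience level, the 2015 machine is faster.
for exp in ("novice", "experienced"):
    gap = exam_time[(exp, 2015)] - exam_time[(exp, 2008)]
    print(f"{exp}: 2015 minus 2008 = {gap} minutes")       # -9 and -6

# With a 50/50 mix of technicians, the overall gap is the average: -7.5 minutes.
avg_gap = sum(exam_time[(exp, 2015)] - exam_time[(exp, 2008)]
              for exp in ("novice", "experienced")) / 2
print(f"50/50 average gap: {avg_gap} minutes")

# What the dashboard does instead: pool all exams by machine,
# ignoring who performed them.
def pooled_mean(year):
    total = sum(usage[(exp, year)] * exam_time[(exp, year)]
                for exp in ("novice", "experienced"))
    count = sum(usage[(exp, year)] for exp in ("novice", "experienced"))
    return total / count

print(f"pooled 2015 minus 2008: {pooled_mean(2015) - pooled_mean(2008):+.1f} minutes")
# Roughly +1: the 2015 machines now look slower, because their exams come
# mostly from novice technicians. That is the common driver at work.
```

Splitting the comparison by the common driver, technician experience, recovers the true per-machine gaps; pooling across it is what produced the misleading plus one minute.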
That is why the causal statement implied by the original graph, that if I switched my technicians from the old to the new machines, exam times would go up by a minute, is wrong and therefore bad analytics. So, "Is there a common driver of both group membership and outcome?" is the first follow-up question to ask when we have trouble directly answering whether there are any pre-existing differences between groups. [MUSIC]