[MUSIC] So you might wonder what on earth the idea of probabilistic equivalence and experimentation has to do with the practice of marketing analytics. Here is the key observation. The number one reason why analytics goes bad is that data that were not generated as part of an experiment are presented or interpreted as if they had been. Let me say that again. The number one reason why analytics goes bad is that data that were not generated as part of an experiment are presented or interpreted as if they had been.

So let me give you an example. Do you remember this data? Suppose I had a group of consumers and had randomly allocated them to receive zero to two emails, two to four emails, or four or more emails. If I then measured that these groups generated 51 euro, 254 euro, and 580 euro in revenue, I could unequivocally state that more emails drive higher revenue. But that is not how we allocated people to these three groups. In fact, we allocated people to these three groups as nonrandomly as one can possibly get. Namely, by setting their email frequency proportional to how many categories they had purchased in, and therefore, on average, to how much revenue we got from them. It doesn't get much less random.

Let me give you a second example. You remember this graph on the effectiveness of search engine advertising that I saw at the executive retreat? Now, suppose I had taken consumers who had typed in a search term related to our industry and randomly allocated 25% of them to not see any ads related to a product in our industry. Suppose I had randomly taken another 25% of consumers and shown them only retailer ads, randomly another 25% and shown them only manufacturer ads, and finally the last 25% and shown them both ads. If it had then turned out that I found the pattern shown here, I could unequivocally say that search engine advertising works.
And that manufacturer and retailer advertising are complements and not substitutes. But that is not how people got allocated to these four groups. The way they got allocated is, again, completely nonrandom. Namely, they typed in a search term. That search term told me something about their interests, and that is what the search engine targeted. In fact, the whole business model of search engines is to get away from random targeting. It's all about taking people with a preexisting interest and targeting exactly those people with ads.

So now that we have an idea of why analytics goes bad, let's get started on developing a list of questions that will allow us to systematically uncover when analytics has problems. [MUSIC]
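
To make the email example concrete, here is a minimal simulation sketch. Everything in it is a made-up assumption for illustration, not the data from the lecture: the number of categories, the revenue formula, the bucket names, and the `simulate` function are all hypothetical. The key assumption is that emails have no effect on revenue at all. Even so, when email frequency is set by how many categories a customer has purchased in, the high-email group looks far more valuable. When the same customers are allocated at random, the gap disappears.

```python
import random
from statistics import mean


def simulate(n_customers=30000, randomize=False, seed=0):
    """Return mean revenue per email bucket (hypothetical toy model)."""
    rng = random.Random(seed)
    buckets = {"low": [], "mid": [], "high": []}
    for _ in range(n_customers):
        # Assumption: each customer has purchased in 1 to 6 categories.
        categories = rng.randint(1, 6)
        # Assumption: revenue depends only on categories plus noise.
        # Email frequency has zero causal effect in this toy model.
        revenue = 50 * categories + rng.gauss(0, 20)
        if randomize:
            # Experiment: the email bucket is assigned by chance.
            bucket = rng.choice(["low", "mid", "high"])
        else:
            # Targeting: the email bucket follows categories purchased.
            if categories <= 2:
                bucket = "low"
            elif categories <= 4:
                bucket = "mid"
            else:
                bucket = "high"
        buckets[bucket].append(revenue)
    return {name: mean(values) for name, values in buckets.items()}


targeted = simulate(randomize=False)
randomized = simulate(randomize=True)

for label, result in (("targeted", targeted), ("randomized", randomized)):
    summary = ", ".join(f"{name}: {value:.0f}" for name, value in result.items())
    print(f"{label:>10} -> {summary}")
```

In the targeted run the three buckets differ sharply in average revenue, purely because of who was put in them. In the randomized run the groups are probabilistically equivalent, so their averages are nearly identical, which is the correct answer in a model where emails do nothing.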