We have to choose what attributes we'll use and how we'll measure them, and what metrics we will have for our objectives. These are important things as we set up our data analysis. Let's begin by looking at what attributes we choose. This is something where our choice is usually limited by what is available. If there are only some properties that we know of the things that we're trying to deal with, that's what we have. Now, it's possible that additional attributes can sometimes be purchased or collected, and deciding whether to do that requires a cost-value tradeoff. And it's important to think about what attributes are missing. To make this less abstract, let's say that we're trying to decide whether to hire someone. We look up what we know about them on the web and from their application, and whatever we know with respect to their record. The question is, do we know anything about, let's say, what time they go to bed? Perhaps that's important because of the nature of the job that they're being hired for. That is something that maybe one could determine if they had a smart electric meter in their home from which we could buy data. But otherwise, it's an attribute we probably do not have access to. There are many choices of this type that a data scientist has to make in just setting up the problem that they are trying to solve.
There are also attributes that they may decide to leave out. And sometimes what they're going to leave out is dictated by law. For many purposes, you cannot consider the race of a particular individual. If that is the case, then even if you know a person's race, that is something that you would not consider as input to your algorithm or your analysis.
Just as you're making choices with respect to your attributes, you're also making choices with respect to the metric against which you are going to evaluate your output. So let's consider an example.
This is a real-life example. Kim Kardashian has 15 million followers on Twitter, and a particular Company X paid her to tweet about its products. Now here's what happened. She has 15 million followers, so 15 million people presumably saw her tweet about Company X. 1,200 of these visited the Company X website. 30 of these 1,200 placed orders worth $30 each on average. So the company got $900 in sales. And there's still some question of whether all $900 of these sales were actually attributable to Kardashian's tweet. Maybe some of these people would've bought products from the company anyway. It's also possible there are people who saw her tweet, immediately visited the company's website, and came back later, and they're not accounted for in this $900. Let's worry about these things later. Let's ignore them for now. This is a very standard way that data analysis gets done. We just made a big assumption: yes, there are these questions, we don't know the answers to them, so let's assume those effects are zero. And do I know that's a good assumption? No, I don't. And if those assumptions are bad, the results of my analysis are going to be bad. But sticking with that assumption for a minute, if we want to figure out the value of buying Kim Kardashian's time for this tweet, there are many ways you can do this. You can say, hey, I got 15 million people to have an impression, to see my ad, and it's worth, let's say, a tenth of a cent per view. So having Kim Kardashian do this for me is worth $15,000, because she has 15 million followers. Another way to do it is to say it's worth $1 to me for every new customer who comes through my store. Kim Kardashian was able to get 1,200 people to visit my website, and since each new visitor is worth $1, that is worth $1,200. The third way to do it is to say, no, I want to measure how much profit I got because of this.
So I got $900 in sales, and let's say I have a 20% profit margin, so I got $180 of extra profit because of this tweet. And so if I have to negotiate a contract, and the question is how much I would be willing to pay for her to tweet about my company's product, the answer could be anywhere from some fraction of $180 to some fraction of $15,000, depending on the measure that I am using as the thing that I am looking for. Here I set up this problem to make clear that what one is really looking at is the sales. Often one may not have that, and therefore one is pricing on things like page views or impressions, and the results can be astonishingly different.
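To make the arithmetic concrete, here is a minimal sketch of the three valuations in Python. The figures are the ones quoted in the example; the variable and function names are hypothetical, chosen only for illustration, and the per-unit values (a tenth of a cent per view, $1 per visitor, a 20% margin) are the assumed numbers from the example rather than measured ones. It also carries over the assumption that all $900 in sales is attributable to the tweet.

```python
from decimal import Decimal

# Figures quoted in the example.
followers = 15_000_000
site_visitors = 1_200
orders = 30
average_order_value = Decimal("30")

# Per-unit values assumed in the example (illustrative, not measured).
value_per_impression = Decimal("0.001")  # a tenth of a cent per view
value_per_visitor = Decimal("1")         # $1 per new visitor
profit_margin = Decimal("0.20")          # 20% margin on sales


def tweet_valuations():
    """Return the value of the tweet under each of the three metrics."""
    # Assumes every order is attributable to the tweet.
    sales = orders * average_order_value
    return {
        "impressions": followers * value_per_impression,
        "visitors": site_visitors * value_per_visitor,
        "profit": sales * profit_margin,
    }


for metric, value in tweet_valuations().items():
    print(f"{metric:<12} ${value:,.2f}")
```

The same event comes out at $15,000, $1,200, or $180 depending only on which metric is chosen, which is the point of the example.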