Having talked about the bases that we might use for forming market segments, how do we actually go about doing that? Well, that's where a technique called cluster analysis comes into play. As the name would suggest, we are going to try to form different clusters, where clusters consist of what we'll call similar consumers. Now, when we say similar, we have to decide on what basis we're going to assess that similarity. But whatever basis we choose, whether it's demographics, geography, or a psychographic profile built from a survey, the objective with cluster analysis is to put similar people into the same cluster, and people who are very different from each other should be in different clusters. Or we can think of it this way: I want the variance among people, the differences among consumers, to be small within a cluster, but I want that variance, those differences that exist among consumers, to be very big across different clusters. So if I put people in the same cluster, the group of people in that cluster should be relatively similar to each other. If I compare cluster one to cluster two, I should see a bigger difference when I look between the clusters than when I look within a single cluster. As I've noted here, one of the most important decisions that you make when you're doing cluster analysis, when you're doing segmentation, is the question of what is an appropriate basis for the segmentation. We need to form the segments based on factors that actually matter to us. That may be responsiveness to marketing activity. It may be the preferences, opinions, and attitudes that people ultimately hold, and some of those are core values for brands. So whatever you believe your advantage to be, let's say we're dealing with brands that project personas, well, we're going to want to make sure we capture those differences in the dimensions that we're using for segmentation.
What we don't want to do is kind of throw the kitchen sink in there, because we're going to get a lot of noise, and we're not going to be sure what to make of it. So you do want to give thought to what we should be putting in as the basis for segmentation. To give you an illustration of what cluster analysis is doing: in this graph, we've got two bell curves, which are normal distributions. You can think of these as two different clusters. I've got one cluster formed by this distribution, and I have another cluster formed by this distribution. Well, if we were to look at the overall difference between the two, we'd look at this particular difference between the peaks of the distributions. That's a measure of how big the difference is between those two groups, versus looking at the within-group variation. So what we're trying to do is assign individuals to the clusters such that the within-group variation is small and the between-group variation is big. As just a two-dimensional illustration, again, we're trying to form the clusters in such a way that the consumers who are assigned to a particular cluster are more similar to each other than they are to the consumers in other clusters. So the variation among consumers within a cluster is relatively small compared to the variation among consumers that exists across clusters. So, the decisions that we need to make. Well, we've already talked a little bit about which dimensions or which variables get used to form these clusters. And this is why we want to marry this technique with factor analysis. Factor analysis lets me take the entire survey and narrow it down to the core themes that reveal those underlying preferences and opinions that consumers hold. So rather than having 50 or 100 questions, maybe I've got 10 or 15 factors. Well, those factors, which convey a lot of the information from that survey and have revealed preferences and opinions, can become the basis for segmentation.
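To make the within-group versus between-group idea concrete, here is a minimal sketch in Python. The scores and the two candidate assignments are made up for illustration, and `within_between` is a hypothetical helper name, not something from the lecture. It splits the total sum of squared differences into a within-cluster part and a between-cluster part, so that a good assignment shows a small within value and a large between value.

```python
from statistics import mean


def within_between(clusters):
    """Return (within, between) sums of squares.

    `clusters` maps a cluster label to the list of scores assigned to it.
    """
    all_scores = [x for scores in clusters.values() for x in scores]
    grand_mean = mean(all_scores)

    within = 0.0
    between = 0.0
    for scores in clusters.values():
        cluster_mean = mean(scores)
        # Spread of respondents around their own cluster's mean.
        within += sum((x - cluster_mean) ** 2 for x in scores)
        # Distance of the cluster's mean from the overall mean,
        # weighted by how many respondents sit in the cluster.
        between += len(scores) * (cluster_mean - grand_mean) ** 2
    return within, between


# Hypothetical scores on a single dimension, assigned two different ways.
good_assignment = {"A": [1.0, 2.0, 3.0], "B": [7.0, 8.0, 9.0]}
poor_assignment = {"A": [1.0, 8.0, 3.0], "B": [7.0, 2.0, 9.0]}

good_within, good_between = within_between(good_assignment)
poor_within, poor_between = within_between(poor_assignment)
```

The same six scores are used in both assignments, so the total variation is identical. Only the split between within and between changes, which is exactly what the choice of clusters controls.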
Next, we've got to decide, from a technical standpoint, how we go about doing the clustering itself. So what is the algorithm that we're going to use, and how many clusters should we have? We're going to look at two different sets of approaches for conducting cluster analysis, and that's going to help us answer that question of how many clusters I actually need. And then the last step, one that the computer's not going to be able to help you out with, is going to be naming those clusters. So based on the variables, the dimensions that went into the cluster analysis, and the results that we get from cluster analysis, we're going to look to identify the different patterns that exist across clusters, and that's going to allow us to build a profile of the prototypical customer in each of those different segments. We'll use the results from the auto manufacturer example from when we talked about factor analysis. So we've factor analyzed that survey, and we've scored customers based on the different themes that we identified. Now we're going to take a look at how many segments exist among respondents and how we profile those segments. As far as which variables you want to include, as I mentioned, psychographic variables, the results from surveys, are very common to include. One thing that seems a little bit counterintuitive is that what you don't want to include in the clustering algorithm itself are the outcomes that you're interested in. So let's say you're interested in understanding customer satisfaction, or understanding customers' likelihood of coming back and patronizing the store. We want to understand the drivers of how much people are purchasing. Well, even though we're interested in those differences, we don't want to include those in our segmentation. And the reason for that is that if we include them in the segmentation, we can virtually guarantee that we're going to get segments that differ on that particular dimension.
The problem is that the segmentation just tells us, yes, you have some consumers who are very likely to buy your product and some consumers who are not likely to buy your product. It doesn't tell us who those consumers are, it doesn't tell us anything about those consumers, and it doesn't tell us how to go about reaching them. So what we're going to do is use the psychographic variables that we have, the survey responses that we've collected, as the inputs to our clustering. We're going to form our different market segments. We're going to assign individuals to different market segments, and then we're going to look at the average of the outcome measure, in this case purchase intention. Well, what's the average purchase intention for each of those segments? What we're looking for are differences in those purchase intentions that exist across clusters. Because if I find that, then what it tells me is, yes, customers vary in terms of their likelihood of buying the product, or how interested they are in your product. But what it also tells me is the psychographic profile of the customers who are more interested in your product. And once we've done that, what we want to look at is how we handle a new customer, somebody who wasn't a respondent on the survey. Well, based on information I can gather from them relatively quickly, how accurately can I assign that person to one of my segments? So, the two most common approaches that we can use for conducting our cluster analysis. One is hierarchical clustering, or tree-based clustering. The idea here is that the output is ultimately going to look like a tree, with a very thick core and then a lot of branches going out from it. What we're looking at is customers being connected to each other based on how similar they are to each other. So customers who are very similar to each other are going to be connected at one level. Customers who are very different from each other are going to branch off.
And we'll see those differences emerge very quickly. The other approach is non-hierarchical clustering. We're actually going to talk about how we can use these two approaches together to drive the insights that we're interested in. Let's start with hierarchical clustering. When we're looking at hierarchical clustering, there are two approaches that we can go with. One is agglomerative and one is divisive. Agglomerative clustering starts by treating each respondent as an individual cluster. Well, you can imagine, if I've got 2,000 survey respondents, I'm going to have a lot of clusters. What we then start to do is group respondents together, initially grouping together those individuals who are most similar. We put the respondents who are most similar to each other into the same cluster, and we keep on grouping based on the clusters that remain. So if I started with 2,000 respondents, maybe I've formed clusters where all of my segments are two respondents. So right now I've got 1,000 clusters of two. All right, well, now I'm going to group those 1,000 clusters together in such a way that I put together the most similar people again. And maybe that cuts us in half. Maybe we go from 1,000 to 500. We go from 500 to 250. And we gradually work our way back so that we ultimately connect all of the customers. So if we've started with this big base, what agglomerative clustering is doing is kind of bringing us all back together. And the higher the point where we're connected, the more different we are from each other. That's one approach. Divisive clustering is going to say, let's assume that we start with a single cluster. Now we're going to peel off, kind of one by one or in chunks, the respondents who are most different from that main group. So if I have my main group, I'm going to peel off the most different respondents. Now I've got most of my observations in the same group.
Now let me peel off a cluster that is most different from those who remain. And I'm going to kind of keep on peeling these pieces off in branches. This is referred to as tree-based clustering because the visual that's produced, the dendrogram, very much looks like a hierarchical tree with different levels of branching. Now, this is a very visually oriented approach. It's up to the researcher to determine how many clusters are going to be appropriate.
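As a rough sketch of the agglomerative idea described above, the following Python code starts with every respondent as its own cluster and repeatedly merges the two closest clusters until a chosen number remain. Everything specific here is an assumption made for illustration: the six respondents and their two factor scores are invented, average linkage with Euclidean distance is just one of several ways to measure how similar two clusters are, and the function names are hypothetical. Purchase intention is deliberately kept out of the clustering inputs and is only averaged per segment afterwards, in line with the earlier point about not clustering on the outcome.

```python
from itertools import combinations
from math import dist
from statistics import mean


def average_linkage(points, a, b):
    """Mean Euclidean distance over all pairs drawn from clusters a and b."""
    total = sum(dist(points[i], points[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns the clusters (lists of respondent indices) and the height,
    i.e. the linkage distance, at which each merge happened.
    """
    clusters = [[i] for i in range(len(points))]  # everyone starts alone
    heights = []
    while len(clusters) > n_clusters:
        best_pair, best_height = None, float("inf")
        for x, y in combinations(range(len(clusters)), 2):
            h = average_linkage(points, clusters[x], clusters[y])
            if h < best_height:
                best_pair, best_height = (x, y), h
        x, y = best_pair
        heights.append(best_height)
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters, heights


# Hypothetical factor scores for six respondents (two factors each).
factor_scores = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (5.0, 5.0), (5.0, 6.0), (6.0, 5.0),
]
# Hypothetical outcome, held out of the clustering inputs.
purchase_intent = [2, 3, 2, 6, 7, 6]

segments, merge_heights = agglomerate(factor_scores, n_clusters=2)

# Profile each segment on the held-out outcome.
segment_profile = {
    tuple(sorted(members)): mean(purchase_intent[i] for i in members)
    for members in segments
}
```

The `merge_heights` list plays the role of the vertical axis of a dendrogram: small early values mean very similar respondents were joined, and a big jump between consecutive heights is one visual cue a researcher might use when deciding how many clusters to keep. This naive loop is only meant to show the mechanics. For a survey of realistic size, a library routine such as SciPy's `scipy.cluster.hierarchy.linkage`, together with `dendrogram` for the plot, would be the more practical choice.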