[MUSIC] Hello everyone, this is Surya Kalidindi, I'm the instructor for the Materials Data Science and Informatics class. In today's lesson we learn about spatial correlations. In particular, we learn about n-point statistics. The desired outcomes for today's lesson are to define and understand n-point statistics, and we'll go over some applications of 2-point statistics as a specific example. In previous lessons, we learned about visual representations of an individual microstructure. Now we start looking at microstructure ensembles. An ensemble of microstructures is simply a collection of microstructures extracted from samples that are nominally subjected to the same processing history. Because the samples have been subjected to the same history, you expect the microstructures to be similar. Here you have an example of eight microstructures taken from eight different samples that are subjected to the same processing history. As you can see from this ensemble, the microstructures do indeed look somewhat similar. However, if you really compare them one on one, there are some differences; some are easily noticeable, some are not. So the question here is: how can one quantify the statistics associated with this ensemble of microstructures? Let's go back to a single microstructure. What we learned in the previous lessons is that we can represent a single discretized microstructure using the symbol m_s^n. Now we are going to add a superscript j to identify each element of the ensemble. The goal then is to identify and quantify the statistics of the ensemble by looking at the local patterns in these micrographs. The reason we are interested in the local patterns is simply that they are the signature of the microstructure. You can think of it as a fingerprint of the microstructure. 
And what we want to understand is how these local patterns differ from one microstructure to another. Certain aspects of these local patterns could be the fact that these ellipsoids look like plates, that they're long and thin, and that they're oriented in different directions. Those are signatures of local patterns, and we want to capture them in any statistical measures that we develop. It turns out that these are indeed what are called spatial correlations, and in this lesson we're going to define formally how to compute them and how to visualize them. Starting with a discretized microstructure, one can easily think of defining very simple statistics. The very first statistic anybody can think of in terms of data is what is called an expected value. The expected value of the variable m_s^n is simply its average over all the instantiations of that particular variable. So if you have J microstructures, you have J instantiations of that variable, and if you simply take the average you get what you would call a mean or an expected value. So f_s^n is the expected value, which, if you think about it, is nothing but the volume fraction of the local state of interest. The local state here is n, and therefore this variable simply captures the volume fraction of local state n in the spatial bin s. But if you do this over a large number of microstructure data sets, where J is large, then you would realize that the value of f_s^n should be independent of s. It shouldn't matter which particular cell you look at, because all of them should in fact approach the volume fraction of local state n. So if this is the case, one can define a simple statistic by dropping the subscript s. 
We don't need that anymore because it's independent of s, and we can simply describe it as an average not only over all the microstructures, but over all the cells in the microstructures. If you do this, you get essentially what you would call a 1-point statistic. It's a 1-point statistic because you're only looking at the local state present in one of the boxes. You're not accounting for anything in its neighborhood. Building on this idea, one can define a higher-order statistic, and this is where we move to the 2-point statistic, or 2-point spatial correlation, which is the joint probability associated with finding local state n at a particular spatial bin s while simultaneously finding another local state p at another spatial bin s + r. To get an idea of what that means: if I choose this as my s and this as my r, then s + r is nothing but this cell. You'll notice that r is indexed, or discretized, in the same fashion as the spatial bins. So suppose I want to know the probability of finding a particular local state n at the tail of this vector r, which is at s, and a particular local state p at the head of the vector, which is at s + r. It's a joint probability, so one simply needs to multiply the probabilities and average over all the microstructures. Just like we did in the last slide, we realize that if we do this over a large number of microstructures, the measure we defined is actually independent of s. We don't need s anymore. And because it's independent of s, just like before we can average not only over all the microstructure data sets, but also over all the cells that are present in each microstructure. This gives us a better, more reliable estimate. 
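The ensemble averages described above can be sketched in a few lines of Python. This is only a minimal illustration, not the course's implementation: it assumes each microstructure is a small 2D grid of integer labels where every cell holds exactly one local state (so m_s^n is either 0 or 1), and the names `expected_value_per_cell` and `one_point_statistic` are hypothetical.

```python
from typing import List

# Assumed layout: a microstructure is a 2D grid of integer local-state labels.
Microstructure = List[List[int]]


def expected_value_per_cell(ensemble: List[Microstructure], n: int) -> List[List[float]]:
    """f_s^n: average of m_s^n over the J ensemble members, one value per cell s."""
    J = len(ensemble)
    rows, cols = len(ensemble[0]), len(ensemble[0][0])
    return [
        [sum(1 for micro in ensemble if micro[i][k] == n) / J for k in range(cols)]
        for i in range(rows)
    ]


def one_point_statistic(ensemble: List[Microstructure], n: int) -> float:
    """f^n: average over all members AND all cells (volume fraction of state n)."""
    hits = sum(row.count(n) for micro in ensemble for row in micro)
    cells = sum(len(row) for micro in ensemble for row in micro)
    return hits / cells


# Toy ensemble with J = 2 members, each with 2 x 2 cells and local states {0, 1}.
ensemble = [
    [[0, 1], [1, 1]],
    [[1, 0], [1, 1]],
]
f_s = expected_value_per_cell(ensemble, n=1)  # per-cell expected value
f = one_point_statistic(ensemble, n=1)        # 1-point statistic, subscript s dropped
```

With only two members, the per-cell values still vary from cell to cell; the lecture's argument is that they approach the same volume fraction as J becomes large, which is why averaging over cells as well gives a more reliable estimate.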
The one detail that we need to pay attention to is that S_r is not the total number of cells. It's actually the number of spatial bins that allow the placement of both s and s + r within the microstructure volume being studied. Suppose there's a particular vector r of interest, say (2, 0). If I start with this cell and place (2, 0) there, the head of the vector is no longer in the microstructure domain. Therefore this particular placement cannot be counted. So typically S_r is less than or equal to S. In this particular example S is 16, and S_r would typically be less than 16, less than S. However, suppose we assume that the microstructure exhibits periodic boundaries, in other words that the same microstructure is repeated everywhere in all directions. If you assume that this happens, then the number of cells S_r is indeed equal to S. That's the special case, and we'll discuss both these cases separately as we go along. In summary for this slide, what we have is a definition of a 2-point statistic, and that definition is simply given by this product and the summation. Now let's go over the detail of the non-periodic boundaries, because we said that for non-periodic boundaries S_r is not equal to S. For non-periodic boundaries, as mentioned in the previous slide, we cannot use all the S cells. We can only use the spatial cells that are inside a smaller volume. For a vector of the size shown here, if I take that vector and put its tail in any cell of the lighter gray region, the head of the vector is going to go outside the domain. So this will not be allowed. Given that constraint, one can see that S_r is simply defined by this product, where S is defined by that product. And therefore S_r is typically less than or equal to S. 
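To make the role of S_r concrete, here is a rough sketch of a 2-point statistic computed by direct counting, for both the non-periodic and the periodic case. It is an illustration under stated assumptions rather than the lecture's own code: microstructures are 2D grids of integer labels with one local state per cell, r is taken as a (row offset, column offset) pair, and the function name `two_point_statistic` is hypothetical.

```python
from typing import List, Tuple

# Assumed layout: a microstructure is a 2D grid of integer local-state labels.
Microstructure = List[List[int]]


def two_point_statistic(
    ensemble: List[Microstructure],
    n: int,
    p: int,
    r: Tuple[int, int],
    periodic: bool = False,
) -> Tuple[float, int]:
    """Return (f_r^np, S_r): state n at the tail s, state p at the head s + r."""
    dr, dc = r  # assumed convention: (row offset, column offset)
    rows, cols = len(ensemble[0]), len(ensemble[0][0])
    successes = 0  # placements with n at s and p at s + r
    trials = 0     # all valid placements, i.e. J * S_r
    for micro in ensemble:
        for i in range(rows):
            for k in range(cols):
                hi, hk = i + dr, k + dc
                if periodic:
                    # The microstructure repeats in all directions: wrap around.
                    hi, hk = hi % rows, hk % cols
                elif not (0 <= hi < rows and 0 <= hk < cols):
                    # Head of the vector leaves the domain: not a valid trial.
                    continue
                trials += 1
                if micro[i][k] == n and micro[hi][hk] == p:
                    successes += 1
    if trials == 0:
        raise ValueError("vector r does not fit inside the microstructure")
    return successes / trials, trials // len(ensemble)


# One 4 x 4 microstructure, so S = 16 as in the lecture's example.
micro = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
f_np, s_r = two_point_statistic([micro], n=1, p=0, r=(2, 0))
f_per, s_per = two_point_statistic([micro], n=1, p=0, r=(2, 0), periodic=True)
```

For r = (2, 0) only 8 of the 16 cells admit the vector without periodic boundaries, while all 16 count when periodicity is assumed. Because the result is a number of successes divided by a number of trials, it always lies between 0 and 1.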
There is another way to interpret this entire equation, which you can see is essentially the same as this equation. Once we think in terms of what we are discussing here, it's easy to see that the numerator is a number of successes, a success being that you find the local state n in the spatial bin s and, at the same time, the local state p in the spatial bin s + r. When all that happens simultaneously, you call it a success. So the numerator is simply the number of successes and the denominator is simply the total number of trials. So f_r^np is interpreted as a probability, and therefore this definition of f_r^np will always lie between 0 and 1. It has to lie between 0 and 1. In summary, in this lesson we understood that the concept of 2-point statistics is very important for quantifying the statistics of the microstructure ensemble, and we also understood that 2-point statistics define the correlations between two points separated by a given vector r. Thank you. [MUSIC]