So what the cluster assignment step does is it doesn't change the cluster

centroids; instead, it is exactly picking the values of c(1), c(2),

up to c(m)

that minimize the cost function, or the distortion function, J.

And it's possible to prove that mathematically, but I won't do so here.

But it has a pretty intuitive meaning: let's assign each point to

the cluster centroid that is closest to it, because that's what minimizes

the squared distance between each point and its cluster centroid.
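As a sketch in code (the function name assign_clusters is just for illustration), the cluster assignment step can be written as picking, for each point, the index of the nearest centroid:

```python
import numpy as np

def assign_clusters(X, mu):
    """Cluster-assignment step: pick c(i) as the index of the centroid
    closest to x(i), which minimizes J with the centroids held fixed."""
    # Squared distance from every point to every centroid: shape (m, K)
    dists = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

# Two obvious clusters around (0, 0) and (10, 10)
X = np.array([[0.0, 0.0], [0.1, 0.2], [10.0, 10.0], [9.8, 10.1]])
mu = np.array([[0.0, 0.0], [10.0, 10.0]])
print(assign_clusters(X, mu))  # → [0 0 1 1]
```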

And then the second step of k-means, this second step over here,

was the move centroid step.

And once again I won't prove it, but it can be shown mathematically that what

the move centroid step does is choose the values of mu

that minimize J; so it minimizes the cost function J with respect to

(wrt is my abbreviation for "with respect to")

the locations of the cluster centroids mu 1 through mu K.
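A minimal sketch of the move centroid step (move_centroids is a name chosen just for this example): with the assignments held fixed, the J-minimizing position for each centroid is the mean of the points assigned to it.

```python
import numpy as np

def move_centroids(X, c, K):
    """Move-centroid step: set each mu_k to the mean of the points
    currently assigned to it; this minimizes J with the assignments
    c(1)..c(m) held fixed."""
    # Note: a cluster with no assigned points would give a NaN mean;
    # practical implementations drop or re-seed such centroids.
    return np.array([X[c == k].mean(axis=0) for k in range(K)])

X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]])
c = np.array([0, 0, 1])
print(move_centroids(X, c, 2))  # each centroid lands on its cluster mean
```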

So what it's really doing is taking the two sets of variables and

partitioning them into two halves right here:

first the c set of variables, and then the mu set of variables.

And what it does is it first minimizes J with respect to the variables c, and

then it minimizes J with respect to the variables mu, and then it keeps on alternating.

And so, that's all that k-means does.

And now that we understand k-means as trying to minimize this cost function J,

we can also use this to try to debug our algorithm and just kind of make sure

that our implementation of k-means is running correctly.
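One concrete way to do that debugging, sketched here with hypothetical helper names (distortion, kmeans_debug): record J after every half-step. Because each half-step minimizes J over one set of variables (first the assignments c, then the centroids mu), J can never increase; if it ever goes up, the implementation has a bug.

```python
import numpy as np

def distortion(X, c, mu):
    """Cost J: average squared distance from each point to its assigned centroid."""
    return ((X - mu[c]) ** 2).sum(axis=1).mean()

def kmeans_debug(X, mu, iters=10):
    """Run k-means, recording J after each half-step for sanity checking."""
    history = []
    for _ in range(iters):
        # Cluster assignment step: minimize J with respect to c
        d = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        c = d.argmin(axis=1)
        history.append(distortion(X, c, mu))
        # Move centroid step: minimize J with respect to mu
        # (keep a centroid in place if its cluster happens to be empty)
        mu = np.array([X[c == k].mean(axis=0) if (c == k).any() else mu[k]
                       for k in range(len(mu))])
        history.append(distortion(X, c, mu))
    return c, mu, history

X = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [11.0, 10.0]])
c, mu, history = kmeans_debug(X, X[[0, 2]].copy(), iters=5)
print(history[:4])  # → [0.5, 0.25, 0.25, 0.25]
```

Here J drops from 0.5 to 0.25 after the first move-centroid step and then stays flat, which is exactly the non-increasing behavior a correct implementation must show.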

So, we now understand the k-means algorithm as trying to

optimize this cost function J, which is also called the distortion function.