Okay, let's break down the convolution operation itself. Just to remind you about the general architecture of a convolutional neural network: you have some sort of input image, and convolutional filters are applied to that image via a convolution operation. That operation is repeated, building up feature maps over and over again, to get to a high-level representation of the image that can be fed to a classifier.

So, what exactly is that convolution operation? The convolution is essentially a sliding of two signals, one over the other, that helps search for particular features in the signal of interest. I'm going to introduce the convolution to you in one dimension. We have two signals here in 1D: a signal f, a square wave, and a signal g, a sort of single sawtooth. The mathematical definition of the convolution of f with g is given here at the bottom of the slide. If you unpack the right side, you can see that the idea is that you multiply g times f at every position from minus infinity to infinity, and then you sum up the result of that operation, basically taking the area under the curve of that multiplication and assigning that sum to a single value of the convolution. That operation is performed multiple times for various values of n, which end up being various lags or shifts of the signal g relative to f.

I'll just show you a few examples of how that's done in this one-dimensional case. I'll add some reference values here: an axis with zero, minus five, and five. If I plug in minus five for n in that convolution equation, I assign minus five to n on the right side as well, and if I factor out the negative sign, what you'll see is that I now have g of minus the quantity m plus five.
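
Written out, that substitution looks like the following. This assumes the slide shows the standard discrete form of the convolution, with m as the summation index; the transcript does not reproduce the formula itself, so treat the notation as an assumption.

```latex
(f * g)[n] = \sum_{m=-\infty}^{\infty} f[m]\, g[n - m]
\qquad\Longrightarrow\qquad
(f * g)[-5] = \sum_{m=-\infty}^{\infty} f[m]\, g[-5 - m]
            = \sum_{m=-\infty}^{\infty} f[m]\, g\bigl[-(m + 5)\bigr]
```
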
If you recall from algebra, having an m plus five, so f of x plus five, shifts the function to the left by five, so I'm showing here a shift of that triangle relative to the blue by five, and then the negative sign performs a reflection about the midline of that signal, the red signal. The result of that shift and then reflection is shown here. If I multiply these two signals together and then sum up the results, I'm looking at the area, sort of the area of intersection between the red and blue curves, and this area is the value of the convolution, which in this particular case is 15.

Now, if I apply a shift of zero, so I plug in zero for n, I do not shift the red curve at all relative to the blue, but I do reflect it. Then again I multiply the signals and take the sum of the result, and in this case the area is shown here in yellow, which is slightly bigger than the area before. That's why the result is 20. If I continue to shift this red curve over to the right more and more, I get different values of the convolution at different lags or shifts as I move g relative to f. In this case, I've moved the red over by 10, I've reflected it, and then I've taken the area, and I get a smaller value because the region of overlap between those two signals is smaller. Of course, if I move this over far enough that there's no longer any overlap, all the positive values of each function are multiplied by zeros of the other, and I get no resulting signal or area. So the value of the convolution is zero once there is no overlap.

Okay. Here's an animation of a similar process, this time using two square waves, with g as the one that moves.
The red one is moving across the blue one, and as the red curve is shifted from left to right and begins to overlap with the blue signal, there is an area under the result of that multiplication. The black line is drawing out the value of that area, and thus the value of the convolution, for each particular lag. So that's the whole convolution in 1D.

What this means is that the convolutional filter, g in this case, can be used to specifically pull out features in f that match it. If you have a matched feature, you're going to get a high value in the convolution. Here's an example of that. If the filter g, at a shift of zero, overlaps perfectly with f, we see that the result of that convolution is area accumulated in those two side lobes at the top, and then a substantial amount of area accumulated at the bottom, where you have negative values of g multiplied by negative values of f. But if we have this long square wave filter for g, when it's overlapping with f at zero lag you do get positive values for the area on these two side lobes, but in the middle, where previously you had negative times negative equals positive, you now have positive times negative equals negative. So there's this very large negative area component subtracted from the yellow, resulting in a lower value for the convolution with that mismatched filter than with the matched filter on the left.

Okay, so that's in 1D. In the convolutional neural network we're trying to convolve a 2D image with a 2D filter, and you basically do the exact same thing except that the filter has a 2D extent. Here on the top is a three-by-three filter. It's being applied to a particular region of the image below: there's a multiplication, just like in the 1D case, and then a sum, and the resulting value of that sum is the value of the convolution at that particular shift.
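
The 1D procedure described earlier (reflect g, shift it by n, multiply by f, and sum) can be sketched in a few lines of Python with NumPy. This is only a sketch: the function name `convolve_1d` and all the signal values are made up for illustration and are not the signals on the slides. The second half mirrors the matched-versus-mismatched comparison at zero lag, assuming a symmetric feature with positive side lobes and a negative middle.

```python
import numpy as np

def convolve_1d(f, g):
    """Discrete convolution: out[n] = sum over m of f[m] * g[n - m]."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    out = np.zeros(len(f) + len(g) - 1)
    for n in range(len(out)):          # each n is one lag (shift) of g
        for m in range(len(f)):
            k = n - m                  # index into the reflected, shifted g
            if 0 <= k < len(g):        # g is treated as zero elsewhere
                out[n] += f[m] * g[k]
    return out

# Hypothetical signals: a square pulse f and a short ramp g.
f = np.array([0, 1, 1, 1, 1, 0], dtype=float)
g = np.array([1, 2, 3], dtype=float)
result = convolve_1d(f, g)
print(result)  # agrees with np.convolve(f, g)

# Matched vs. mismatched filter (values are illustrative assumptions).
feature = np.array([1, 1, -1, -1, 1, 1], dtype=float)  # side lobes up, middle down
matched = feature.copy()                                # same shape as the feature
mismatched = np.ones(6)                                 # long all-positive square wave

center = len(feature) - 1  # output index where the two length-6 signals fully overlap
matched_value = convolve_1d(feature, matched)[center]
mismatched_value = convolve_1d(feature, mismatched)[center]
print(matched_value, mismatched_value)  # matched is larger: 6.0 vs 2.0
```

With the matched filter, the negative middle of the feature is multiplied by the negative middle of the filter and adds to the sum; with the all-positive filter, that same region subtracts from it.
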
This filter is then moved along the image until you have the convolution of that filter with every single point of the image. Just to show you what that looks like with a particular pattern, imagine the filter were a cross, and there was a cross-like feature in the image. As the filter with the cross overlaps slightly with the feature in the image, you get a positive value for the convolution, but as it overlaps perfectly with the underlying feature, you get a higher value, shown here in darker purple in the resulting convolved image, or feature map. As it begins to slide off of the feature, you get lower and lower values in the convolved image, so that by the time you're done, you have a heat map in the one corner corresponding to where the feature you were searching for with that filter was in the image.

Now, just to show you in numbers what this looks like: if you have a filter here, W, you apply it to the image. This is just for one shift, so you apply it first in the top left-hand corner. You do element-wise multiplication of all those values, then you take the sum over the rows and columns of that particular piece where the filter is overlapping with the input, and then you deposit the result of that sum in the corresponding location of the convolved feature map, in this case a value of four.

This filter I'm showing you on the left is actually a real filter that people use to extract features from real images; it's an edge detector. Just to show you how we can form a real, meaningful map from running this filter over an image, what we're showing here now is a stock photo, a black-and-white photo. I run this filter over the image with various shifts, right?
So, I just move this filter and slide it all over this image, and the result of that convolution operation is an image in which I've detected the feature that the filter was designed to detect, in this case the edges of the image.

Okay, just to wrap up and really hammer home this point: what happens when I'm doing this convolution in 2D for feature extraction in the convolutional neural network is that a filter, a circle in this example, is moved over the image left to right, top to bottom, and at the points where the filter overlaps with the feature corresponding to it, you get big, high-amplitude hot spots. When that red circle was overlapping with the blue circle, there was a lot of signal, and so you see a high value for the convolution at that point in the resulting heat map on the right, whereas when the circle is overlapping with shapes that are not matched to it, you get lower values of the convolution. So you can see how the circle, as it's being passed over this image, is highlighting particular features and forming a feature map on the right that can be used by later layers of the network to understand the hierarchical properties and features within the input.
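
To make the 2D sliding multiply-and-sum concrete, here is a minimal sketch in Python with NumPy. The function name `slide_filter_2d`, the toy image, and the specific edge kernel are all assumptions for illustration; the filter values from the slide are not given in the lecture, so a common Laplacian-style edge kernel stands in for them. Note also that this sketch does not flip the kernel, which is the convention CNN libraries typically follow (strictly speaking a cross-correlation); for a symmetric kernel like this one the result is the same either way.

```python
import numpy as np

def slide_filter_2d(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding.
    At each shift: element-wise multiply the overlapping patch, then sum."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):              # top to bottom
        for j in range(out_w):          # left to right
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# Hypothetical 6x6 image: dark left half, bright right half (one vertical edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Assumed edge-detecting kernel (Laplacian-style); not the one from the slide.
edge_kernel = np.array([[-1, -1, -1],
                        [-1,  8, -1],
                        [-1, -1, -1]], dtype=float)

edges = slide_filter_2d(image, edge_kernel)
print(edges)  # 4x4 map; each row has values 0, -3, 3, 0
```

The flat regions of the toy image give zero, and the only nonzero responses sit on either side of the boundary between the dark and bright halves, which is the feature this kernel responds to.
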