Learn the fundamentals of digital signal processing theory and discover the myriad ways DSP makes everyday life more productive and fun.


A course from École Polytechnique Fédérale de Lausanne

Digital Signal Processing



From this lesson

Module 6: Digital Communication Systems - Module 7: Image Processing

- Paolo Prandoni, Lecturer, School of Computer and Communication Sciences

- Martin Vetterli, Professor, School of Computer and Communication Sciences

We will define the concept of filtering in the context of image processing.

We will classify the type of filters that we use in conjunction with images and

then we will finish with some examples.

A good starting point to define the filtering operation in the space of

digital images is to extend the concept of

1D filtering to two dimensions.

And the ideas that admit a natural extension to 2D

are linearity, the concept of invariance,

which in the case of images becomes space invariance rather than time invariance.

The concept of impulse response and its transform, the frequency response.

The concept of stability, and the concept of the constant coefficient difference

equations, which in this case will be two dimensional.

With these ingredients, we already have a fully functional filtering paradigm for

2D signals.

However, when it comes to images, the problem with linear space invariant

operators is that images are a very specialized type of 2D signal.

As we said before,

images are designed to be interpreted by the human visual system and as such they

contain a lot of semantics which is lost on simple operators such as filters.

Consider for instance the photograph of the scenery outside of your window.

In the same picture very many different things co-exist.

You have pictures of people, of cars, of buildings, maybe the sky,

maybe some natural landscape.

A space-invariant filter will process every item in the same way.

But, it is kind of intuitive that we would like

different things to be processed differently.

Edges, for

instance, should be treated very differently than gradients like the sky.

And textures represent a new challenge altogether.

Nonetheless, there are some simple operations that can be performed with

standard linear filters.

And therefore we will continue in the tradition of one dimensional processing,

and put in place a classification of filters according to their properties.

In 2D, we can distinguish as well between IIR and

FIR filters based on the support of the impulse response.

We can distinguish between causal and noncausal filters.

And we can classify filters according to the frequency response.

In particular, lowpass filters will be used to perform smoothing operations

on images, whereas highpass filters are used to enhance an image and

perform edge detection.

Speaking of edges, here we come to the first issue

with respect to IIR filters in image processing.

You remember edges are points of discontinuity in the greyscale signal.

And as such, they require that the phases

of the Fourier components that make up the image be precisely aligned.

Now, since you cannot have an IIR filter with linear phase, the filtering

operation will always affect the edges negatively if we use an IIR filter.

A second problem with IIR filters is related to border effects.

Consider the convolution operation in 1D.

We have that y[n] is equal to the sum, for k that goes from minus

infinity to plus infinity, of the impulse response h[k] times x[n - k].

Now, if the signal x has finite support,

when the index k ranges from minus infinity to plus infinity, we have to make an assumption

for the values of the signal outside of its support.

And usually we assume that these values are 0.

So there are two border points in a 1D finite-support signal, its beginning and

its end.

When we compute the output of the filtering operation, the choice for the value of

the input signal outside of its support will influence a number of output points.

If the filter is FIR and of length L, at most L points after the beginning

boundary point of the signal will be affected and the rest will be okay.

If we use an IIR filter on the other hand,

because of the recursive relation, all the output

will be affected by the fact that we chose some 0 values outside of the support.
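The difference between the two cases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the test signal is flat, so that with the "right" values outside the support every output sample would be 1, and the function names, the filter length of 4, and the leaky-integrator coefficient 0.9 are all choices made for this example, not taken from the lecture.

```python
def fir_moving_average(x, L):
    """Causal length-L moving average; samples before x[0] are assumed to be 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(L):
            if n - k >= 0:              # outside the support we assume zeros
                acc += x[n - k]
        y.append(acc / L)
    return y


def iir_leaky_integrator(x, lam):
    """y[n] = lam * y[n-1] + (1 - lam) * x[n], with y[-1] assumed to be 0."""
    y, prev = [], 0.0
    for sample in x:
        prev = lam * prev + (1.0 - lam) * sample
        y.append(prev)
    return y


x = [1.0] * 20                          # flat signal: the ideal output is 1 everywhere
fir_out = fir_moving_average(x, 4)
iir_out = iir_leaky_integrator(x, 0.9)

# Indices where the zero assumption has changed the output
fir_affected = [n for n, v in enumerate(fir_out) if abs(v - 1.0) > 1e-12]
iir_affected = [n for n, v in enumerate(iir_out) if abs(v - 1.0) > 1e-12]

print(fir_affected)                     # [0, 1, 2]
print(len(iir_affected))                # 20
```

With the FIR filter only the first few samples differ from 1. With the recursive filter the error decays but is still present in every one of the 20 output samples.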

In a finite-support 2D image, the number of border points

is proportional to the square root of the number of pixels contained in the image.

So the effect of the border will be exacerbated and

it will come from all sides or from all directions.

So in general, we tend to prefer short FIR filters in image processing

in order to minimize the effects of the border on the output image.

Another issue is related to the design of stable IIR filters.

You remember that in one dimension, we can check for the stability of a filter

simply by looking at the position of its poles on the complex plane.

However, the fundamental theorem of algebra does not hold

in multiple dimensions, and therefore,

there is no simple way to find the roots of a multidimensional transfer function.

As a consequence, there is no simple stability criterion for

two dimensional filters.

A final issue with the multidimensional IIR filters is computability.

In other words,

we have to be careful because in multiple dimensions, we could come up

with constant coefficient difference equations that are not computable.

Here is a simple example.

Consider the following filter where each output sample

is computed as a linear combination of four previous output samples,

this is what makes a filter IIR, plus the input.

So here in order to compute the output in 0,0,

we need the contribution of these four samples.

They are going to be summed together, but for

instance, to compute this sample here, which is necessary to compute the output in 0,0,

we need to sum the following four previous output samples.

Okay, so this sample now depends on this sample here.

But this sample is what we were trying to compute in the first place.

So we have an unbreakable self referential loop

that makes this filter non-computable.
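The self-referential loop can be detected mechanically. The sketch below is a hypothetical helper, not something from the lecture: it takes the list of offsets of the output samples used by the recursion and checks whether the output in (0, 0) ends up depending on itself. The slide's exact filter is not in the transcript, so the four-neighbour mask used here is an assumption chosen to reproduce the same kind of loop.

```python
from collections import deque


def depends_on_itself(offsets, radius=4):
    """Return True if y[0, 0] transitively depends on itself.

    offsets: list of (d1, d2) meaning y[n1, n2] uses y[n1 + d1, n2 + d2].
    The search is limited to |n1|, |n2| <= radius, so a False result only
    means that no loop was found inside that window.
    """
    start = (0, 0)
    seen = set()
    queue = deque([start])
    while queue:
        n1, n2 = queue.popleft()
        for d1, d2 in offsets:
            nxt = (n1 + d1, n2 + d2)
            if nxt == start:
                return True             # we came back to the sample we wanted
            if nxt in seen or max(abs(nxt[0]), abs(nxt[1])) > radius:
                continue
            seen.add(nxt)
            queue.append(nxt)
    return False


# Assumed mask: left, right, upper and lower neighbours of the current output
four_neighbours = [(-1, 0), (1, 0), (0, -1), (0, 1)]
# A mask that only looks "backwards" in both directions
quarter_plane = [(-1, 0), (0, -1), (-1, -1)]

print(depends_on_itself(four_neighbours))   # True
print(depends_on_itself(quarter_plane))     # False
```

The second mask never loops because every step moves towards smaller indices, so the recursion can be computed by scanning the image in raster order.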

Okay, so IIR filters are pretty much out for us at least.

So let's look at some practical FIR filters.

One of the advantages of having the whole image available for processing

at the beginning is that causality is no longer an issue, as we said before.

So we can design FIR filters, whose impulse response is symmetric around zero

and therefore they introduce no delay.

A consequence is that the number of taps of the impulse response will be odd

in both dimensions.

So, for instance, something like this.

The per-sample complexity of an FIR filter is,

as we said before, M1 times M2 operations per sample, where M1 and

M2 are the dimensions of the support of the filter.

However, in the case of separable FIRs,

the computational requirement drops to M1 plus M2.

And of course, just like in the 1D case, FIR filters are always stable.

So let's revisit some classics.

Moving average in two dimensions

is a simple separable extension of the 1D moving average.

Remember in 1D we would take a window and average the samples under that window.

In 2D, we take a square window over the plane and

we average the pixels that fall under this window.

So mathematically, we express that as follows: for each output point at coordinates n1 and n2,

we take the sum of all the pixels centered around n1 and n2 for

an extension of 2N + 1 points in both directions.

Of course, we normalize the sum by the number of points that we used in

the average, and that turns out to be (2N + 1) squared.

The impulse response is again an extension of the 1D case.

In the 1D case it was a simple rect.

Here it's a two dimensional rect, where the rect extends over

2N + 1 points in both directions, and is normalized by (2N + 1) squared.

Needless to say, the moving average is a separable filter.

And therefore the number of operations per sample will be simply 2 times (2N + 1).
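As a rough check of the separability claim, the sketch below computes the 2D moving average twice: once directly over the square window, and once as a 1D average over the rows followed by a 1D average over the columns. The function names and the small test image are made up for this example, and zeros are assumed outside the image.

```python
def moving_average_direct(img, N):
    """(2N+1) x (2N+1) moving average with zeros assumed outside the image."""
    rows, cols = len(img), len(img[0])
    norm = (2 * N + 1) ** 2
    out = [[0.0] * cols for _ in range(rows)]
    for n1 in range(rows):
        for n2 in range(cols):
            acc = 0.0
            for k1 in range(-N, N + 1):
                for k2 in range(-N, N + 1):
                    r, c = n1 + k1, n2 + k2
                    if 0 <= r < rows and 0 <= c < cols:
                        acc += img[r][c]
            out[n1][n2] = acc / norm
    return out


def average_1d(seq, N):
    """Zero-centered 1D moving average of length 2N+1 with zero padding."""
    M = 2 * N + 1
    return [sum(seq[k] for k in range(max(0, n - N), min(len(seq), n + N + 1))) / M
            for n in range(len(seq))]


def moving_average_separable(img, N):
    """Same filter as a pass over the rows followed by a pass over the columns."""
    row_pass = [average_1d(row, N) for row in img]
    col_pass = [average_1d(list(col), N) for col in zip(*row_pass)]
    return [list(row) for row in zip(*col_pass)]      # transpose back


img = [[float((3 * r + 5 * c) % 7) for c in range(6)] for r in range(5)]
direct = moving_average_direct(img, 1)
separable = moving_average_separable(img, 1)
max_diff = max(abs(a - b) for ra, rb in zip(direct, separable) for a, b in zip(ra, rb))
print(max_diff < 1e-12)                               # True
```

The direct version touches (2N + 1) squared pixels per output sample, while each of the two 1D passes touches only 2N + 1.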

We can represent the impulse response of an FIR filter in two dimensions

as a matrix,

where, by convention, the center tap of the filter is the center point of the matrix.

So in this case, we have a moving average of three by three points,

where every element in the matrix is 1 and the normalizing factor is one over nine.

Let's try and apply the moving average to our original image

which is a 256 by 256 picture.

Here we have a moving average of side 11 pixels and

we can see that the effect is to blur the original image.

If we push the dimension to 51 points the blurring becomes very severe.

You can see here around the image the effect of the border

that I mentioned before in the context of IIR filtering.

This is the width of the impulse response and we see this discontinuity

because after 51 points, the zeroes that we assumed to be outside of

the original image no longer influence the output of the moving average filter.
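The darkening near the border is easy to reproduce on a toy example. The sketch below, which assumes a uniformly white 9 by 9 image and a 5 by 5 moving average (both chosen only for illustration), shows how the zeros assumed outside the image pull the output down near the edges and leave the interior untouched.

```python
def moving_average(img, N):
    """(2N+1) x (2N+1) moving average; pixels outside the image count as 0."""
    rows, cols = len(img), len(img[0])
    norm = float((2 * N + 1) ** 2)
    out = []
    for n1 in range(rows):
        out_row = []
        for n2 in range(cols):
            acc = 0.0
            for r in range(n1 - N, n1 + N + 1):
                for c in range(n2 - N, n2 + N + 1):
                    if 0 <= r < rows and 0 <= c < cols:
                        acc += img[r][c]
            out_row.append(acc / norm)
        out.append(out_row)
    return out


white = [[255.0] * 9 for _ in range(9)]     # uniformly white test image
blurred = moving_average(white, 2)          # 5 x 5 window

# Middle row: darker at both ends, unchanged once the window fits inside the image
print([round(v, 1) for v in blurred[4]])
# [153.0, 204.0, 255.0, 255.0, 255.0, 255.0, 255.0, 204.0, 153.0]
```

The corners are the darkest points because there the window overlaps the assumed zeros in both directions.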

Another popular low pass filter for images is the Gaussian blur.

In the Gaussian blur, we take an impulse response,

which is a two dimensional Gaussian function.

A cross section of this impulse response if we were to plot it

would look like this, the typical Gaussian characteristic.

So this filter computes a moving average, where the pixels

away from the center of the filter are weighted by a Gaussian characteristic.

Now the Gaussian function, whether one or

two dimensional, is not a finite support function.

So we arbitrarily truncate it and

set the impulse response to 0 beyond N samples from the center, where

N is approximately 3 times the standard deviation of the Gaussian characteristic.

If we were to plot the impulse response in Cartesian format, it would look like this.

We could also plot it as an image, and here you can see that

we're weighting more the points that are close to the center of the filter and

weighting less the points that are close to the corners.

The Gaussian impulse response has perfect circular symmetry and

it is also separable.

You can implement it as horizontal Gaussian filtering followed by

a vertical Gaussian filtering in one dimension.

The result is that we're less sensitive to border effects.

By appropriately choosing the standard deviation and

the support of the filter, we can achieve arbitrary smoothing power.

Here, for instance, we have a Gaussian filter with a standard deviation of 1.8,

so we choose N in this case to be approximately three times this,

which turns out to be around 5.

And so we have an 11 by 11 square filter.

Here, the standard deviation is 8.7 and

we choose N to be 25 in order to get a 51 x 51 blurring filter.
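A truncated Gaussian kernel of this kind could be built as in the sketch below. The two (standard deviation, N) pairs are the ones quoted in the lecture; the function names and the normalization to unit sum are assumptions made for this example. The 2D kernel is obtained as the outer product of the 1D taps, which is what separability means in practice.

```python
import math


def gaussian_taps(sigma, N):
    """Samples of a Gaussian on -N..N, normalized so that the taps sum to 1."""
    raw = [math.exp(-(n * n) / (2.0 * sigma * sigma)) for n in range(-N, N + 1)]
    total = sum(raw)
    return [v / total for v in raw]


def outer(col, row):
    """Outer product of two 1D tap lists, giving a 2D kernel."""
    return [[a * b for b in row] for a in col]


sigma, N = 1.8, 5
taps = gaussian_taps(sigma, N)
kernel = outer(taps, taps)                  # 11 x 11 separable kernel

# Sampling the 2D Gaussian directly gives the same kernel
direct = [[math.exp(-(a * a + b * b) / (2.0 * sigma * sigma))
           for b in range(-N, N + 1)] for a in range(-N, N + 1)]
total = sum(sum(row) for row in direct)
max_diff = max(abs(direct[i][j] / total - kernel[i][j])
               for i in range(2 * N + 1) for j in range(2 * N + 1))

print(len(kernel), len(kernel[0]))          # 11 11
print(max_diff < 1e-12)                     # True
print(len(gaussian_taps(8.7, 25)))          # 51
```

Filtering the rows with `taps` and then the columns with `taps` is therefore equivalent to filtering with the full 2D kernel.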

Now you can see that because of the smoothing characteristic of the Gaussian

impulse response, we're less affected by border effects.

There is still a darker halo around the border of the image but

it's less pronounced, because as we moved inwards,

the zeros outside of the image are weighted down by the Gaussian characteristic.

Let's now look at some high-pass filters for image enhancement.

The Sobel filter is a high-pass filter that computes an approximation

to the first derivative, either in the horizontal or in the vertical direction.

Let's look first at the horizontal Sobel operator.

This is a three by three FIR filter which we can express in matrix form as such.

Remember the center item in the matrix corresponds to the origin of the filter.

It turns out the Sobel operator is separable, and

it is composed of a vertical filtering operation on three taps given

by this impulse response, followed by a horizontal filtering operation

on three taps given by this impulse response.

Now you remember we have seen in Module 6.6 that

the impulse response of the discrete-time differentiator is like so:

h[n] is equal to 0 for n = 0 and

equal to (-1)^n / n for n not equal to 0.

So this three-tap filter here is the three-tap

approximation of the ideal differentiator in discrete time.

The first part here is a three tap low pass filter.

The impulse response looks like so, and what this filter does is average together

three subsequent lines in the image before we compute the differentiation.

This helps us combat the noise that a simple differentiation operation

would end up amplifying.

The vertical Sobel filter approximates the derivative in the vertical direction.

And its impulse response is simply the transpose of the horizontal Sobel filter.
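The matrices shown on the slides are not part of the transcript, so the sketch below assumes the commonly used Sobel coefficients: a [1, 2, 1] smoothing filter in one direction and the three central taps of the differentiator formula recalled above in the other. Sign and orientation conventions vary between references, so treat the exact signs as an assumption.

```python
def differentiator_tap(n):
    """h[n] = (-1)^n / n for n != 0 and h[0] = 0."""
    return 0.0 if n == 0 else (-1.0) ** n / n


# Three central taps h[-1], h[0], h[1] of the ideal differentiator
diff_taps = [differentiator_tap(n) for n in (-1, 0, 1)]

# Assumed three-tap lowpass used to average adjacent lines
smooth_taps = [1.0, 2.0, 1.0]

# Horizontal Sobel: smooth along the vertical axis, differentiate along the horizontal axis
sobel_h = [[s * d for d in diff_taps] for s in smooth_taps]

# Vertical Sobel: the transpose
sobel_v = [list(row) for row in zip(*sobel_h)]

print(diff_taps)    # [1.0, 0.0, -1.0]
for row in sobel_h:
    print(row)
# [1.0, 0.0, -1.0]
# [2.0, 0.0, -2.0]
# [1.0, 0.0, -1.0]
```

Every row of `sobel_h` is a scaled copy of the differentiator taps, which is another way of seeing that the filter is separable.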

If we now apply the Sobel filter to our usual image, we can see that as we expect

from our differentiation operator, the uniform areas are sort of cancelled

out and the points of discontinuity, such as the edges, are enhanced.

In particular, the horizontal Sobel filter approximates the derivative in

the horizontal direction and therefore is particularly sensitive to vertical lines.

So here you see that these lines in the vertical direction are enhanced.

Conversely, the vertical Sobel filter differentiates in the vertical direction

and therefore will enhance the horizontal lines or

the lines that go pretty much in the horizontal direction.

We can combine the effect of horizontal and

vertical differentiation into the Sobel operator,

which is an approximation of the squared magnitude of the gradient.

The operator, called an operator because it's not a linear filter,

is indicated by the symbol nabla, which applied to an image x[n1, n2]

gives the squared magnitude of the horizontal filter's output

plus the squared magnitude of the vertical filter's output.

If we apply the Sobel operator to the image,

we obtain what we see here on the left.

It is customary to threshold the image on the left and

to reverse the role of black and white in order to obtain a contour image.

In other words, this image on the right, y[n1, n2], can be defined as such:

the pixel at coordinates n1 and n2 will be equal to

zero (black) if the Sobel operator is greater than a given threshold,

and it will be equal to the maximum value (white, 255 or 1 depending on the scale) otherwise.
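Put together, the Sobel operator and the thresholding step might look like the following sketch. It assumes the commonly used Sobel coefficients, zeros outside the image, an 8-bit convention where white is 255, and an arbitrary threshold of 1000; the function names and the tiny test image with a single vertical edge are made up for this example.

```python
SOBEL_H = [[1, 0, -1],
           [2, 0, -2],
           [1, 0, -1]]
SOBEL_V = [list(row) for row in zip(*SOBEL_H)]      # transpose


def filter3x3(img, kernel):
    """Zero-centered 3x3 convolution; pixels outside the image count as 0."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for n1 in range(rows):
        for n2 in range(cols):
            acc = 0.0
            for k1 in (-1, 0, 1):
                for k2 in (-1, 0, 1):
                    r, c = n1 - k1, n2 - k2
                    if 0 <= r < rows and 0 <= c < cols:
                        acc += kernel[k1 + 1][k2 + 1] * img[r][c]
            out[n1][n2] = acc
    return out


def sobel_operator(img):
    """Squared horizontal output plus squared vertical output, pixel by pixel."""
    gh = filter3x3(img, SOBEL_H)
    gv = filter3x3(img, SOBEL_V)
    return [[h * h + v * v for h, v in zip(rh, rv)] for rh, rv in zip(gh, gv)]


def contour_image(img, threshold):
    """Black (0) where the operator exceeds the threshold, white (255) elsewhere."""
    return [[0 if value > threshold else 255 for value in row]
            for row in sobel_operator(img)]


# 8 x 8 test image: dark left half, bright right half, so one vertical edge
image = [[0.0] * 4 + [100.0] * 4 for _ in range(8)]
contour = contour_image(image, threshold=1000.0)
print(contour[3][1:7])      # [255, 255, 0, 0, 255, 255]
```

Only the two columns next to the step are marked as contour in the interior. The pixels on the outer border of the image may also be marked, because the zeros assumed outside the image create an artificial edge there.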

Finally, let's have a look at the Laplacian operator in image processing.

So if we have a function in continuous space,

a function of two continuous variables t1 and t2,

the Laplacian is defined as the sum of the two

second-order partial derivatives with respect to the variables of the function,

the unmixed partial derivatives as they're called.

The Laplacian in this case measures the curvature of the function

interpreted as a surface in 3D space.

And so a high value for the Laplacian indicates a point of high curvature.

Now if you consider that an image is a surface in 3D space

where the height of the surface indicates the grayscale level of the image,

then a region of high curvature most likely indicates an edge.

So the Laplacian operator gives us another way to find edges in an image.

But before we can apply the Laplacian operator to digital images,

we have to find a discretized representation for this operator.

To do so we start by considering the Taylor expansion

of a generic function of one variable f of t.

And we can write f of t plus tau, where tau is a small increment as the sum for

n that goes from 0 to infinity of the nth derivative of f computed in t,

divided by n factorial and multiplied by tau to the power of n.

We now use Taylor's expansion formula to compute the values of f

in (t + tau) and (t - tau).

So in (t + tau), we start the expansion and we stop at the second-order term.

And we get f(t) + f'(t) tau + (1/2) f''(t) tau^2.

And for (t - tau),

we get exactly the same formula, with the difference that the sign of the first-order term changes.

Now if we sum these two things together and we rearrange the terms,

we get an approximation for the second-order derivative of

the function in t that can be expressed as 1 over tau squared

that multiplies f(t - tau) - 2 f(t) + f(t + tau).
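Written out, the steps of this derivation are as follows, with both expansions stopped at the second-order term as described above:

```latex
\begin{align*}
f(t+\tau) &\approx f(t) + f'(t)\,\tau + \tfrac{1}{2} f''(t)\,\tau^2 \\
f(t-\tau) &\approx f(t) - f'(t)\,\tau + \tfrac{1}{2} f''(t)\,\tau^2 \\
f(t+\tau) + f(t-\tau) &\approx 2 f(t) + f''(t)\,\tau^2 \\
f''(t) &\approx \frac{1}{\tau^2}\bigl[\, f(t-\tau) - 2 f(t) + f(t+\tau) \,\bigr]
\end{align*}
```

The first-order terms cancel in the sum, which is why only the second derivative survives.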

Now this is a linear combination of three values of the function f,

computed at regularly spaced points on the real line,

namely in t - tau, t, and t + tau.

So they're all separated by an interval of width tau.

So if we now go to the integer grid by setting tau equal to 1, we can see that

this is nothing but a three-tap FIR filter with impulse response [1 -2 1].

So we can approximate the second derivative at a given point with a zero-centered

FIR filter, like so.
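A quick numerical sanity check of the three-tap filter, as a sketch only: the helper name and the two test functions are chosen for illustration. On samples of t squared the filter returns exactly 2, the true second derivative, and with a small step tau it approximates the second derivative of a sine.

```python
import math


def second_difference(samples):
    """Apply the zero-centered taps [1, -2, 1]; only interior points are returned."""
    return [samples[n - 1] - 2 * samples[n] + samples[n + 1]
            for n in range(1, len(samples) - 1)]


# f(t) = t^2 sampled on the integer grid (tau = 1): f''(t) = 2 everywhere
quadratic = [n * n for n in range(6)]
print(second_difference(quadratic))         # [2, 2, 2, 2]

# f(t) = sin(t) around t = 1 with a small step: f''(1) = -sin(1)
tau = 0.01
sine = [math.sin(1.0 - tau), math.sin(1.0), math.sin(1.0 + tau)]
approx = second_difference(sine)[0] / tau ** 2
print(abs(approx + math.sin(1.0)) < 1e-4)   # True
```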

Now let's translate this in the case of an image.

Remember, we are trying to approximate the Laplacian.

And of course the Laplacian is for a function f,

the sum of the partial derivatives of second order.

So first of all, we have to embed the one dimensional FIR that we just found

into a 2D filter.

So if the impulse response for one dimension is [1 -2 1],

we can always convert this by putting zeroes everywhere else, right?

So we would have 0 0 0 here, then 1 -2 1, then 0 0 0.

And this gives us an approximation

of the second-order derivative, horizontally speaking.

The vertical component will be just the transpose of this.

And if we sum these two guys together to obtain the Laplacian, we have a three by

three FIR approximation to the Laplacian with this impulse response.
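The construction just described can be reproduced step by step, as in the sketch below. The variable names and the two small test images are chosen for illustration; the helper evaluates the filter at a single interior pixel, so no assumption about the border is needed.

```python
second_diff = [1, -2, 1]

# Embed the 1D filter in a 3 x 3 matrix: second derivative along the horizontal axis
horizontal = [[0, 0, 0],
              second_diff,
              [0, 0, 0]]

# The vertical component is the transpose
vertical = [list(row) for row in zip(*horizontal)]

# Summing the two gives the 3 x 3 approximation of the Laplacian
laplacian = [[h + v for h, v in zip(hr, vr)] for hr, vr in zip(horizontal, vertical)]
for row in laplacian:
    print(row)
# [0, 1, 0]
# [1, -4, 1]
# [0, 1, 0]


def laplacian_at(img, n1, n2):
    """Value of the filter output at an interior pixel (the kernel is symmetric)."""
    return sum(laplacian[k1 + 1][k2 + 1] * img[n1 + k1][n2 + k2]
               for k1 in (-1, 0, 1) for k2 in (-1, 0, 1))


ramp = [[2 * r + 3 * c for c in range(6)] for r in range(6)]    # flat gradient
step = [[0] * 3 + [100] * 3 for _ in range(6)]                  # vertical edge

print(laplacian_at(ramp, 2, 2))     # 0
print(laplacian_at(step, 2, 2))     # 100
print(laplacian_at(step, 2, 3))     # -100
```

A smooth gradient gives zero output, while the two pixels on either side of the edge give a strong response of opposite sign.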

If we now apply the Laplacian to our usual image,

we get an enhancement of the points of discontinuity.

To make this more visible we first threshold the result of

the Laplacian operator.

And then we reverse the roles of black and white.

And this is the image that we get.