0:05

This week, we're talking about how to model sounds by combining the sinusoidal

or harmonic approach that we talked in the previous weeks with

the idea of a residual or stochastic component.

And the core aspect of that is how to subtract

the sinusoids of a sound from the actual sound.

And this is what we are going to be talking about in this programming lecture,

from a programming perspective.

So we are going to be implementing the subtraction of sinusoids from

signal and we're going to do it through the hprModel,

the implementation of the model that we have within the SMS tools package.

So basically, we're going to be talking about this blocked

diagram in which from the input sound and

all the analysis that we have already been talking about.

The computing of the spectrum,

detection of the peaks in the case of the harmonic models,

detection of the fundamental frequency and identification of the harmonics.

Then we are synthesizing these harmonics in the spectral domain, so

we're synthesizing the lobes of the harmonics.

And now in this week, what we are focusing on is this subtraction.

So we're focusing on how these sinusoids

can be subtracted from the original signal.

And what we do is we compute again

the spectrum of the input signal with the same parameters.

So therefore with the same window that is implicit or

in fact is quite explicit in the spectral domain of this sinusoids,

so that then we can subtract them.

And then this residual spectrum can be sum to the harmonic spectrum and

obtain the time domain synthesize signal by combining the two.

2:30

So first we import all the packages that we need for this analysis, okay?

We already have seen all of these packages and then we read a sound file.

So we're going to start from the oval sound on the A4 and we need to specify

some parameters, in particular we're going to be analyzing only one frame.

So we choose the location,

the pointer where we're going to be reading the sound from, so

the place for the 1,000 sample.

3:09

And then, well we need the window size m, we take 801 samples,

we need an FFT size bigger than that, so we take 2,048.

And with the threshold for identifying the peaks and

since we're going to be doing harmonic analysis, we need a minimum or

maximum frequencies to look for the the fundamental frequency.

Then we need a nearer threshold to have the lower bound for

the algorithm that we are using for F0 detection.

The two waves mismatch error function then we will decide how many harmonics,

the maximum number of harmonics that we are going to be identifying.

And this is the deviation slope that we had choose so that

the higher harmonics have a higher degree of deviation than the lower ones, okay.

So these are the parameters and now, we start by computing things.

So first, we compute the analysis window and we take the Blackman window,

4:19

then since we're going to be doing this subtraction,

that phase information is fundamental.

We need to do zero phase window, this is where it's

fundamental the idea of centering everything around zero, so

that the time domain basically wave forms they align perfectly.

So that the faces match and therefore the subtraction is possible

and here the outside window is again fundamental, so

that's why we need to compute the center of the window in this way.

Okay, then we choose the fragment of the sound and here the pointer.

So we choose the center of the window and

we take half of the samples from the left of this pointer and

half of the samples on the right of this pointer.

Okay, so this is going to be our sound that we analyze and

then we have seen all these before.

We compute the DFT of this fragment of the sound, we find the peaks,

we interpolate the peaks with parabolic interpolation and here we convert

locations to hertz, and now we detect the fundamental frequency.

Okay, so we called a two way mismatch algorithm and

it returns the best candidate for fundamental frequency.

And we identify all the harmonics by calling their harmonic

detection function that looks for the peaks that are closest to the multiples on

the fundamental frequency, okay.

And now, we can synthesize this harmonics that we have identified and

we synthesize it with another FFT and window that we did the original analysis.

So now, we take an FFT of 512 and half of that is 256 and

we're going to be synthesizing the spectral signs,

so the main slope of the blackman window, okay.

And this is the yh is the complete spectrum of this harmonic component.

6:40

And now this is what we are basically focusing on this week,

we have to subtract these harmonics from the original signal.

But to do that, we need to recompute the spectrum of the original signal, so

that we use the blackman harris window that is in this spectrum.

7:20

of the fragment of the sound but only the 512 samples around the pointer.

So here, we choose another original signal with these 512 samples and

multiple it by the blackman harris window normalize,

so that then it becomes easier to compute, to do the subtraction.

7:50

Okay, and so this our new input signal and then we have to zero phase window 8.

So we have to put it around zero and we do that by defining the FFT buffer,

we're going to use and center everything around zero, and

then we can compute the spectrum of that.

And the subtraction of the sinusoids then becomes easy,

it's just a complex subtraction.

We subtract the harmonics,

the harmonica spectrum from these new spectrum that we just compute.

So okay, so let's run this and let's step through the different variables

that we have been generating, so this is the file called test three.

8:39

Okay, so this has computed it and now, let's keep looking at the different

variables, so for example, let's plot the x1 sound, okay.

This is the fragment of the sound we are analyzing, okay,

then we can plot the resulting spectrum that we computed from that,

so we can plot mx, okay?

So this is the in DV, the magnospectrum of that.

9:10

Then it has compute out of that peak, so we can even just print

the locations of the peaks, so iploc is the peaks or

the peaks that it has found within the array of the FFT.

It's better to show it in Hertz so we print IP

frequency, is the frequency of the peaks it has found, right?

Then this has gone to the F0 detection,

so it has identified the fundamental frequency.

And the fundamental frequency has been chosen to be 443 Hertz,

which makes sense, we're analyzing a novo sound, an A4.

And then out of this fundamental frequency it has chosen

the peaks that are harmonics of these, so each frequency is

the set of harmonics that it has identified in this particular location.

So it has identified all these harmonics,

the other peaks have not been considered harmonics.

Of course, we have chosen a threshold, and

a given set of parameters that has limited the number

of harmonics to these 6,000, so it becomes easier,

so we just have analyze up into this harmonic.

Okay, then we generate these harmonics as a spectrum and

so it has generated yh, so we plot the absolute value of yh.

We're going to see the magnitude

of the complete spectrum, but

let's plot it just in DB so we will just

plot(20*log10(abs(Yh.

And let's just take only, let's say the first 70 samples,

so from the beginning to the sample 70.

So we focus on the first harmonics,

which are the ones that basically we have generated, okay.

So these are the harmonics that

we basically we have the synthesized spectrum.

On top of that we can plot the signal that

from the original sound that we have recomputed so x2, the x2 spectrum.

Okay, so we plot on top of that the absolute value of x2,

12:04

in here, we now see the green line which is basically the original spectrum,

and as you can see, it's very much similar.

So, that means that we very much have synthesized the original spectrum.

And then we can now plug the subtracted spectrum, so

the subtracted is the xr spectrum and this is this red line.

Okay, so the red line is the subtraction of the two and

it clearly shows the residual that it has.

Okay, so this works and basically we do that at every frame and

then of course we can do the inverse to generate all these signals back.

12:51

Now let me show you the actual code that is in the SMS tools package

that performs these harmonic plus residual modelling.

So there is this file call hprModel.py within the models directory,

underneath there is the analysis and synthesis of HPR modeling.

There is one function that does the analysis,

another one that performs a synthesis, and then there is another one that

does both the analysis and the synthesis at one frame at a time.

13:27

In fact, we recommend very much to do the analysis and synthesis separate so

that we can take advantage of cleaning the trajectories and

having some memory in the tracking.

So that the harmonics are better, and

in the analysis part of the function, it basically

calls two functions, one that we already have seen.

It does the harmonic analysis in the same way that a we saw it when we talked about

the harmonic model and then the new thing that it does is the subtraction.

So there is this function, sine subtraction, that has this input,

the harmonics identified by the harmonic model.

Has it input also the input sound again and

it subtracts the harmonics from this input sound.

So these function is in the utilFunction directory in the file and

in here you will find the sign subtraction function and

it basically does what I have explained just a while ago.

It goes through the sound and

then at every frame it does the subtraction of the sinusoids, so

the sinusoids have already been been identified.

So here, it iterates over all the analyzed frame, and

then it reads again a fragment of the input sound.

It recomputes the input sound with this window with the blackman harris window,

and then it synthesizes the harmonics with this blackman main lobe.

And then it's able to subtract

15:19

the harmonics from the input sound, and that's all.

And then it synthesizes the residual signal back and

that's the windowing effect, so we can do the overlap at correctly.

And that's all, that's what the sinusoidal subtraction does and

it returns of course the residual signal.

And the output of the analysis is the harmonic frequency's,

magnitudes and phases, and this residual signal.

16:22

Then we have this other function, hprmodelfunction.py

which is the file that puts it all together and it's the in

fact the file that is called from the interface that we have been using.

And it's simply, there is one function, main and that does

the analysis and synthesis of a sound and it plots the intermediate values.

So it calls the hprModelAnal, okay, and

then well it computes the spectrogram of the residual so

that we can show it as a spectrogram.

And then it performs the synthesis and it generates the overall out put sound

the sum of the residual plus the harmonics and just the harmonics,

and it writes the files into the same directory.

17:21

Okay so now we can execute this file then run this

HPR Model function file and it will analyze

a fragment of a saxophone sound, which is the default sound with the hpr model.

So we will get here, we see the harmonics of the sound and

the background is the spectrogram of the residuals.

So we have computed the spectrogram of the residuals and

here are the two together, and of course we can listen to them.

We can just in this directory it has save the residual sines and

the sum of the two, so we can just play the for

example first the residual.

[SOUND] Okay. And then we can play for

examples the sines.

[MUSIC]

And we can play the sum of the two.

[MUSIC]

Okay, and that's all, basically we have gone

through the harmonics plus residual model.

And we have used quite a bit of code from the SMS tools and

from Python to be able to subtract the harmonics from the residual.

18:48

And hopefully that has given you a programming perspective

to this idea of harmonic plus residual subtraction.

And it's not a very sophisticated thing from a signal processing point of view,

but from a programming point of view, it's quite delicate.

We have to be very careful in how to put the signals and

how to analyze them in order that they align perfectly.

And in order that the complex subtraction in the spectral domain is done

correctly and we obtain a real residual, and that's all.

So in next class we're going to take the next step which will be to,

from a programming perspective look at the stochastic components.

So how we can then convert these results that we just computed

into a stochastic component and then have the harmonics plus stochastic model.

So see you next lecture, bye bye.