the values from the first sample.
I'm going to use the first color from this color ramp, and
I'm going to plot it here.
So here is the distribution for the first sample, and then what I'm going to do is
write a loop that goes over each of the other samples.
It's going to go from 2 to 20, because I already did sample one and I'm going to
plot 20 of the samples. I've made a density plot, so in each iteration I'm
going to use lines to overlay another curve, in the next color from the color ramp, on top of that,
so when I do that I can see that some of those samples have nearly identical values
and some of them have big distributional differences between the samples.
That's likely due to technology and not due to biology.
So one thing that we can do is quantile normalization, like we talked about.
That's basically going to force the distributions to be exactly the same.
And so the way I'm going to do that is using the preprocessCore package.
I'm going to use the normalize.quantiles function.
I'm going to convert edata to a matrix and apply the function to it.
So now I have a new data set,
what this returns is a new data set of the same size.
So if I look at the dimensions of edata and the dimensions of norm_edata,
you can see they're exactly the same size, but
the values have been quantile normalized.
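To make concrete what "forcing the distributions to be exactly the same" means, here is a minimal sketch of the idea in plain Python. It is an illustration only, not the preprocessCore implementation: the function name `quantile_normalize` and the toy `samples` are made up for this example, and ties are broken by position for simplicity. Each sample is sorted, the sorted values are averaged across samples rank by rank, and every value is then replaced by the average for its rank.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length columns (one per sample).

    Hypothetical helper for illustration; ties are broken by position.
    """
    n_rows = len(columns[0])
    # Sort each sample's values independently.
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: the mean across samples at each rank.
    reference = [
        sum(col[r] for col in sorted_cols) / len(columns)
        for r in range(n_rows)
    ]
    normalized = []
    for col in columns:
        # Row indices of this sample, ordered from smallest to largest value.
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, idx in enumerate(order):
            # The value with a given rank gets the reference value for that rank.
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Toy data: three samples (columns) measured on four features (rows).
samples = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(samples)
```

The result has the same shape as the input, and every sample now contains the same set of values, just arranged according to that sample's own ranking.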
So now, again, I can make a density plot for the normalized data
of the distribution and it looks like this after normalization.
And then I can again loop over the first 20 samples and
add lines, overlaid on top of that plot.
And so
you see when I do that, they basically all land right on top of each other.
Now there's a little bit of variability down here on the low end. That's because
the quantiles for the very low values are difficult to match up,
so often you'll see a little bit of variation in the low values or
the really high values after quantile normalization.
But for the most part, the distributions lie exactly on top of each other now.
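One plausible way the low end can fail to match is tied values, for example many zeros in one sample. The sketch below assumes that tied values share the average of the reference values for their ranks. That is a common convention, but it is an assumption here and not necessarily how normalize.quantiles treats ties. The function name `quantile_normalize_avg_ties` and the toy data are hypothetical.

```python
def quantile_normalize_avg_ties(columns):
    """Quantile normalization where tied values share one averaged value.

    Hypothetical helper for illustration only.
    """
    n_rows = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: the mean across samples at each rank.
    reference = [
        sum(col[r] for col in sorted_cols) / len(columns)
        for r in range(n_rows)
    ]
    normalized = []
    for col in columns:
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        start = 0
        while start < n_rows:
            # Extend the run while the next value ties with the current one.
            end = start
            while end + 1 < n_rows and col[order[end + 1]] == col[order[start]]:
                end += 1
            # All tied values get the mean of the reference values they span.
            shared = sum(reference[start:end + 1]) / (end - start + 1)
            for pos in range(start, end + 1):
                out[order[pos]] = shared
            start = end + 1
        normalized.append(out)
    return normalized


# Sample A has three tied zeros at the low end; sample B has no ties.
samples = [
    [0.0, 0.0, 0.0, 7.0, 9.0],
    [0.0, 1.0, 2.0, 6.0, 8.0],
]
normalized = quantile_normalize_avg_ties(samples)
low_a = sorted(normalized[0])[:3]   # lowest three values of sample A
low_b = sorted(normalized[1])[:3]   # lowest three values of sample B
```

In this toy case the two samples agree on their two largest values but differ on the lowest three, which is the kind of low-end mismatch that would show up as a small wobble in the overlaid density curves.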