Case Study: Predicting House Prices

A course from the University of Washington

Machine Learning: Regression

From this lesson

Assessing Performance

Having learned about linear regression models and algorithms for estimating the parameters of such models, you are now ready to assess how well your chosen method should perform in predicting new data. You are also ready to select among possible models to choose the best-performing one.

This module is all about these important topics of model selection and assessment. You will examine both theoretical and practical aspects of such analyses. You will first explore the concept of measuring the "loss" of your predictions, and use this to define training, test, and generalization error. For these measures of error, you will analyze how they vary with model complexity and how they might be utilized to form a valid assessment of predictive performance. This leads directly to an important conversation about the bias-variance tradeoff, which is fundamental to machine learning. Finally, you will devise a method to first select among models and then assess the performance of the selected model.

The concepts described in this module are key to all machine learning problems, well beyond the regression setting addressed in this course.

- Emily Fox, Amazon Professor of Machine Learning, Statistics

- Carlos Guestrin, Amazon Professor of Machine Learning, Computer Science and Engineering

[MUSIC]

So, the first measure of error of our predictions that we can look at is

something called training error.

And we discussed this at a high level in the first course of the specialization,

but now let's go through it in a little bit more detail.

So, to define training error, we first have to define training data.

So, typically you have

some dataset, which I've shown here as these blue circles, and

we're going to choose as our training dataset just some subset of these points.

So, the greyed circles are ones that are not included in the training set.

The blue circles are the ones that we're keeping in this training set.

And then we take our training data and, as we've discussed in previous modules of

this course, we use it in order to fit our model, to estimate our model parameters.

Just as an example, with this dataset here,

maybe we choose to fit some quadratic function to the data and,

like we've talked about, in order to fit this quadratic function,

we're gonna minimize the residual sum of squares on these training data points.

So, now we have our estimated model parameters, w hat.

And we want to assess the training error of that estimated model.

And the way we do that is first we need to define some loss function.

So, maybe we look at squared error or absolute error,

any one of the many possibilities for our loss function.

And then the way training error's defined is simply as the average loss,

defined over the training points.

So, mathematically, this is simply 1 over N,

where N is the total number of observations in my training set,

times the sum of the loss over each one of those training observations.

And just to be very clear,

the estimated parameters were estimated on the training set.

They were chosen to minimize the residual sum of squares on

these same training points that we're now looking at again in defining this training error.
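The definition above can be sketched in a few lines of NumPy. This is only a minimal illustration, not the course's own code: the function names (`squared_error`, `absolute_error`, `training_error`) and the tiny square-footage and price arrays are made up for the example.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def squared_error(y, y_hat):
    """Squared error loss for each observation."""
    return (y - y_hat) ** 2

def absolute_error(y, y_hat):
    """Absolute error loss for each observation."""
    return np.abs(y - y_hat)

def training_error(loss, y_train, y_pred):
    """Average loss over the N training observations."""
    return np.mean(loss(y_train, y_pred))

# Hypothetical training set: size in thousands of square feet, price in $1000s.
sqft = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
price = np.array([200.0, 260.0, 310.0, 340.0, 420.0, 450.0])

# Fit a quadratic by least squares, i.e. minimize RSS on the training points.
w_hat = P.polyfit(sqft, price, 2)      # coefficients, lowest degree first
price_hat = P.polyval(sqft, w_hat)     # predictions on the same training points

print("squared-error training error: ", training_error(squared_error, price, price_hat))
print("absolute-error training error:", training_error(absolute_error, price, price_hat))
```

Note that the predictions are evaluated on the very points used to fit `w_hat`, which is what makes this a training error.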

So, we can go through this pictorially in the following example, where in this case

we're specifically looking at using squared error as our loss function.

And in this case, our training error is simply 1 over N

times the sum of the squared difference between our actual house sales price and

our predicted house sales price,

where that sum is taken over all houses in our training data set.

And what we see is that in this case where we choose squared error as

our loss function, then

the form of training error is exactly 1 over N times our residual sum of squares.

And I want to note here that there's some difference in convention that people use,

whether there's the 1 over N as the definition of training error, or not.

So, just be aware of that when you're computing training error and

reporting these numbers.

Here we're defining it as the average loss.

More formally, we can write our training error as follows, and

then we can define something that's commonly referred to

as RMSE; the full name is root mean square error.

And RMSE is simply the square root of our average loss on the training houses.

So, the square root of our training error.
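Written out, the two quantities just described might look as follows. The notation is an assumption made for this sketch rather than a transcription of the slide: y_i is the actual sales price of house i, x_i its input, and f_{\hat{w}}(x_i) the price predicted by the fitted model.

```latex
\text{Training error}(\hat{w})
  = \frac{1}{N} \sum_{i=1}^{N} \bigl( y_i - f_{\hat{w}}(x_i) \bigr)^2
  = \frac{1}{N}\,\mathrm{RSS}(\hat{w}),
\qquad
\mathrm{RMSE}(\hat{w})
  = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \bigl( y_i - f_{\hat{w}}(x_i) \bigr)^2 }
```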

And the reason one might consider looking at root mean square error

is because the units, in this case, are just dollars.

Whereas when we thought about our training error, the units were dollars squared.

Remember we're taking the squares of all these differences in dollars.

So, the result is dollars squared.

So, that's a little bit less intuitive as an error metric than just

an error in terms of dollars themselves.
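A small numeric sketch may make the point about units concrete. The four actual and predicted prices below are invented for illustration only.

```python
import numpy as np

# Hypothetical actual and predicted sale prices, in dollars.
actual = np.array([300_000.0, 450_000.0, 250_000.0, 500_000.0])
predicted = np.array([310_000.0, 430_000.0, 260_000.0, 520_000.0])

residuals = actual - predicted
training_err = np.mean(residuals ** 2)  # average squared loss, in dollars squared
rmse = np.sqrt(training_err)            # square root brings the units back to dollars

print(f"training error: {training_err:.0f} dollars^2")
print(f"RMSE:           {rmse:.2f} dollars")
```

The training error here is a very large number in dollars squared, while the RMSE is on the same scale as the individual prediction errors.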

Now that we've defined training error,

we can look at how training error behaves as model complexity increases.

So, to start with let's look at the simplest possible model you might fit,

which is just a constant model.

So this is the simplest model we're gonna consider, or could consider,

and you see that there is pretty significant training error.

So let's just say that that has some value here,

this is the training error of the constant model.

Then let's say I fit a linear model.

Well, a line, I should say; these are all linear models we're looking at, since it's linear regression.

So, just fitting a line to the data.

And you see that my training error has gone down.

So, some other value that I'm showing with this pink circle here.

Then I fit a quadratic function and again training error goes down, and

what I see is that as I increase my model complexity to maybe this higher-order

polynomial, I have very low training error, just this one pink bar here.

So, training error decreases quite significantly with model complexity and,

now that we've gone through these examples, we can look at what the plot of

training error versus model complexity tends to look like.

So, there's a decrease in training error as you increase your model complexity.

And why is that?

Well, it's pretty intuitive, because the model was fit on the training points and

then I'm asking how well it fits those same points.

As I increase the model complexity, I'm better and

better able to fit my training data points.

So, then when I go to assess my training error with these high-complexity models,

I have very low training error.
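The trend described here can be checked with a short experiment. This is a sketch under stated assumptions: the data are synthetic (a noisy quadratic on a rescaled input, generated with a fixed seed), and polynomial degree stands in for model complexity, starting from the constant model at degree 0.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Synthetic training data (an assumption for illustration): noisy quadratic.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 15)  # rescaled "square feet"
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(scale=0.3, size=x.size)

def training_error(degree):
    """Fit a polynomial of the given degree by least squares and
    return the average squared loss on the same training points."""
    w_hat = P.polyfit(x, y, degree)
    y_hat = P.polyval(x, w_hat)
    return np.mean((y - y_hat) ** 2)

degrees = (0, 1, 2, 5, 9)  # constant, line, quadratic, higher-order polynomials
errors = [training_error(d) for d in degrees]

for d, e in zip(degrees, errors):
    print(f"degree {d}: training error = {e:.4f}")
```

Because each lower-degree polynomial is a special case of the higher-degree ones, the least squares fit of a more complex model can never have higher training error than that of a simpler one, which is why the printed values do not increase.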

So, a natural question is whether training error

is a good measure of predictive performance.

And what we're showing here is one of our high-complexity,

high-order polynomial models that had very low training error.

So it really fit those training data points well.

But how's it gonna perform on some new house?

So, in particular, maybe we're looking at a house in this gray region, so

with this range of square feet.

The question is, is there something particularly wrong with having

Xt square feet?

Because what our fitted function is saying is that I'm predicting

that houses with roughly Xt square feet are less valuable

than houses with fewer square feet, because there's this dip down in this function.

Do we really believe that this is a true dip in value, that

these houses are just less desirable than houses with fewer or more square feet?

Probably not.

So, what's going wrong here?

The issue is the fact that training error is

overly optimistic when we're going to assess predictive performance.

And that's because these parameters, w-hat, were fit on the training data.

They were fit to minimize this training error.

Sorry, minimize residual sum of squares,

which can often be related to training error.

And then we're using training error to assess predictive performance, but

that's gonna be very, very optimistic, as this picture shows.

So, in general, having small training error does not imply having

good predictive performance unless your training data set is really

representative of everything that you might see out there in the world.
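One way to see this optimism is to compare the error on the training points with the error on points that were left out of the fit, like the greyed circles earlier. The sketch below is illustrative only: the data are synthetic (a noisy line with a fixed seed), the split into every fourth point is arbitrary, and the degrees 1 and 8 are chosen just to contrast a simple model with a high-order polynomial.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Synthetic data (an assumption for illustration): a noisy linear trend.
rng = np.random.default_rng(1)
x_all = np.linspace(-1.0, 1.0, 40)
y_all = 1.0 + 2.0 * x_all + rng.normal(scale=0.3, size=x_all.size)

# Keep every 4th point as training data; the rest play the role of "new houses".
train = np.arange(0, 40, 4)
held_out = np.setdiff1d(np.arange(40), train)

def avg_squared_loss(w_hat, idx):
    """Average squared loss of the fitted polynomial on the points in idx."""
    y_hat = P.polyval(x_all[idx], w_hat)
    return np.mean((y_all[idx] - y_hat) ** 2)

results = {}
for degree in (1, 8):
    w_hat = P.polyfit(x_all[train], y_all[train], degree)  # fit on training points only
    results[degree] = (avg_squared_loss(w_hat, train),
                       avg_squared_loss(w_hat, held_out))
    print(f"degree {degree}: training error = {results[degree][0]:.4f}, "
          f"error on left-out points = {results[degree][1]:.4f}")
```

The degree-8 fit is guaranteed to have training error no higher than the line's, but nothing guarantees that it does as well on the left-out points; with so few training points it typically does much worse there.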

[MUSIC]