Case Study: Predicting House Prices

A course from the University of Washington

Machine Learning: Regression

From this lesson

Simple Linear Regression

Our course starts from the most basic regression model: Just fitting a line to data. This simple model for forming predictions from a single, univariate feature of the data is appropriately called "simple linear regression".

In this module, we describe the high-level regression task and then specialize these concepts to the simple linear regression case. You will learn how to formulate a simple regression model and fit the model to data using both a closed-form solution as well as an iterative optimization algorithm called gradient descent. Based on this fitted function, you will interpret the estimated model parameters and form predictions. You will also analyze the sensitivity of your fit to outlying observations.

You will examine all of these concepts in the context of a case study of predicting house prices from the square feet of the house.

- Emily Fox, Amazon Professor of Machine Learning, Statistics

- Carlos Guestrin, Amazon Professor of Machine Learning, Computer Science and Engineering

[MUSIC]

Okay, while working with a simple linear regression model,

let's talk about how we're gonna fit a line to data.

But before we talk about specific algorithms for fitting,

we need to talk about how we're gonna measure the quality of a fit.

So, we're gonna talk about this orange box here which is this quality metric.

And in this case, now that we've mentioned that our function is

parametrized in terms of some parameters we're calling w, which

represent w0 and w1, in this case our intercept and our slope.

We know that when we're going to predict house values,

instead of talking about f hat, our estimated function,

we can talk in terms of w hat, our estimated parameters.

Because those estimated parameters fully determine our estimated function.

So, we're gonna modify this block diagram, and replace f hat now with w hat.

And talk about estimating these parameters w.

So what's the cost of using a specific line?

Well the one we're gonna talk about here, and the one that we focused on

in the first course of the specialization, is Residual sum of squares.

And what residual sum of squares does is that we're

just gonna add up the errors we made between

this line here, which represents what we believe the relationship is,

or what we've estimated the relationship to be between x and y,

and what the actual observation y was.

So we're gonna take each one of these errors or

residuals and sorry, I should be clear.

I talked about error as the epsilon i.

The error was part of my model.

A residual is the difference between a prediction and an actual value.

Okay, so I wanna make sure that that's clear so

that's why this is called Residual sum of squares.

So this is the formula that we presented in the first course of the specialization,

so I'll run through it fairly quickly.

Our Residual sum of squares is gonna be a function of our two parameters, w0 and w1.

That determines what line we're looking at.

So of course as I change that line, we're changing the cost, this cost term, RSS.

And what I'm doing is I'm adding up the difference between

the specific house sales price of a given house.

And what my line specifies it as.

And what does the line specify the price to be?

Well, it specifies it as w0 + w1 times however many square feet this house had.

But I'm not just looking at that difference, nor its absolute value.

I'm looking at the square of the difference.

That's where the sum of squares comes in.

Well, that's where the squares comes in, and

then the sum is I'm adding this squared difference over all houses in my training data set.

Okay, so just to summarize, residual sum of squares is: I take the difference

between what the line is telling me the price of the house should be

and what the actual house price was, look at that difference squared, and

add over every house in my training data set.
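Editor's sketch: the summary above can be written as a few lines of Python. This is only an illustration, not code from the course. The function name `residual_sum_of_squares`, the toy square-footage and price values, and the example parameters are all assumptions made up for this note.

```python
def residual_sum_of_squares(w0, w1, sqft, price):
    """Sum of squared residuals for the line: price ~ w0 + w1 * sqft."""
    total = 0.0
    for x_i, y_i in zip(sqft, price):
        predicted = w0 + w1 * x_i    # what the line says the price should be
        residual = y_i - predicted   # actual price minus predicted price
        total += residual ** 2       # square it, then add over every house
    return total


# Made-up toy data: square feet, and sale price in thousands of dollars.
sqft = [1000.0, 1500.0, 2000.0]
price = [200.0, 310.0, 390.0]

# Hypothetical intercept and slope, chosen only to make the arithmetic easy.
rss = residual_sum_of_squares(10.0, 0.2, sqft, price)
print(rss)
```

With these toy numbers the line predicts 210, 310 and 410, so the residuals are -10, 0 and -20, and the squared residuals add up to roughly 500.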

So we saw this equation before.

But now I'm gonna write it more compactly, and we're gonna work with this form

throughout the rest of this module, and with forms like this in the rest of the course,

where I've introduced this notation, this capital Sigma.

What this means is if I write Sigma, and I write

i=1 on the bottom of this Greek letter

and a capital N at the top of this Greek letter,

what I'm saying is I'm summing over some quantity.

I'll just generically look at some quantity ai.

I'm summing a1 + a2 plus all the way up to aN.

So I'm summing up N different quantities.

And in this case here, what is ai?

Well, ai is just this inner thing here, so

it's yi - [w0 + w1 xi],

quantity squared.

Okay, so this is just shorthand notation for what we had on the previous slide,

where we're summing over all houses in the training data set.
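Editor's note: written out, the shorthand described above looks like the following. The symbols are the ones used in the lecture (y_i for the sale price of house i, x_i for its square feet, N for the number of houses in the training data set).

```latex
\sum_{i=1}^{N} a_i = a_1 + a_2 + \dots + a_N,
\qquad
a_i = \bigl(y_i - [w_0 + w_1 x_i]\bigr)^2

\mathrm{RSS}(w_0, w_1) = \sum_{i=1}^{N} \bigl(y_i - [w_0 + w_1 x_i]\bigr)^2
```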

But instead of writing this thing in English or writing out this really massive

sum over thousands and thousands of houses, I'm gonna write it compactly like this.

Okay, so now that we have this notation,

let's talk about how we think about finding the best line.

So we have this function that, if you give me any w0 and

w1, gives back a specific cost.

So for example this line, let's say that the intercept was 0.97 and

the slope was 0.85, well, that results in some RSS.

So let's call it just #1.

And then if I give you a different line, so that's specified by a different

intercept and a different slope, that's gonna result in a different cost.

So I'll just say that's some other number.

And then this other line with different parameters

has some other number associated with its cost as well.

And my goal here, when I'm talking about estimating a function from data,

given a specific model, which in this case is just a simple line,

is I'm gonna search over all possible lines, all w0 and

w1, shifting this line up and down and looking at different slopes.

And I'm gonna try and find the one

that results in the smallest residual sum of squares.

So out of these three lines, if those were the space of all lines I was looking at

(clearly it's not, it's a huge space I'm looking at),

I would choose the one with the smallest number here.
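Editor's sketch: choosing among three candidate lines by their cost might look like this in Python. The first candidate uses the intercept 0.97 and slope 0.85 mentioned in the lecture. The other two candidates, the data points, and the names `rss`, `candidates`, `costs` and `best_line` are assumptions invented for illustration. As the lecture says, the real search is over all possible lines, not just three, so this only shows the comparison step.

```python
def rss(w0, w1, xs, ys):
    """Residual sum of squares of the line y ~ w0 + w1 * x on the data."""
    return sum((y - (w0 + w1 * x)) ** 2 for x, y in zip(xs, ys))


# Made-up toy data (not from the lecture).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.9, 2.6, 3.6, 4.3]

# Each candidate line is an (intercept, slope) pair.
# Only the first pair comes from the lecture; the others are hypothetical.
candidates = [(0.97, 0.85), (0.5, 1.2), (2.0, 0.3)]

# One cost per line: the "some number" attached to each line in the lecture.
costs = {line: rss(line[0], line[1], xs, ys) for line in candidates}

# Pick the line whose cost is smallest.
best_line = min(costs, key=costs.get)

for line, cost in costs.items():
    print(line, round(cost, 4))
print("best:", best_line)
```

On this toy data the costs are about 0.0226, 1.38 and 1.86, so the first line is chosen.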

[MUSIC]