Case Study: Predicting House Prices


From the course by University of Washington

Machine Learning: Regression


From the lesson

Multiple Regression

The next step in moving beyond simple linear regression is to consider "multiple regression", where multiple features of the data are used to form predictions.

More specifically, in this module, you will learn how to build models of more complex relationships between a single variable (e.g., 'square feet') and the observed response (like 'house sales price'). This includes things like fitting a polynomial to your data, or capturing seasonal changes in the response value. You will also learn how to incorporate multiple input variables (e.g., 'square feet', '# bedrooms', '# bathrooms'). You will then be able to describe how all of these models can still be cast within the linear regression framework, but now using multiple "features". Within this multiple regression framework, you will fit models to data, interpret estimated coefficients, and form predictions.

Here, you will also implement a gradient descent algorithm for fitting a multiple regression model.
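As a small illustration of how a polynomial fit is still "linear" in the coefficients, here is a minimal sketch (the data values and the degree-2 choice are made up for this example) that expands a single 'square feet' input into polynomial features and fits them with ordinary least squares:

```python
import numpy as np

# Hypothetical data: square footage and sale price for a handful of houses.
sqft = np.array([1180.0, 2570.0, 770.0, 1960.0, 1680.0])
price = np.array([221900.0, 538000.0, 180000.0, 604000.0, 510000.0])

# Feature matrix H for a degree-2 polynomial: each row is [1, sqft, sqft^2].
# The model is nonlinear in sqft but still linear in the weights w.
H = np.column_stack([np.ones_like(sqft), sqft, sqft ** 2])

# Closed-form least-squares fit, for comparison with gradient descent later.
w, *_ = np.linalg.lstsq(H, price, rcond=None)
print("fitted coefficients:", w)
```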

- Emily Fox, Amazon Professor of Machine Learning, Statistics
- Carlos Guestrin, Amazon Professor of Machine Learning, Computer Science and Engineering


>> So the simple alternative approach is gradient descent.

Remember that in the gradient descent algorithm,

we just initialize our vector of parameters somewhere and take these gradient steps.

And eventually, we will converge

to the optimum of this problem.

Okay, so what does this algorithm look like for multiple regression?

Well, it looks very similar to our simple linear regression,

where we say: while not converged, we're going to take our w parameters

and update them by subtracting some step size eta times the gradient

of our residual sum of squares, evaluated at our previous set of parameters w at iteration t.

So what is our residual sum of squares?

Sorry, the gradient of the residual sum of squares, which I'm writing right here:

this update is w at iteration t,

plus, because the minus sign in front of the step size and the minus sign in the gradient combine into a plus sign,

two eta times H transpose,

times the quantity y minus H w at iteration t.
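In symbols, the gradient and the resulting update described above are:

```latex
\nabla \mathrm{RSS}\big(\mathbf{w}^{(t)}\big) = -2\, H^{\top}\big(\mathbf{y} - H\mathbf{w}^{(t)}\big)
\qquad\Longrightarrow\qquad
\mathbf{w}^{(t+1)} = \mathbf{w}^{(t)} + 2\eta\, H^{\top}\big(\mathbf{y} - H\mathbf{w}^{(t)}\big)
```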

And what is this here?

Well, H times w at iteration t is my predicted set of observations,

the whole vector of them,

assuming that I use w at iteration t when forming those predictions.

Okay, so what this version of the algorithm is doing is it's taking our

entire w vector, all the regression coefficients in our model, and

updating them all at once using this matrix notation shown here.
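As a minimal sketch of this whole-vector update (the function and variable names are illustrative, not taken from the course assignments), assuming H is the feature matrix and y the vector of observed outputs:

```python
import numpy as np

def multiple_regression_gradient_descent(H, y, w_init, eta, tolerance, max_iter=10000):
    """Fit multiple regression weights by gradient descent on the RSS.

    H        : (n, d) feature matrix, one row per observation
    y        : (n,)   observed outputs
    w_init   : (d,)   initial weight vector
    eta      : step size
    tolerance: stop when the gradient magnitude falls below this value
    """
    w = np.array(w_init, dtype=float)
    for _ in range(max_iter):
        residuals = y - H @ w               # y - H w^(t): errors of the current predictions
        gradient = -2.0 * (H.T @ residuals)  # gradient of the RSS at w^(t)
        w = w - eta * gradient               # equivalently w^(t) + 2*eta*H^T(y - H w^(t))
        if np.linalg.norm(gradient) < tolerance:
            break
    return w
```

Because the update is expressed with matrix operations, every regression coefficient is revised at once on each iteration, as described in the lecture.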
