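This course introduces simple and multiple linear regression models. These models allow you to assess the relationship between the variables in a data set and a continuous response variable. For example: is there a relationship between the physical attractiveness of a professor and their student evaluation scores? Can we predict a child's test score from certain characteristics of the child's mother? In this course, you will learn the fundamental theory behind linear regression and, by analyzing data examples with the free statistical software R and RStudio, learn how to fit and examine regression models and how to use them to examine relationships between multiple variables.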


From the course by Duke University

Linear Regression and Modeling



From the lesson

Linear Regression

In this week we’ll introduce linear regression. Many of you may be familiar with regression from reading the news, where graphs with straight lines are overlaid on scatterplots. Linear models can be used for prediction or to evaluate whether there is a linear relationship between two numerical variables.

- Mine Çetinkaya-Rundel, Associate Professor of the Practice

Department of Statistical Science

Before we can go on to modeling the relationship between two numerical variables using a regression, we first need to define residuals.

Residuals are basically leftovers from the model fit.

So we can think about our observed data as the model fit plus the residuals.

The residual is defined as the difference between the observed and the predicted y, that is, between the observed value and the predicted value of the response variable for a given data point in our dataset.

We denote the residual using an e for error and write the formula as $e_i = y_i - \hat{y}_i$: the observed response variable, y_i, minus the predicted response variable, y hat i.
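The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `residuals` and the numbers are hypothetical and are not taken from the lecture's data. It computes each residual as observed minus predicted and then checks that the model fit plus the residuals gives back the observed data.

```python
def residuals(observed, predicted):
    """Return e_i = y_i - y_hat_i for each observed/predicted pair."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [y - y_hat for y, y_hat in zip(observed, predicted)]


# Hypothetical values, chosen only for illustration (not the lecture's data).
observed = [3.0, 5.0, 8.0]
predicted = [2.5, 5.5, 8.0]

e = residuals(observed, predicted)
print(e)  # [0.5, -0.5, 0.0]

# Observed data = model fit + residuals.
reconstructed = [y_hat + e_i for y_hat, e_i in zip(predicted, e)]
print(reconstructed == observed)  # True
```

A positive residual means the observed value lies above the prediction, and a negative residual means it lies below.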

We're going to focus on two data points, Rhode Island and D.C.

The observed poverty level in Rhode Island is around 10%, and the predicted poverty level is slightly over 14%.

The difference between these two, which is shown with the yellow line on the plot, is the residual.

What the residual tells us is that the percentage of those living in poverty in Rhode Island is 4.16% less than what this model predicts.

Similarly, in D.C. the observed value is high up, around 17%, while the predicted value is much lower, around 11%, so this time the residual is telling us something slightly different.

In this case, the percentage living in poverty in D.C. is 5.44% more than predicted.

So the model overestimates the poverty level in Rhode Island and underestimates the poverty level in D.C.
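As a rough sketch of how the sign of a residual maps to over- or underestimation, the snippet below uses the two residuals quoted in the lecture, with the Rhode Island value written as negative because the observed level is below the prediction. The helper name `describe_residual` is hypothetical.

```python
def describe_residual(residual):
    """Say how the model's prediction compares with the observed value."""
    if residual < 0:
        # Observed is below predicted, so the prediction was too high.
        return "overestimates"
    if residual > 0:
        # Observed is above predicted, so the prediction was too low.
        return "underestimates"
    return "matches"


# Residuals as quoted in the lecture (observed minus predicted).
residuals_by_place = {"Rhode Island": -4.16, "D.C.": 5.44}

for place, e in residuals_by_place.items():
    print(f"{place}: residual {e:+.2f}, model {describe_residual(e)} the poverty level")
```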
