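A simple linear regression model estimates the linear relationship between one continuous predictor variable and one continuous response variable.
The regression equation, or estimated regression equation (ERE), is:
ŷ = b0 + b1*x
where:
· ŷ is the estimated value of the response variable
· b0 is the intercept of the regression line on the y-axis
· b1 is the slope of the regression line
· b0 and b1 are called the regression coefficients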
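As a minimal sketch of where b0 and b1 come from, the block below computes the ordinary least-squares estimates from their closed-form expressions and compares them with what lm() returns. The data are synthetic (the real nutrition.txt file is not used here), and the variable names sugars and rating are only borrowed for readability.

```r
# Synthetic stand-in for the cereal data (assumption: NOT the real file).
set.seed(1)
sugars <- runif(50, min = 0, max = 15)
rating <- 60 - 2.4 * sugars + rnorm(50, sd = 9)

# Closed-form least-squares estimates of slope and intercept.
b1 <- sum((sugars - mean(sugars)) * (rating - mean(rating))) /
  sum((sugars - mean(sugars))^2)
b0 <- mean(rating) - b1 * mean(sugars)

# The same estimates as computed by lm().
fit <- lm(rating ~ sugars)
print(c(b0 = b0, b1 = b1))
print(coef(fit))
```

The two printed vectors should agree up to floating-point error, since lm() fits the same least-squares line.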
> # Read the cereal nutrition data; the first row holds the column names
> sugar<-read.table(file="/LabData/RData/regression/nutrition.txt",header=TRUE)
> # Open the data frame in R's data editor to inspect it
> edit(sugar)
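> # Scatter plot of rating against sugars, then overlay the fitted regression line
> plot(data=sugar,rating~sugars,main="Scatter plot of nutritional rating vs. sugar content with fitted line",xlab="Sugar content",ylab="Nutritional rating")
> lm.reg<-lm(data=sugar,rating~sugars)
> abline(lm.reg,lty=4,lwd=3)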
> lm(data=sugar,rating~sugars)

Call:
lm(formula = rating ~ sugars, data = sugar)

Coefficients:
(Intercept)       sugars
     59.284       -2.401
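In the formula for the correlation coefficient r, Sx and Sy denote the sample standard deviations of x and y, respectively. The correlation coefficient r takes values in the closed interval [-1, 1].
The closer r is to 1, the stronger the positive linear association: y tends to increase as x increases.
The closer r is to -1, the stronger the negative linear association: y tends to decrease as x increases.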
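The formula this passage refers to did not survive in the text. One standard identity that involves Sx and Sy is r = b1 * Sx / Sy, and the sketch below checks it numerically against cor(). This is an illustration on synthetic data, not a reconstruction of the author's original formula; the names x and y are placeholders.

```r
# Synthetic data with a negative linear trend (assumption: illustrative only).
set.seed(1)
x <- runif(50, min = 0, max = 15)
y <- 60 - 2.4 * x + rnorm(50, sd = 9)

# Sample correlation coefficient computed directly.
r <- cor(x, y)

# The same quantity from the fitted slope and the two sample standard deviations.
b1 <- unname(coef(lm(y ~ x))[2])
r_from_slope <- b1 * sd(x) / sd(y)

print(c(r = r, r_from_slope = r_from_slope))
```

Because Sx and Sy are positive, r always has the same sign as the slope b1, which is why a negative slope goes with a negative correlation.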
2. Analysis of variance table (ANOVA table)
Its general form is as follows:
> anova<-aov(data=sugar,rating~sugars)
> summary(anova)
            Df Sum Sq Mean Sq F value   Pr(>F)
sugars       1   8655    8655   102.3 1.15e-15 ***
Residuals   75   6342      85
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> lm.reg<-lm(data=sugar,rating~sugars)
> summary(lm.reg)
Call:
lm(formula = rating ~ sugars, data = sugar)

Residuals:
    Min      1Q  Median      3Q     Max
-17.853  -5.677  -1.439   5.160  34.421

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  59.2844     1.9485   30.43  < 2e-16 ***
sugars       -2.4008     0.2373  -10.12 1.15e-15 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.196 on 75 degrees of freedom
Multiple R-squared:  0.5771,    Adjusted R-squared:  0.5715
F-statistic: 102.3 on 1 and 75 DF,  p-value: 1.153e-15
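
The summary statistics above can be recovered from the ANOVA table. The sketch below redoes that arithmetic with the rounded sums of squares printed earlier (8655 and 6342 on 1 and 75 degrees of freedom), so the results should match the reported values only up to rounding. The variable names are illustrative.

```r
# Values copied from the printed ANOVA table (rounded, so results are approximate).
ss_regression <- 8655   # Sum Sq for sugars
ss_residual   <- 6342   # Sum Sq for Residuals
df_regression <- 1
df_residual   <- 75

# R-squared: share of the total variation explained by the regression.
r_squared <- ss_regression / (ss_regression + ss_residual)

# Residual standard error: square root of the residual mean square.
resid_se <- sqrt(ss_residual / df_residual)

# F statistic: regression mean square over residual mean square.
f_stat <- (ss_regression / df_regression) / (ss_residual / df_residual)

print(round(c(r_squared = r_squared, resid_se = resid_se, f_stat = f_stat), 3))
```

In simple linear regression the F statistic is also the square of the slope's t value, which is why both tests report the same p-value.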

> lm.reg<-lm(data=sugar,rating~sugars)
> # level=0.95 sets the confidence level
> confint(lm.reg,level=0.95)
                2.5 %    97.5 %
(Intercept) 55.402783 63.165952
sugars      -2.873567 -1.92807
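
These limits follow the usual recipe of estimate ± t quantile × standard error. As a rough check, the sketch below rebuilds them from the estimates and standard errors printed by summary(lm.reg), using 75 residual degrees of freedom. The inputs are rounded printed values, so agreement is only approximate.

```r
# Estimates and standard errors copied from the printed summary (rounded).
est <- c("(Intercept)" = 59.2844, sugars = -2.4008)
se  <- c(1.9485, 0.2373)

# Two-sided 95% critical value of the t distribution with 75 degrees of freedom.
t_crit <- qt(0.975, df = 75)

# Confidence limits: estimate minus/plus critical value times standard error.
ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)
print(ci)
```

Since the interval for the sugars coefficient does not contain 0, it is consistent with the very small p-value reported for the slope.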
> point<-data.frame(sugars=10)
> point
  sugars
1     10
> lm.reg

Call:
lm(formula = rating ~ sugars, data = sugar)

Coefficients:
(Intercept)       sugars
     59.284       -2.401

> lm.pred<-predict(lm.reg,point,interval="prediction",level=0.95)
> lm.pred
       fit     lwr      upr
1 35.27617 16.7815 53.77083

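The output shows that, at the 95% level, the nutritional rating of a single cereal with a sugar content of 10 is predicted to lie between 16.7815 and 53.77083, around a fitted value of 35.27617.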
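
The interval above is a prediction interval for one new observation, which is wider than the confidence interval for the mean response at the same x. The sketch below contrasts the two using predict() on synthetic data. The data frame cereal and its coefficients are made up for illustration and are not the author's dataset.

```r
# Synthetic stand-in for the cereal data (assumption: NOT the real file).
set.seed(1)
cereal <- data.frame(sugars = runif(77, min = 0, max = 15))
cereal$rating <- 59 - 2.4 * cereal$sugars + rnorm(77, sd = 9)

fit <- lm(rating ~ sugars, data = cereal)
new_point <- data.frame(sugars = 10)

# Interval for the MEAN rating of all cereals with sugars = 10.
conf_int <- predict(fit, new_point, interval = "confidence", level = 0.95)

# Interval for the rating of ONE new cereal with sugars = 10.
pred_int <- predict(fit, new_point, interval = "prediction", level = 0.95)

print(rbind(confidence = conf_int[1, ], prediction = pred_int[1, ]))
```

Both intervals share the same fitted value. The prediction interval is wider because it also accounts for the scatter of individual observations around the regression line.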