[Machine Learning] Simple Linear Regression: The Least Squares Method

1. Linear regression and the least squares method: concepts
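
No body text survives under this heading, so the formulas below are reconstructed from the `compute_cost` and `fit` functions in the code of section 2. They are an assumption about what this section presented, not a quotation. The model is the line $y = wx + b$ fitted to $M$ data points $(x_i, y_i)$, and $\bar{x}$ denotes the mean of the $x_i$.

```latex
% Mean squared loss, as computed by compute_cost(w, b, points)
L(w, b) = \frac{1}{M} \sum_{i=1}^{M} \left( y_i - w x_i - b \right)^2

% Setting \partial L / \partial w = 0 and \partial L / \partial b = 0
% gives the closed-form solution used by fit(points)
w = \frac{\sum_{i=1}^{M} y_i \left( x_i - \bar{x} \right)}
         {\sum_{i=1}^{M} x_i^2 - M \bar{x}^2},
\qquad
b = \frac{1}{M} \sum_{i=1}^{M} \left( y_i - w x_i \right),
\qquad
\bar{x} = \frac{1}{M} \sum_{i=1}^{M} x_i
```

The numerator of $w$ equals $\sum_i (x_i - \bar{x})(y_i - \bar{y})$ because $\sum_i (x_i - \bar{x}) = 0$, and the denominator equals $\sum_i (x_i - \bar{x})^2$, so this is the usual covariance-over-variance form of the slope.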

2. Code implementation

    # 0. Import dependencies
    import numpy as np
    import matplotlib.pyplot as plt

    # 1. Load the data
    points = np.genfromtxt('data.csv', delimiter=',')
    # points[0, 0]
    # Extract the two columns of points as x and y
    x = points[:, 0]
    y = points[:, 1]
    # Draw a scatter plot with plt
    # plt.scatter(x, y)
    # plt.show()

    # 2. Define the loss function: mean squared error
    # The loss is a function of the coefficients; the data (x, y) is passed in as well
    def compute_cost(w, b, points):
        total_cost = 0
        M = len(points)
        # Accumulate the squared error point by point, then take the mean
        for i in range(M):
            x = points[i, 0]
            y = points[i, 1]
            total_cost += (y - w * x - b) ** 2
        return total_cost / M

    # 3. Define the fitting algorithm
    # First define a helper that computes the mean
    def average(data):
        total = 0  # named total to avoid shadowing the built-in sum
        num = len(data)
        for i in range(num):
            total += data[i]
        return total / num

    # Define the core fitting function
    def fit(points):
        M = len(points)
        x_bar = average(points[:, 0])
        sum_yx = 0
        sum_x2 = 0
        sum_delta = 0
        for i in range(M):
            x = points[i, 0]
            y = points[i, 1]
            sum_yx += y * (x - x_bar)
            sum_x2 += x ** 2
        # Compute w from the closed-form formula
        w = sum_yx / (sum_x2 - M * (x_bar ** 2))
        # Compute b as the mean of the residuals y - w * x
        for i in range(M):
            x = points[i, 0]
            y = points[i, 1]
            sum_delta += (y - w * x)
        b = sum_delta / M
        return w, b

    # 4. Test
    w, b = fit(points)
    print("w is: ", w)
    print("b is: ", b)
    cost = compute_cost(w, b, points)
    print("cost is: ", cost)

    # 5. Plot the fitted line
    plt.scatter(x, y)
    # Compute the predicted y value for every x
    pred_y = w * x + b
    plt.plot(x, pred_y, c='r')
    plt.show()

The same model fitted with scikit-learn's LinearRegression:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.linear_model import LinearRegression  # scikit-learn implementation

    # 1. Load the data (data.csv)
    points = np.genfromtxt('data.csv', delimiter=',')
    # points[0, 0]
    # Extract the two columns of points as x and y
    x = points[:, 0]
    y = points[:, 1]
    # Draw a scatter plot with plt
    # plt.scatter(x, y)
    # plt.show()

    # 2. Define the loss function: mean squared error
    # The loss is a function of the coefficients; the data (x, y) is passed in as well
    def compute_cost(w, b, points):
        total_cost = 0
        M = len(points)
        # Accumulate the squared error point by point, then take the mean
        for i in range(M):
            x = points[i, 0]
            y = points[i, 1]
            total_cost += (y - w * x - b) ** 2
        return total_cost / M

    lr = LinearRegression()
    x_new = x.reshape(-1, 1)  # reshape the 1-D array into a 2-D column, as fit() expects
    y_new = y.reshape(-1, 1)
    lr.fit(x_new, y_new)

    # 3. Extract the coefficient and intercept from the fitted model
    #    (LinearRegression also uses least squares)
    w = lr.coef_[0][0]
    b = lr.intercept_[0]
    print("w is: ", w)
    print("b is: ", b)
    cost = compute_cost(w, b, points)
    print("cost is: ", cost)

    plt.scatter(x, y)
    # Compute the predicted y value for every x
    pred_y = w * x + b
    plt.plot(x, pred_y, c='r')
    plt.show()

w is:  1.3224310227553846
b is:  7.991020982269173
cost is:  110.25738346621313
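
The loop-based `fit` above can also be written with NumPy array operations. The following is a minimal sketch, not the author's code: `fit_vectorized` and `mean_squared_cost` are hypothetical helper names, and the data is synthetic (generated with a fixed seed) because `data.csv` is not reproduced here. The result is cross-checked against `np.polyfit` with degree 1, which also performs a least squares line fit.

```python
import numpy as np


def fit_vectorized(x, y):
    """Closed-form least squares fit of y = w * x + b (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = x.size
    x_bar = x.mean()
    # Same formula as the loop-based fit(): slope first, then intercept
    w = np.sum(y * (x - x_bar)) / (np.sum(x ** 2) - m * x_bar ** 2)
    b = np.mean(y - w * x)
    return w, b


def mean_squared_cost(w, b, x, y):
    """Mean squared error of the line y = w * x + b on the data (x, y)."""
    return np.mean((y - w * x - b) ** 2)


# Synthetic data: an assumption for illustration, not the article's data.csv
rng = np.random.default_rng(0)
x_demo = rng.uniform(20.0, 80.0, size=100)
y_demo = 1.5 * x_demo + 4.0 + rng.normal(0.0, 5.0, size=100)

w_demo, b_demo = fit_vectorized(x_demo, y_demo)

# np.polyfit returns coefficients from highest degree down: [slope, intercept]
w_ref, b_ref = np.polyfit(x_demo, y_demo, deg=1)

print("w (closed form):", w_demo, " w (polyfit):", w_ref)
print("b (closed form):", b_demo, " b (polyfit):", b_ref)
print("cost:", mean_squared_cost(w_demo, b_demo, x_demo, y_demo))
```

To run this on the article's data instead, pass `points[:, 0]` and `points[:, 1]` to `fit_vectorized`; the values should then agree with the loop-based `fit` up to floating-point rounding.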

3. Code and data download

 Simple Linear Regression: Least Squares Method resources - CSDN Wenku (CSDN文库)
