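Lasso regression (Least Absolute Shrinkage and Selection Operator) is a variant of linear regression that adds an L1-norm penalty on the regression coefficients. The penalty shrinks the coefficients toward zero and can set some of them exactly to zero, which helps mitigate multicollinearity and gives the method built-in variable selection.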
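An ordinary linear regression model chooses its coefficients by minimizing the following objective function, the sum of squared residuals over $n$ observations and $p$ features:

$$\min_{\beta_0,\,\beta} \; \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2$$

In Lasso regression, the objective function becomes:

$$\min_{\beta_0,\,\beta} \; \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|$$

where $y_i$ is the response, $x_{ij}$ is the value of feature $j$ for observation $i$, $\beta_0$ is the intercept, and $\lambda \ge 0$ controls the strength of the penalty.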
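The Lasso objective function splits into two parts: an error term and a penalty term. The penalty term reduces overfitting and, by driving the coefficients of unimportant variables to zero, selects the important variables automatically.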
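To make the two parts concrete, here is a minimal sketch on synthetic data. The data, the true coefficients, and `alpha=0.5` are assumptions chosen for illustration. Note that scikit-learn scales the error term by `1 / (2 * n_samples)` and calls the penalty strength `alpha`, so its objective differs from the unscaled form by a constant factor.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

# Synthetic data (illustrative assumption): only the first two
# of six features actually influence y.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 6
X = rng.standard_normal((n_samples, n_features))
true_coef = np.array([3.0, -2.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_coef + 0.5 * rng.standard_normal(n_samples)

ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

# The two parts of the objective, using scikit-learn's scaling:
# (1 / (2 * n_samples)) * ||y - y_hat||^2  +  alpha * ||coef||_1
residual = y - lasso.predict(X)
error_term = (residual ** 2).sum() / (2 * n_samples)
penalty_term = lasso.alpha * np.abs(lasso.coef_).sum()

print("OLS coefficients:  ", np.round(ols.coef_, 3))
print("Lasso coefficients:", np.round(lasso.coef_, 3))
print(f"error term = {error_term:.3f}, penalty term = {penalty_term:.3f}")
```

In this setup the ordinary least-squares fit assigns small nonzero weights to the four irrelevant features, while Lasso sets them exactly to zero and shrinks the two relevant coefficients toward zero.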
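Data preparation: generate or collect the data and split it into training and test sets.
Data preprocessing: standardize the features so that the penalty treats all coefficients on the same scale.
Model training: fit a Lasso model on the training set.
Model evaluation: measure performance on the test set with mean squared error and R².
Model optimization: tune the regularization strength alpha, for example with cross-validation.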
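The five steps above can also be sketched compactly as a single scikit-learn pipeline. This is one possible arrangement, not the only one: the synthetic three-feature data and the variable names are assumptions for illustration, and `LassoCV` is used so that training and alpha selection happen in the same fit.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Step 1, data preparation: synthetic data (illustrative assumption),
# where only feature 0 influences y.
rng = np.random.default_rng(42)
X = rng.uniform(0, 2, size=(200, 3))
y = 4 + 3 * X[:, 0] + 0.5 * rng.standard_normal(200)  # 1-D target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Steps 2, 3 and 5: scaling, training, and alpha selection by
# 5-fold cross-validation, all inside one pipeline.
model = make_pipeline(StandardScaler(), LassoCV(cv=5))
model.fit(X_train, y_train)

# Step 4, model evaluation on the held-out test set.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Selected alpha: {model.named_steps['lassocv'].alpha_}")
print(f"Mean Squared Error: {mse}")
print(f"R^2 Score: {r2}")
```

Putting the scaler inside the pipeline means it is fitted on the training data only, so no information from the test set leaks into preprocessing.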
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.linear_model import Lasso
    from sklearn.metrics import mean_squared_error, r2_score
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    # Generate example data: y = 4 + 3x + Gaussian noise
    # Note: y is a column vector of shape (100, 1)
    np.random.seed(0)
    X = 2 * np.random.rand(100, 1)
    y = 4 + 3 * X + np.random.randn(100, 1) * 0.5

    # Split the data into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Standardize the features (fit the scaler on the training set only)
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)

    # Create and fit the Lasso regression model
    lasso_reg = Lasso(alpha=1.0)
    lasso_reg.fit(X_train_scaled, y_train)

    # Predict on the test set
    y_pred = lasso_reg.predict(X_test_scaled)

    # Evaluate the model
    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)

    print(f"Mean Squared Error: {mse}")
    print(f"R^2 Score: {r2}")
    print(f"Intercept: {lasso_reg.intercept_}")
    print(f"Coefficients: {lasso_reg.coef_}")

    # Visualize the results
    plt.scatter(X_test, y_test, color='blue', label='Actual')
    plt.plot(X_test, y_pred, color='red', label='Predicted')
    plt.xlabel("X")
    plt.ylabel("y")
    plt.title("Lasso Regression")
    plt.legend()
    plt.show()

    # Continues from the previous example: reuses X_train_scaled,
    # X_test_scaled, X_test, y_train and y_test.
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.linear_model import LassoCV
    from sklearn.metrics import mean_squared_error, r2_score

    # Create a Lasso model that selects alpha by 5-fold cross-validation
    # over 13 candidate values from 1e-6 to 1e6
    lasso_cv = LassoCV(alphas=np.logspace(-6, 6, 13), cv=5)
    # y_train is still a column vector of shape (n_samples, 1) here
    lasso_cv.fit(X_train_scaled, y_train)

    # Predict on the test set
    y_pred_cv = lasso_cv.predict(X_test_scaled)

    # Evaluate the model
    mse_cv = mean_squared_error(y_test, y_pred_cv)
    r2_cv = r2_score(y_test, y_pred_cv)

    print(f"Best Alpha: {lasso_cv.alpha_}")
    print(f"Mean Squared Error (CV): {mse_cv}")
    print(f"R^2 Score (CV): {r2_cv}")
    print(f"Intercept (CV): {lasso_cv.intercept_}")
    print(f"Coefficients (CV): {lasso_cv.coef_}")

    # Visualize the results
    plt.scatter(X_test, y_test, color='blue', label='Actual')
    plt.plot(X_test, y_pred_cv, color='red', label='Predicted')
    plt.xlabel("X")
    plt.ylabel("y")
    plt.title("Lasso Regression with Cross-Validation")
    plt.legend()
    plt.show()

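The warning occurs because `Lasso` and `LassoCV` expect `y` to be a one-dimensional array of shape `(n_samples,)`, whereas your `y` is a two-dimensional column vector of shape `(n_samples, 1)`. The fix is to convert `y` to a one-dimensional array, for example with `y.ravel()`.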
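A minimal sketch of that conversion follows. It assumes the warning in question is scikit-learn's `DataConversionWarning`, which is what a column-vector `y` most likely triggers. The `warnings` filter is there only to make the check explicit: it turns that warning into an error, so the fit would fail loudly if the shape were still wrong.

```python
import warnings

import numpy as np
from sklearn.exceptions import DataConversionWarning
from sklearn.linear_model import LassoCV

# Same kind of data as in the examples: y is a column vector.
rng = np.random.default_rng(0)
X = 2 * rng.random((100, 1))
y = 4 + 3 * X + 0.5 * rng.standard_normal((100, 1))

# Flatten y from shape (100, 1) to shape (100,)
y_1d = y.ravel()
print(y.shape, "->", y_1d.shape)  # (100, 1) -> (100,)

with warnings.catch_warnings():
    # Treat the shape warning as an error for the duration of the fit
    warnings.simplefilter("error", DataConversionWarning)
    model = LassoCV(cv=5).fit(X, y_1d)

print(model.coef_.shape)  # (1,)
```

In the examples above, the same effect can be had by calling `y_train.ravel()` when fitting, or by generating `y` as a one-dimensional array in the first place.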