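By introducing L1 regularization, the Lasso algorithm performs variable selection and produces a sparse model, which makes it an effective method for handling high-dimensional data and reducing model complexity. The choice of the regularization parameter is critical and can be tuned with cross-validation.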
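For reference, the objective being minimized can be written as follows. This is the scaling used by scikit-learn's `Lasso`, which the code in this article relies on; here n is the number of samples, w the coefficient vector, and α the regularization strength (`alpha`). Other texts may scale the squared-error term differently.

```latex
\hat{w} = \arg\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1
```

The larger α is, the more the L1 term dominates and the more coefficients are driven to exactly zero.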

Code Implementation

Data Import and Preprocessing

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

df = pd.read_csv('Chabuhou.csv')

# Split into features and target variable
X = df.drop(['Electrical_cardioversion'], axis=1)
y = df['Electrical_cardioversion']

# Split into training and test sets, keeping the class ratio the same in both
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

df.head()

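The code reads the data, separates it into features (X) and the target variable (y), and then splits it into an 80% training set and a 20% test set. The dataset concerns cardiac electrical cardioversion: it contains 46 feature variables and one target variable, and the task is binary classification. It is the same dataset used in the earlier article "Feature Selection: Applying the Random-Forest-Based Boruta Algorithm".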
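To illustrate what the `stratify` argument does in the split above, here is a minimal sketch on synthetic data. The arrays `X_demo` and `y_demo` are made up for the example (200 rows, 20% positives) and are not the article's dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data (assumption): 200 rows, 20% positives
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 5))
y_demo = np.array([1] * 40 + [0] * 160)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# With stratify, both subsets keep the original positive rate
print("train positive rate:", y_tr.mean())
print("test positive rate:", y_te.mean())
```

For an imbalanced binary target, this keeps the test set from ending up with too few positive cases by chance.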

Lasso Implementation

from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error

# Standardize the data (fit on the training set only, then apply to the test set)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train the Lasso model
lasso = Lasso(alpha=0.1)  # alpha is the regularization strength
lasso.fit(X_train_scaled, y_train)

# Print the features whose Lasso coefficients are non-zero
coefficients = pd.Series(lasso.coef_, index=X.columns)
selected_features = coefficients[coefficients != 0].index
print("Selected features:")
print(selected_features)

# Compute the mean squared error on the test set
y_pred = lasso.predict(X_test_scaled)
mse = mean_squared_error(y_test, y_pred)
print(f"\nMean Squared Error on Test Set: {mse}")

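After standardizing the data (standardization puts all features on the same scale, so the L1 penalty treats them evenly during feature selection), a Lasso regression model is trained to select important features, and the mean squared error on the test set is computed. The regularization strength here was picked arbitrarily; the next step is to use cross-validation to tune the regularization parameter alpha and thereby improve model performance.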
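Before searching over alpha, it can help to know where the useful range ends. The sketch below, on synthetic data from `make_regression` (an assumption, not the article's dataset), computes the smallest alpha at which every coefficient is zero under scikit-learn's scaling of the Lasso objective. The name `alpha_max` is only a label used here.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 100 samples, 20 features, 5 of them informative
X_demo, y_demo = make_regression(
    n_samples=100, n_features=20, n_informative=5, noise=10.0, random_state=42
)
X_demo = StandardScaler().fit_transform(X_demo)

# Smallest alpha for which all coefficients are exactly zero:
# the largest absolute covariance between a feature and the centered target
n = X_demo.shape[0]
alpha_max = np.max(np.abs(X_demo.T @ (y_demo - y_demo.mean()))) / n

# Just above alpha_max nothing is selected; below it features start to enter
n_selected_above = np.count_nonzero(Lasso(alpha=alpha_max * 1.01).fit(X_demo, y_demo).coef_)
n_selected_below = np.count_nonzero(Lasso(alpha=alpha_max * 0.5).fit(X_demo, y_demo).coef_)
print(alpha_max, n_selected_above, n_selected_below)
```

Any alpha above this value yields an empty model, so a candidate grid only needs to extend downward from it.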

Tuning the Regularization Parameter with Cross-Validation

from sklearn.linear_model import LassoCV
from sklearn.model_selection import RepeatedKFold

# Feature names, used to map coefficients back to columns
feature_names = X.columns

# Candidate alpha values: 50 values between 10^-4 and 10^0
alphas = np.logspace(-4, 0, 50)

# LassoCV with repeated 10-fold cross-validation
lasso_cv = LassoCV(alphas=alphas, cv=RepeatedKFold(n_splits=10, n_repeats=3, random_state=42), random_state=42)
lasso_cv.fit(X_train_scaled, y_train)

# mse_path_ has shape (n_alphas, n_folds); summarize across folds
mse_path = lasso_cv.mse_path_.mean(axis=1)  # mean MSE for each alpha
mse_std = lasso_cv.mse_path_.std(axis=1)    # standard deviation of the MSE for each alpha

# Find the best alpha and the 1-SE rule alpha.
# Note: the band used here is the fold-to-fold standard deviation,
# not the standard deviation divided by sqrt(number of folds).
best_alpha_index = np.argmin(mse_path)           # index of the minimum mean MSE
best_alpha = lasso_cv.alphas_[best_alpha_index]  # best alpha value
# alphas_ is sorted from largest to smallest, so the first index inside the band is the largest such alpha
one_se_index = np.where(mse_path <= mse_path[best_alpha_index] + mse_std[best_alpha_index])[0][0]
one_se_alpha = lasso_cv.alphas_[one_se_index]    # 1-SE rule alpha value

# Print both alpha values
print(f"Best alpha (λ_min): {best_alpha}")
print(f"1-SE rule alpha (λ_1se): {one_se_alpha}")

# Feature selection for each of the two alpha values
lasso_best_alpha = LassoCV(alphas=[best_alpha], cv=RepeatedKFold(n_splits=10, n_repeats=3, random_state=42), random_state=42)
lasso_best_alpha.fit(X_train_scaled, y_train)
selected_features_best = [feature_names[i] for i in np.where(lasso_best_alpha.coef_ != 0)[0]]
print(f"Selected features with λ_min: {selected_features_best}")

lasso_one_se_alpha = LassoCV(alphas=[one_se_alpha], cv=RepeatedKFold(n_splits=10, n_repeats=3, random_state=42), random_state=42)
lasso_one_se_alpha.fit(X_train_scaled, y_train)
selected_features_one_se = [feature_names[i] for i in np.where(lasso_one_se_alpha.coef_ != 0)[0]]
print(f"Selected features with λ_1se: {selected_features_one_se}")

# Plot the cross-validated MSE against alpha
plt.figure(figsize=(10, 6))
plt.errorbar(lasso_cv.alphas_, mse_path, yerr=mse_std, fmt='o', color='red', ecolor='gray', capsize=3)
plt.axvline(lasso_cv.alphas_[best_alpha_index], linestyle='--', color='black', label=r'$\lambda_{min}$')
plt.axvline(lasso_cv.alphas_[one_se_index], linestyle='--', color='blue', label=r'$\lambda_{1se}$')
plt.xscale('log')  # show alpha on a log scale
plt.xlabel('Alpha (α) value')
plt.ylabel('Mean Squared Error (MSE)')
plt.title('Lasso Regression: MSE vs Alpha (α) value')
plt.xticks(rotation=45)
plt.legend()
plt.tight_layout()
plt.show()

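This step uses Lasso regression with cross-validation (LassoCV) to choose the regularization parameter alpha and then selects features based on it. The goal is to find the alpha with the minimum mean squared error (λ_min), and to apply the 1-SE rule to find a more conservative, larger alpha (λ_1se) whose error is still within the tolerance band around that minimum.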
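The selection logic can also be isolated in a small helper. The following is a sketch of a stricter variant that uses the standard error of the mean (standard deviation divided by the square root of the number of folds) rather than the raw standard deviation. The function name `select_alphas` and the toy error matrix are hypothetical, and with repeated K-fold the folds overlap, so the standard error is only approximate.

```python
import numpy as np

def select_alphas(alphas, mse_folds):
    """Return (alpha_min, alpha_1se) from per-fold cross-validation errors.

    alphas    : 1-D sequence of candidate alphas (any order)
    mse_folds : array of shape (n_alphas, n_folds), one row per alpha
    """
    alphas = np.asarray(alphas, dtype=float)
    mse_folds = np.asarray(mse_folds, dtype=float)
    mean = mse_folds.mean(axis=1)
    # Standard error of the mean across folds
    se = mse_folds.std(axis=1, ddof=1) / np.sqrt(mse_folds.shape[1])
    best = np.argmin(mean)
    threshold = mean[best] + se[best]
    # 1-SE rule: the largest alpha whose mean error stays within the threshold
    alpha_1se = alphas[mean <= threshold].max()
    return alphas[best], alpha_1se

# Toy example (made-up numbers): three alphas, two folds each
toy_alphas = [1.0, 0.1, 0.01]
toy_mse = [[1.00, 1.20], [0.50, 0.56], [0.44, 0.56]]
alpha_min, alpha_1se = select_alphas(toy_alphas, toy_mse)
print(alpha_min, alpha_1se)
```

With scikit-learn, the inputs would be `lasso_cv.alphas_` and `lasso_cv.mse_path_` from a fitted `LassoCV`.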

Lasso Coefficient Path Plot

# Fit one Lasso model per alpha and record its coefficients
coefs = []
for a in alphas:
    lasso = Lasso(alpha=a, max_iter=10000)
    lasso.fit(X_train_scaled, y_train)
    coefs.append(lasso.coef_)

# Plot the coefficient paths
plt.figure(figsize=(10, 6))
ax = plt.gca()
# Show alpha on a log10 scale
ax.plot(np.log10(alphas), coefs)
plt.xlabel('Log Lambda')
plt.ylabel('Coefficients')
plt.title('Lasso Paths')
plt.axis('tight')
plt.show()

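The Lasso path plot shows how the Lasso algorithm performs feature selection as the regularization parameter changes (each line is one feature's regression coefficient): as alpha increases, the coefficients of less important features shrink to zero and those features drop out, leaving only the features with a substantial influence on the prediction, which makes the model sparser and easier to interpret.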
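Why coefficients reach exactly zero, instead of just getting small, can be seen from the soft-thresholding operator. In the idealized case of uncorrelated, standardized features, each Lasso coefficient is the unpenalized coefficient shrunk toward zero by alpha and clipped at zero. The sketch below assumes that idealized case; the values in `ols_coefs` are hypothetical.

```python
import numpy as np

def soft_threshold(z, alpha):
    """Shrink each value toward zero by alpha; values within [-alpha, alpha] become 0."""
    z = np.asarray(z, dtype=float)
    return np.sign(z) * np.maximum(np.abs(z) - alpha, 0.0)

# Hypothetical unpenalized coefficients for three features
ols_coefs = np.array([3.0, -1.5, 0.4])

# As alpha grows, the weakest coefficients are clipped to zero first
for alpha in (0.1, 0.5, 2.0):
    print(alpha, soft_threshold(ols_coefs, alpha))
```

With correlated features the paths are not this simple, but the same shrink-and-clip step is what produces the lines that flatten onto zero in the plot.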

Intersecting the Features Selected by Boruta and Lasso

import matplotlib.pyplot as plt
import networkx as nx

# Features selected by Boruta and by Lasso
boruta_features = [
    'Type_of_atrial_fibrillation', 'BMI', 'Left_atrial_diameter',
    'Systolic_blood_pressure', 'NtproBNP'
]
lasso_features = [
    'Early_relapse', 'Late_relapse', 'Type_of_atrial_fibrillation',
    'Sleep_apnea_syndrome', 'Heart_valve_disease', 'SGLT2i', 'B', 'BMI',
    'Left_atrial_diameter', 'Systolic_blood_pressure', 'ALT', 'TSH'
]

# Build sets to compute the intersection
boruta_set = set(boruta_features)
lasso_set = set(lasso_features)
intersection = boruta_set.intersection(lasso_set)  # selected by both
boruta_only = boruta_set - intersection            # selected only by Boruta
lasso_only = lasso_set - intersection              # selected only by Lasso

# Create the graph
G = nx.Graph()

# Add nodes and edges
for feature in boruta_only:
    G.add_edge('Boruta', feature, color='lightcoral')  # light red: Boruta only
for feature in lasso_only:
    G.add_edge('Lasso', feature, color='lightblue')    # light blue: Lasso only
for feature in intersection:
    G.add_edge('Boruta', feature, color='lightcoral')  # shared feature linked to Boruta
    G.add_edge('Lasso', feature, color='lightblue')    # shared feature linked to Lasso

# Collect edge colors
edge_colors = [data['color'] for _, _, data in G.edges(data=True)]

# Assign node colors
node_colors = []
for node in G.nodes():
    if node == 'Boruta':
        node_colors.append('lightcoral')  # Boruta hub node
    elif node == 'Lasso':
        node_colors.append('lightblue')   # Lasso hub node
    elif node in boruta_only:
        node_colors.append('lightcoral')  # selected only by Boruta
    elif node in lasso_only:
        node_colors.append('lightblue')   # selected only by Lasso
    elif node in intersection:
        node_colors.append('plum')        # shared features in light purple

# Draw the graph
plt.figure(figsize=(10, 10))
pos = nx.spring_layout(G, seed=42)  # spring layout with a fixed seed
nx.draw_networkx(
    G, pos,
    edge_color=edge_colors,
    node_color=node_colors,
    with_labels=True,
    node_size=1000,
    font_size=10,
    edgecolors='none'  # no node borders
)
plt.title('Feature Selection by Boruta and Lasso')
plt.show()

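The network graph gives a direct view of where Boruta and Lasso differ and overlap in feature selection. It helps show which features each method tends to favor and which they select in common; the intersection deserves particular attention.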
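Alongside the graph, the overlap can be summarized in a few lines of text. This sketch reuses the two feature lists from the code above; the Jaccard index (size of the intersection divided by size of the union) is an optional summary added here and is not part of the original analysis.

```python
# Feature lists copied from the network-graph code above
boruta_features = {
    'Type_of_atrial_fibrillation', 'BMI', 'Left_atrial_diameter',
    'Systolic_blood_pressure', 'NtproBNP',
}
lasso_features = {
    'Early_relapse', 'Late_relapse', 'Type_of_atrial_fibrillation',
    'Sleep_apnea_syndrome', 'Heart_valve_disease', 'SGLT2i', 'B', 'BMI',
    'Left_atrial_diameter', 'Systolic_blood_pressure', 'ALT', 'TSH',
}

shared = sorted(boruta_features & lasso_features)       # selected by both methods
boruta_only = sorted(boruta_features - lasso_features)  # selected only by Boruta
lasso_only = sorted(lasso_features - boruta_features)   # selected only by Lasso

# Jaccard index: 0 means no overlap, 1 means identical selections
jaccard = len(shared) / len(boruta_features | lasso_features)

print("Shared:", shared)
print("Boruta only:", boruta_only)
print("Lasso only:", lasso_only)
print(f"Jaccard index: {jaccard:.2f}")
```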
