
import xgboost as xgb
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_breast_cancer

2. XGBoost

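We again use the breast cancer dataset for this demonstration. It contains 30 independent variables (features) and 1 dependent variable. The dependent variable gives the tumor type and is binary: benign or malignant. This dataset has appeared several times in earlier posts, and the details are available in the help documentation, so they are not listed here.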
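As a quick check on the description above, the following sketch inspects the dataset as loaded by scikit-learn. The variable names (`n_samples`, `class_counts`) are chosen here for illustration and do not come from the original post. Note that scikit-learn encodes the classes as 0 = malignant and 1 = benign.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()

# Feature matrix: one row per sample, one column per feature
n_samples, n_features = cancer.data.shape
print(n_samples, n_features)

# Class names in label order: index 0 is the first name, index 1 the second
print(list(cancer.target_names))

# Count how many samples carry each label, then key the counts by class name
counts = np.bincount(cancer.target)
class_counts = {str(name): int(n) for name, n in zip(cancer.target_names, counts)}
print(class_counts)
```

The class counts show that the two classes are not perfectly balanced, which is worth keeping in mind when reading plain accuracy scores.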

# Load the breast cancer dataset
cancer = load_breast_cancer()

# Split into training and test sets (70% / 30%)
xtrain, xtest, ytrain, ytest = train_test_split(
    cancer.data, cancer.target, random_state=123, test_size=0.3
)
print(xtrain.shape)

# Define the XGBoost model
model = xgb.XGBClassifier(
    max_depth=3,
    learning_rate=0.01,
    n_estimators=200,
    objective='binary:logistic'
)

# Train the model
model.fit(xtrain, ytrain)

# Compute accuracy
print(model.score(xtrain, ytrain))  # training performance
print(model.score(xtest, ytest))    # generalization performance
# print(model.feature_importances_)  # feature importances

# Another way to compute accuracy
# y_pred = model.predict(xtest)
# accuracy = accuracy_score(ytest, y_pred)
# print(accuracy)

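Interpretation: in this example we use the XGBClassifier class to define a binary-classification XGBoost model and set several hyperparameters. max_depth is the maximum depth of each tree, learning_rate is the learning rate (the shrinkage applied to each tree's contribution), n_estimators is the number of trees, and objective specifies the loss function. The model is trained with the fit method, and its accuracy on the training and test sets is then computed and printed.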
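How learning_rate, n_estimators and the binary:logistic objective work together can be sketched in plain Python. This is a simplified illustration under stated assumptions, not XGBoost's actual implementation: the leaf values are hypothetical constants, whereas a real model fits each tree to the gradients left by the previous trees. The helper names `sigmoid` and `boosted_probability` are made up for this sketch.

```python
import math


def sigmoid(margin: float) -> float:
    """Map a raw additive score (log-odds) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-margin))


def boosted_probability(tree_outputs, learning_rate, base_margin=0.0):
    """Combine per-tree raw outputs for one sample into a probability.

    tree_outputs: hypothetical unshrunk leaf values, one per tree.
    Each tree's contribution is scaled by learning_rate before summing.
    """
    margin = base_margin + learning_rate * sum(tree_outputs)
    return sigmoid(margin)


# Hypothetical case: every tree pushes +0.5 towards the positive class
few_trees = [0.5] * 10
many_trees = [0.5] * 200

p_few = boosted_probability(few_trees, learning_rate=0.01)
p_many = boosted_probability(many_trees, learning_rate=0.01)

print(round(p_few, 3), round(p_many, 3))
```

With a learning rate as small as 0.01, ten trees barely move the prediction away from 0.5, while 200 trees move it much further. This is why a small learning_rate is usually paired with a larger n_estimators.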

3. Visualizing feature importance

print(model.feature_importances_)  # feature importances

# Custom plotting function
def plot_feature_importances_cancer(model):
    n_features = cancer.data.shape[1]  # number of features
    # Horizontal bar chart with one bar per feature
    plt.barh(range(n_features), model.feature_importances_, align='center', color='gold')
    # Label each bar with its feature name
    plt.yticks(np.arange(n_features), cancer.feature_names)
    plt.xlabel("Feature importance")
    plt.ylabel("Feature")