Logistic Regression Model: Breast Cancer Dataset
# Load the dataset
from sklearn import datasets
import warnings
warnings.filterwarnings('ignore')
df = datasets.load_breast_cancer()
X = df.data
y = df.target
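X.shape  # Check the shape of the feature matrix: (samples, features)

X  # Inspect the feature values

y  # Inspect the class labels (0 = malignant, 1 = benign)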

# Split into training and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=666)

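from sklearn.linear_model import LogisticRegression  # Logistic regression model
from sklearn.preprocessing import StandardScaler  # Standardization (zero mean, unit variance)
from sklearn.preprocessing import PolynomialFeatures  # Polynomial feature expansion
from sklearn.pipeline import Pipeline  # Chains preprocessing steps and the estimator
# Train a logistic regression model directly on the raw features
logr = LogisticRegression()
logr.fit(X_train, y_train)  # Train
logr.score(X_test, y_test)  # Accuracy on the test set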
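
The baseline above is fit on unscaled features, and the suppressed warnings may hide solver convergence messages. As a point of comparison, here is a minimal sketch that standardizes the features before fitting. The names `scaled_logr` and `scaled_acc` and the value `max_iter=1000` are illustrative assumptions, not part of the original walkthrough.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same data and split as in the walkthrough
X, y = datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=666
)

# Standardize, then fit; max_iter=1000 is an illustrative choice, not a tuned value
scaled_logr = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_logr.fit(X_train, y_train)

scaled_acc = scaled_logr.score(X_test, y_test)
print(f"test accuracy with scaling: {scaled_acc:.3f}")
```

Putting the scaler inside the pipeline means it is fit on the training split only, so no test-set statistics leak into training.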

from sklearn.model_selection import GridSearchCV  # Grid search with cross-validation
import numpy as np
# Tune hyperparameters with grid search over a Pipeline
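pipe = Pipeline([
    ('Poly', PolynomialFeatures()),
    ('Scaler', StandardScaler()),
    ('logr', LogisticRegression())
])
param_grid = [{
    # Degree 0 leaves only a constant column, and degree 4 expands the 30
    # features into tens of thousands of columns, so search degrees 1 to 3
    'Poly__degree': [1, 2, 3],
    # Ten evenly spaced values from 0.01 to 0.10; np.arange with a step of 10
    # would yield only 0.01 (the ten-value range is an assumed intent)
    'logr__C': list(np.linspace(0.01, 0.1, 10)),
    'logr__solver': ['liblinear']
}]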
grid = GridSearchCV(pipe, param_grid=param_grid)
grid.fit(X_train, y_train)

grid.score(X_test, y_test)
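
The test score alone does not show which parameter combination won the search. The sketch below shows one way to inspect a fitted `GridSearchCV` through its `best_params_`, `best_score_`, and `cv_results_` attributes. To keep it quick to run it uses a deliberately small grid; the step names `poly`, `scaler`, and `logr` and the grid values are assumptions chosen for illustration.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

X, y = datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=666
)

demo_pipe = Pipeline([
    ("poly", PolynomialFeatures()),
    ("scaler", StandardScaler()),
    ("logr", LogisticRegression(solver="liblinear")),
])
# Small illustrative grid: 2 degrees x 2 values of C = 4 candidates
demo_grid = {"poly__degree": [1, 2], "logr__C": [0.01, 0.1]}

demo_search = GridSearchCV(demo_pipe, param_grid=demo_grid, cv=5)
demo_search.fit(X_train, y_train)

print(demo_search.best_params_)  # winning parameter combination
print(demo_search.best_score_)   # mean cross-validated accuracy of the winner
# Mean cross-validated accuracy of every candidate, in grid order
for params, mean in zip(demo_search.cv_results_["params"],
                        demo_search.cv_results_["mean_test_score"]):
    print(params, round(mean, 4))
```

Comparing `best_score_` (measured on the training folds) with `score(X_test, y_test)` gives a rough sense of how well the cross-validated estimate carries over to held-out data.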

Time did not allow for more thorough tuning here; readers can tune the model further through the parameters of LogisticRegression and PolynomialFeatures.
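
As one possible starting point for that further tuning, the sketch below widens the search to `interaction_only` on `PolynomialFeatures` and to `class_weight` on `LogisticRegression`, alongside `degree` and `C`. The specific values, the 3-fold setting, and the names `wide_pipe`, `wide_grid`, and `wide_search` are assumptions picked to keep the run short, not recommended settings.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

X, y = datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=666
)

wide_pipe = Pipeline([
    # include_bias=False: the logistic regression already fits an intercept
    ("poly", PolynomialFeatures(include_bias=False)),
    ("scaler", StandardScaler()),
    ("logr", LogisticRegression(solver="liblinear")),
])

# 2 x 2 x 3 x 2 = 24 candidate combinations (illustrative values)
wide_grid = {
    "poly__degree": [1, 2],
    "poly__interaction_only": [False, True],  # True drops pure powers such as x1^2
    "logr__C": [0.01, 0.1, 1.0],              # smaller C means stronger regularization
    "logr__class_weight": [None, "balanced"],
}

wide_search = GridSearchCV(wide_pipe, param_grid=wide_grid, cv=3)
wide_search.fit(X_train, y_train)

print(wide_search.best_params_)
print(wide_search.score(X_test, y_test))
```

Each added parameter multiplies the number of fits, so it helps to widen the grid one parameter at a time and check whether the cross-validated score actually moves.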
