The K-Means Algorithm and the Fuzzy C-Means Clustering Algorithm
Table of Contents

- I. Dataset Description
  - 1. Data Source
  - 2. Data Dimensions
- II. Clustering with the K-Means Algorithm
  - 1. Basic Idea
  - 2. Algorithm Description
  - 3. Implementation
- III. Clustering with the Fuzzy C-Means Algorithm
  - 1. Basic Idea
  - 2. Algorithm Description
  - 3. Implementation
- IV. Comparison of K-Means and Fuzzy C-Means Results
- V. Algorithm Comparison
I. Dataset Description
Experimental dataset: 4,898 white wine samples.
1. Data Source
Link: Wine Quality Data. (The dataset contains 1,599 red wine samples and 4,898 white wine samples; only the white wine samples are used in this experiment.)
Source: UCI Machine Learning Repository (University of California, Irvine, School of Information and Computer Sciences)
2. Data Dimensions
Dimensions: 12 variables
| No. | Physicochemical property | Field name |
|---|---|---|
| 1 | Fixed acidity | fixed acidity |
| 2 | Volatile acidity | volatile acidity |
| 3 | Citric acid | citric acid |
| 4 | Residual sugar | residual sugar |
| 5 | Chlorides | chlorides |
| 6 | Free sulfur dioxide | free sulfur dioxide |
| 7 | Total sulfur dioxide | total sulfur dioxide |
| 8 | Density | density |
| 9 | pH | pH |
| 10 | Sulphates | sulphates |
| 11 | Alcohol | alcohol |
| 12 | Quality | quality |
The wine's intrinsic properties (fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, and so on) serve as the influencing factors, while the quality score, determined by sensory evaluation, serves as the quality indicator.
II. Clustering with the K-Means Algorithm
1. Basic Idea
Starting from k cluster centers in the feature space, each object is assigned to the class of the center nearest to it. The cluster centers are then adjusted step by step in an iterative fashion until the best clustering result is reached.
2. Algorithm Description
- Generate K initial centroids at randomly chosen positions.
- While any sample's cluster assignment still changes:
  - compute the distance between every sample and each centroid;
  - assign each sample to the cluster of its nearest centroid;
  - for every cluster, recompute its centroid as the mean position of all its members.
- Repeat this process until no sample changes its cluster or the preset maximum number of iterations is reached (the objective this procedure minimizes is sketched below).
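For reference, the quantity this procedure minimizes is the within-cluster sum of squared errors (standard K-Means notation; the formula itself is not spelled out in the original post):

$$E=\sum_{j=1}^{K}\sum_{x_i\in C_j}\lVert x_i-\mu_j\rVert^{2}$$

where $C_j$ is the j-th cluster and $\mu_j$ is its centroid (the mean of its members).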
3. Implementation
The code below uses the Euclidean distance to measure how far each sample lies from each centroid.
Because the samples have 12 dimensions and cannot be visualized directly in such a high-dimensional space, the final results are reported numerically.
% Read the dataset; data is a 4898-by-12 matrix whose 12th column is the label
data = xlsread('wine-white.xls');
% Separate the sample features from the sample labels
feature = data(:, 1:11);
label = data(:, 12);
% Normalize the sample features to the range [0, 1]
flattened_feature = feature(:)';
mapped_flattened_feature = mapminmax(flattened_feature, 0, 1);
feature = reshape(mapped_flattened_feature, size(feature));
% K is the number of clusters; here K = 5
K = 5;
% Randomly pick K samples from feature as the initial cluster centers
% data_num : number of samples
% temp     : K randomly generated sample indices
% center   : the K selected cluster centers
data_num = size(feature, 1);
temp = randperm(data_num, K)';
center = feature(temp, :);
% iteration counts the number of iterations
iteration = 0;
% Start iterating
while 1
    % distance holds the squared Euclidean distances between all samples and all centers;
    % it is an M-by-K matrix, where M is the sample count and K the number of clusters
    distance = ou_distance(feature, center);
    % Sort each row of distance in ascending order; index stores the sorted cluster indices
    [~, index] = sort(distance, 2, 'ascend');
    % Compute the new cluster centers
    center_new = zeros(K, size(feature, 2));
    for i = 1:K
        class_i_feature = feature(index(:, 1) == i, :);
        center_new(i, :) = mean(class_i_feature, 1);
    end
    % Update the iteration counter
    iteration = iteration + 1;
    % Print the current iteration count
    fprintf('Current iteration: %d\n', iteration);
    % Stop iterating if the cluster centers did not change since the last iteration
    if isequal(center_new, center)
        break;
    end
    % Otherwise replace the old centers with the new ones
    center = center_new;
end
% result stores the final cluster assignment of every sample
result = index(:, 1);
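The helper ou_distance called inside the loop is not shown in the post. A minimal sketch that matches the comments (an M-by-K matrix of squared Euclidean distances between samples and centers) could look like this; the function body below is an assumption, not the author's original helper:

```matlab
function distance = ou_distance(feature, center)
% Assumed helper: squared Euclidean distance between every sample and every center.
% feature : M-by-D sample matrix; center : K-by-D matrix of cluster centers.
% Returns an M-by-K matrix whose (i, k) entry is ||feature(i,:) - center(k,:)||^2.
M = size(feature, 1);
K = size(center, 1);
distance = zeros(M, K);
for k = 1:K
    diff = feature - repmat(center(k, :), M, 1);   % differences to the k-th center
    distance(:, k) = sum(diff .^ 2, 2);            % row-wise squared norms
end
end
```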

Results:

Cluster center matrix:

Comparison of selected clustering results with the sample labels:
The cluster numbers 1 through 5 denote the different clusters obtained. The quality scores of the wine samples in the dataset range from 4 to 8, so the samples are divided into five classes; comparing the clustering result with the labels shows that cluster 1 corresponds to quality 4, cluster 2 to quality 5, cluster 3 to quality 6, cluster 4 to quality 7, and cluster 5 to quality 8.
III. Clustering with the Fuzzy C-Means Algorithm
1. Basic Idea
The FCM algorithm is a clustering method that uses membership degrees to express to what extent each data point belongs to each of the clusters.
2. Algorithm Description
- Normalize the data;
- Build the fuzzy similarity relation and initialize the fuzzy membership matrix;
- Start iterating and keep updating the membership matrix until the objective function (see the formulas below) reaches its minimum;
- When the iteration terminates, assign each sample to a class according to its final membership degrees and display the clustering result.
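For reference, the objective function and update rules used by FCM are the standard ones below; they are not written out in the original post, but they match the stepfcm subfunction in the code, with m the fuzziness exponent (options(1)) and C the number of clusters:

$$J_m=\sum_{i=1}^{N}\sum_{j=1}^{C}u_{ij}^{\,m}\,\lVert x_i-c_j\rVert^{2},\qquad c_j=\frac{\sum_{i=1}^{N}u_{ij}^{\,m}x_i}{\sum_{i=1}^{N}u_{ij}^{\,m}},\qquad u_{ij}=\frac{1}{\sum_{k=1}^{C}\left(\lVert x_i-c_j\rVert/\lVert x_i-c_k\rVert\right)^{2/(m-1)}}$$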
3. Implementation
Main function:
function [center, U, obj_fun] = FCMCluster(data, n, options)
% Inputs
%   data    : sample matrix with one sample per row and m features per sample
%   n       : number of clusters
%   options : 4-by-1 vector
%     options(1): weighting exponent of the membership matrix U
%     options(2): maximum number of iterations
%     options(3): minimum improvement of the objective function (stopping criterion)
%     options(4): flag controlling whether each iteration prints its progress
% Outputs
%   center  : cluster centers
%   U       : membership matrix
%   obj_fun : objective function values
if nargin ~= 2 && nargin ~= 3
    error('Too many or too few input arguments');
end
data_n = size(data, 1);
in_n = size(data, 2);
% Default parameters
default_options = [2; 100; 1e-5; 1];
% Parameter handling: with only two input arguments all defaults are used;
% with fewer than four option values the remaining ones fall back to the defaults
if nargin == 2
    options = default_options;
else
    if length(options) < 4
        tmp = default_options;
        tmp(1:length(options)) = options;
        options = tmp;
    end
    nan_index = find(isnan(options) == 1);
    options(nan_index) = default_options(nan_index);
    if options(1) <= 1
        error('The exponent should be greater than 1!');
    end
end
% Unpack the option vector into four variables
expo = options(1);
max_iter = options(2);
min_impro = options(3);
display = options(4);
obj_fun = zeros(max_iter, 1);
% Initialize the fuzzy membership matrix
U = initfcm(n, data_n);
% Main loop
for i = 1:max_iter
    [U, center, obj_fun(i)] = stepfcm(data, U, n, expo);
    if display
        fprintf('FCM:Iteration count=%d,obj_fun=%f\n', i, obj_fun(i));
    end
    % Check the stopping criterion
    if i > 1
        if abs(obj_fun(i) - obj_fun(i-1)) < min_impro
            break;
        end
    end
end
iter_n = i;
obj_fun(iter_n+1:max_iter) = [];
end

%% Subfunction: initialize the fuzzy membership matrix
function U = initfcm(n, data_n)
U = rand(n, data_n);
col_sum = sum(U);
U = U ./ col_sum(ones(n, 1), :);
end

%% Subfunction: one clustering step
function [U_new, center, obj_fun] = stepfcm(data, U, n, expo)
mf = U.^expo;                                                   % weighted memberships
center = mf * data ./ ((ones(size(data, 2), 1) * sum(mf'))');   % update the centers
dist = distfcm(center, data);                                   % distances to the centers
obj_fun = sum(sum((dist.^2) .* mf));                            % objective function value
tmp = dist.^(-2/(expo-1));
U_new = tmp ./ (ones(n, 1) * sum(tmp));                         % update the memberships
end

%% Subfunction: compute distances between centers and samples
function out = distfcm(center, data)
out = zeros(size(center, 1), size(data, 1));
for k = 1:size(center, 1)
    out(k, :) = sqrt(sum(((data - ones(size(data, 1), 1) * center(k, :)).^2)', 1));
end
end

Load the data and display the clustering result:
data = xlsread('wine-white.xls');
options = [2; 100; 1e-5; 1];
[center, U, obj_fcn] = FCMCluster(data, 5, options);
plot(data(:, 1), data(:, 2), 'o');
hold on;
index1 = find(U(1, :) == max(U));   % indices of the samples assigned to cluster 1
index2 = find(U(2, :) == max(U));   % indices of the samples assigned to cluster 2
index3 = find(U(3, :) == max(U));   % indices of the samples assigned to cluster 3
index4 = find(U(4, :) == max(U));   % indices of the samples assigned to cluster 4
index5 = find(U(5, :) == max(U));   % indices of the samples assigned to cluster 5

The five index variables store, for each cluster, which samples were assigned to it after clustering.
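As a side note, the five index vectors can equivalently be collapsed into a single assignment vector by taking, for every sample, the cluster with the largest membership. This is only a small sketch based on the U returned above, not part of the original script:

```matlab
% Sketch (assumption): derive one label per sample from the membership matrix U.
[~, fcm_result] = max(U);    % 1-by-N vector; entry i is the cluster (1..5) of sample i
fcm_result = fcm_result';    % column vector, aligned with the rows of data
```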
Run 1:
Clustering results for some of the samples:

Output:
FCM:Iteration count=1,obj_fun=2743902.770422
FCM:Iteration count=2,obj_fun=2076323.760379
FCM:Iteration count=3,obj_fun=2051421.749670
FCM:Iteration count=4,obj_fun=1908660.289224
FCM:Iteration count=5,obj_fun=1582373.766693
FCM:Iteration count=6,obj_fun=1295957.927634
FCM:Iteration count=7,obj_fun=1132566.776316
FCM:Iteration count=8,obj_fun=1051959.473996
FCM:Iteration count=9,obj_fun=1012192.751587
FCM:Iteration count=10,obj_fun=991833.196114
FCM:Iteration count=11,obj_fun=980955.638748
FCM:Iteration count=12,obj_fun=974959.961651
FCM:Iteration count=13,obj_fun=971595.185669
FCM:Iteration count=14,obj_fun=969689.571113
FCM:Iteration count=15,obj_fun=968604.944041
FCM:Iteration count=16,obj_fun=967984.966835
FCM:Iteration count=17,obj_fun=967628.557690
FCM:Iteration count=18,obj_fun=967421.889673
FCM:Iteration count=19,obj_fun=967300.512265
FCM:Iteration count=20,obj_fun=967227.940078
FCM:Iteration count=21,obj_fun=967183.507021
FCM:Iteration count=22,obj_fun=967155.483395
FCM:Iteration count=23,obj_fun=967137.184327
FCM:Iteration count=24,obj_fun=967124.774214
FCM:Iteration count=25,obj_fun=967116.029798
FCM:Iteration count=26,obj_fun=967109.643648
FCM:Iteration count=27,obj_fun=967104.831701
FCM:Iteration count=28,obj_fun=967101.111717
FCM:Iteration count=29,obj_fun=967098.177772
FCM:Iteration count=30,obj_fun=967095.828775
FCM:Iteration count=31,obj_fun=967093.927415
FCM:Iteration count=32,obj_fun=967092.376320
FCM:Iteration count=33,obj_fun=967091.103984
FCM:Iteration count=34,obj_fun=967090.056289
FCM:Iteration count=35,obj_fun=967089.191263
FCM:Iteration count=36,obj_fun=967088.475728
FCM:Iteration count=37,obj_fun=967087.883082
FCM:Iteration count=38,obj_fun=967087.391773
FCM:Iteration count=39,obj_fun=967086.984211
FCM:Iteration count=40,obj_fun=967086.645966
FCM:Iteration count=41,obj_fun=967086.365156
FCM:Iteration count=42,obj_fun=967086.131971
FCM:Iteration count=43,obj_fun=967085.938299
FCM:Iteration count=44,obj_fun=967085.777421
FCM:Iteration count=45,obj_fun=967085.643770
FCM:Iteration count=46,obj_fun=967085.532728
FCM:Iteration count=47,obj_fun=967085.440464
FCM:Iteration count=48,obj_fun=967085.363797
FCM:Iteration count=49,obj_fun=967085.300089
FCM:Iteration count=50,obj_fun=967085.247146
FCM:Iteration count=51,obj_fun=967085.203147
FCM:Iteration count=52,obj_fun=967085.166581
FCM:Iteration count=53,obj_fun=967085.136191
FCM:Iteration count=54,obj_fun=967085.110932
FCM:Iteration count=55,obj_fun=967085.089939
FCM:Iteration count=56,obj_fun=967085.072490
FCM:Iteration count=57,obj_fun=967085.057987
FCM:Iteration count=58,obj_fun=967085.045932
FCM:Iteration count=59,obj_fun=967085.035912
FCM:Iteration count=60,obj_fun=967085.027583
FCM:Iteration count=61,obj_fun=967085.020659
FCM:Iteration count=62,obj_fun=967085.014904
FCM:Iteration count=63,obj_fun=967085.010121
FCM:Iteration count=64,obj_fun=967085.006144
FCM:Iteration count=65,obj_fun=967085.002839
FCM:Iteration count=66,obj_fun=967085.000091
FCM:Iteration count=67,obj_fun=967084.997807
FCM:Iteration count=68,obj_fun=967084.995908
FCM:Iteration count=69,obj_fun=967084.994329
FCM:Iteration count=70,obj_fun=967084.993017
FCM:Iteration count=71,obj_fun=967084.991926
FCM:Iteration count=72,obj_fun=967084.991019
FCM:Iteration count=73,obj_fun=967084.990266
FCM:Iteration count=74,obj_fun=967084.989639
FCM:Iteration count=75,obj_fun=967084.989118
FCM:Iteration count=76,obj_fun=967084.988685
FCM:Iteration count=77,obj_fun=967084.988325
FCM:Iteration count=78,obj_fun=967084.988026
FCM:Iteration count=79,obj_fun=967084.987777
FCM:Iteration count=80,obj_fun=967084.987570
FCM:Iteration count=81,obj_fun=967084.987398
FCM:Iteration count=82,obj_fun=967084.987255
FCM:Iteration count=83,obj_fun=967084.987136
FCM:Iteration count=84,obj_fun=967084.987038
FCM:Iteration count=85,obj_fun=967084.986955
FCM:Iteration count=86,obj_fun=967084.986887
FCM:Iteration count=87,obj_fun=967084.986830
FCM:Iteration count=88,obj_fun=967084.986783
FCM:Iteration count=89,obj_fun=967084.986744
FCM:Iteration count=90,obj_fun=967084.986711
FCM:Iteration count=91,obj_fun=967084.986684
FCM:Iteration count=92,obj_fun=967084.986662
FCM:Iteration count=93,obj_fun=967084.986643
FCM:Iteration count=94,obj_fun=967084.986627
FCM:Iteration count=95,obj_fun=967084.986614
FCM:Iteration count=96,obj_fun=967084.986604
FCM:Iteration count=97,obj_fun=967084.986595
Cluster centers:

Time:
Run 2:
Clustering results for some of the samples:

Output:
FCM:Iteration count=1,obj_fun=2753544.591951
FCM:Iteration count=2,obj_fun=2076038.796116
FCM:Iteration count=3,obj_fun=2048870.380586
FCM:Iteration count=4,obj_fun=1890841.533324
FCM:Iteration count=5,obj_fun=1532261.998775
FCM:Iteration count=6,obj_fun=1244917.031703
FCM:Iteration count=7,obj_fun=1099510.682752
FCM:Iteration count=8,obj_fun=1029529.264145
FCM:Iteration count=9,obj_fun=996681.841108
FCM:Iteration count=10,obj_fun=981883.001456
FCM:Iteration count=11,obj_fun=974961.651558
FCM:Iteration count=12,obj_fun=971483.755696
FCM:Iteration count=13,obj_fun=969624.643901
FCM:Iteration count=14,obj_fun=968590.742682
FCM:Iteration count=15,obj_fun=968001.719130
FCM:Iteration count=16,obj_fun=967660.072525
FCM:Iteration count=17,obj_fun=967458.312698
FCM:Iteration count=18,obj_fun=967336.571884
FCM:Iteration count=19,obj_fun=967261.129006
FCM:Iteration count=20,obj_fun=967212.856436
FCM:Iteration count=21,obj_fun=967180.830213
FCM:Iteration count=22,obj_fun=967158.756531
FCM:Iteration count=23,obj_fun=967142.964807
FCM:Iteration count=24,obj_fun=967131.278313
FCM:Iteration count=25,obj_fun=967122.377446
FCM:Iteration count=26,obj_fun=967115.439714
FCM:Iteration count=27,obj_fun=967109.935298
FCM:Iteration count=28,obj_fun=967105.510156
FCM:Iteration count=29,obj_fun=967101.918555
FCM:Iteration count=30,obj_fun=967098.983597
FCM:Iteration count=31,obj_fun=967096.573701
FCM:Iteration count=32,obj_fun=967094.588264
FCM:Iteration count=33,obj_fun=967092.948664
FCM:Iteration count=34,obj_fun=967091.592415
FCM:Iteration count=35,obj_fun=967090.469234
FCM:Iteration count=36,obj_fun=967089.538289
FCM:Iteration count=37,obj_fun=967088.766215
FCM:Iteration count=38,obj_fun=967088.125615
FCM:Iteration count=39,obj_fun=967087.593926
FCM:Iteration count=40,obj_fun=967087.152521
FCM:Iteration count=41,obj_fun=967086.785998
FCM:Iteration count=42,obj_fun=967086.481607
FCM:Iteration count=43,obj_fun=967086.228783
FCM:Iteration count=44,obj_fun=967086.018769
FCM:Iteration count=45,obj_fun=967085.844301
FCM:Iteration count=46,obj_fun=967085.699351
FCM:Iteration count=47,obj_fun=967085.578919
FCM:Iteration count=48,obj_fun=967085.478850
FCM:Iteration count=49,obj_fun=967085.395698
FCM:Iteration count=50,obj_fun=967085.326600
FCM:Iteration count=51,obj_fun=967085.269179
FCM:Iteration count=52,obj_fun=967085.221459
FCM:Iteration count=53,obj_fun=967085.181800
FCM:Iteration count=54,obj_fun=967085.148839
FCM:Iteration count=55,obj_fun=967085.121445
FCM:Iteration count=56,obj_fun=967085.098677
FCM:Iteration count=57,obj_fun=967085.079753
FCM:Iteration count=58,obj_fun=967085.064023
FCM:Iteration count=59,obj_fun=967085.050949
FCM:Iteration count=60,obj_fun=967085.040082
FCM:Iteration count=61,obj_fun=967085.031049
FCM:Iteration count=62,obj_fun=967085.023541
FCM:Iteration count=63,obj_fun=967085.017300
FCM:Iteration count=64,obj_fun=967085.012112
FCM:Iteration count=65,obj_fun=967085.007799
FCM:Iteration count=66,obj_fun=967085.004214
FCM:Iteration count=67,obj_fun=967085.001234
FCM:Iteration count=68,obj_fun=967084.998757
FCM:Iteration count=69,obj_fun=967084.996698
FCM:Iteration count=70,obj_fun=967084.994986
FCM:Iteration count=71,obj_fun=967084.993563
FCM:Iteration count=72,obj_fun=967084.992380
FCM:Iteration count=73,obj_fun=967084.991397
FCM:Iteration count=74,obj_fun=967084.990579
FCM:Iteration count=75,obj_fun=967084.989900
FCM:Iteration count=76,obj_fun=967084.989335
FCM:Iteration count=77,obj_fun=967084.988865
FCM:Iteration count=78,obj_fun=967084.988475
FCM:Iteration count=79,obj_fun=967084.988150
FCM:Iteration count=80,obj_fun=967084.987880
FCM:Iteration count=81,obj_fun=967084.987656
FCM:Iteration count=82,obj_fun=967084.987470
FCM:Iteration count=83,obj_fun=967084.987315
FCM:Iteration count=84,obj_fun=967084.987186
FCM:Iteration count=85,obj_fun=967084.987079
FCM:Iteration count=86,obj_fun=967084.986990
FCM:Iteration count=87,obj_fun=967084.986916
FCM:Iteration count=88,obj_fun=967084.986854
FCM:Iteration count=89,obj_fun=967084.986803
FCM:Iteration count=90,obj_fun=967084.986760
FCM:Iteration count=91,obj_fun=967084.986725
FCM:Iteration count=92,obj_fun=967084.986696
FCM:Iteration count=93,obj_fun=967084.986671
FCM:Iteration count=94,obj_fun=967084.986651
FCM:Iteration count=95,obj_fun=967084.986634
FCM:Iteration count=96,obj_fun=967084.986620
FCM:Iteration count=97,obj_fun=967084.986608
FCM:Iteration count=98,obj_fun=967084.986598
Cluster centers:


Run 3:
Clustering results for some of the samples:





Output: the objective function decreases sharply over the first few iterations, then falls slowly, and has essentially stabilized after roughly 86 iterations.
Cluster centers:


Run 4:
Clustering results for some of the samples:

Output:
FCM:Iteration count=1,obj_fun=2753857.668739
FCM:Iteration count=2,obj_fun=2076176.569146
FCM:Iteration count=3,obj_fun=2050461.913577
FCM:Iteration count=4,obj_fun=1907009.977523
FCM:Iteration count=5,obj_fun=1584620.764112
FCM:Iteration count=6,obj_fun=1321169.930421
FCM:Iteration count=7,obj_fun=1197360.224126
FCM:Iteration count=8,obj_fun=1121628.918434
FCM:Iteration count=9,obj_fun=1071565.886965
FCM:Iteration count=10,obj_fun=1038851.685429
FCM:Iteration count=11,obj_fun=1012320.862071
FCM:Iteration count=12,obj_fun=994904.196648
FCM:Iteration count=13,obj_fun=985389.764491
FCM:Iteration count=14,obj_fun=980011.164248
FCM:Iteration count=15,obj_fun=976643.159496
FCM:Iteration count=16,obj_fun=974354.922836
FCM:Iteration count=17,obj_fun=972717.675964
FCM:Iteration count=18,obj_fun=971506.876741
FCM:Iteration count=19,obj_fun=970590.821903
FCM:Iteration count=20,obj_fun=969885.980617
FCM:Iteration count=21,obj_fun=969336.537549
FCM:Iteration count=22,obj_fun=968903.784999
FCM:Iteration count=23,obj_fun=968560.100223
FCM:Iteration count=24,obj_fun=968285.308165
FCM:Iteration count=25,obj_fun=968064.386248
FCM:Iteration count=26,obj_fun=967885.965949
FCM:Iteration count=27,obj_fun=967741.326257
FCM:Iteration count=28,obj_fun=967623.700423
FCM:Iteration count=29,obj_fun=967527.787643
FCM:Iteration count=30,obj_fun=967449.402085
FCM:Iteration count=31,obj_fun=967385.216052
FCM:Iteration count=32,obj_fun=967332.569003
FCM:Iteration count=33,obj_fun=967289.323558
FCM:Iteration count=34,obj_fun=967253.755587
FCM:Iteration count=35,obj_fun=967224.469433
FCM:Iteration count=36,obj_fun=967200.331924
FCM:Iteration count=37,obj_fun=967180.420603
FCM:Iteration count=38,obj_fun=967163.982844
FCM:Iteration count=39,obj_fun=967150.403358
FCM:Iteration count=40,obj_fun=967139.178263
FCM:Iteration count=41,obj_fun=967129.894273
FCM:Iteration count=42,obj_fun=967122.211951
FCM:Iteration count=43,obj_fun=967115.852176
FCM:Iteration count=44,obj_fun=967110.585177
FCM:Iteration count=45,obj_fun=967106.221631
FCM:Iteration count=46,obj_fun=967102.605404
FCM:Iteration count=47,obj_fun=967099.607636
FCM:Iteration count=48,obj_fun=967097.121905
FCM:Iteration count=49,obj_fun=967095.060262
FCM:Iteration count=50,obj_fun=967093.349987
FCM:Iteration count=51,obj_fun=967091.930920
FCM:Iteration count=52,obj_fun=967090.753270
FCM:Iteration count=53,obj_fun=967089.775809
FCM:Iteration count=54,obj_fun=967088.964390
FCM:Iteration count=55,obj_fun=967088.290719
FCM:Iteration count=56,obj_fun=967087.731344
FCM:Iteration count=57,obj_fun=967087.266824
FCM:Iteration count=58,obj_fun=967086.881035
FCM:Iteration count=59,obj_fun=967086.560605
FCM:Iteration count=60,obj_fun=967086.294440
FCM:Iteration count=61,obj_fun=967086.073332
FCM:Iteration count=62,obj_fun=967085.889643
FCM:Iteration count=63,obj_fun=967085.737030
FCM:Iteration count=64,obj_fun=967085.610229
FCM:Iteration count=65,obj_fun=967085.504869
FCM:Iteration count=66,obj_fun=967085.417321
FCM:Iteration count=67,obj_fun=967085.344569
FCM:Iteration count=68,obj_fun=967085.284112
FCM:Iteration count=69,obj_fun=967085.233869
FCM:Iteration count=70,obj_fun=967085.192114
FCM:Iteration count=71,obj_fun=967085.157412
FCM:Iteration count=72,obj_fun=967085.128570
FCM:Iteration count=73,obj_fun=967085.104599
FCM:Iteration count=74,obj_fun=967085.084675
FCM:Iteration count=75,obj_fun=967085.068115
FCM:Iteration count=76,obj_fun=967085.054350
FCM:Iteration count=77,obj_fun=967085.042909
FCM:Iteration count=78,obj_fun=967085.033399
FCM:Iteration count=79,obj_fun=967085.025494
FCM:Iteration count=80,obj_fun=967085.018923
FCM:Iteration count=81,obj_fun=967085.013461
FCM:Iteration count=82,obj_fun=967085.008921
FCM:Iteration count=83,obj_fun=967085.005147
FCM:Iteration count=84,obj_fun=967085.002010
FCM:Iteration count=85,obj_fun=967084.999402
FCM:Iteration count=86,obj_fun=967084.997234
FCM:Iteration count=87,obj_fun=967084.995432
FCM:Iteration count=88,obj_fun=967084.993933
FCM:Iteration count=89,obj_fun=967084.992688
FCM:Iteration count=90,obj_fun=967084.991653
FCM:Iteration count=91,obj_fun=967084.990792
FCM:Iteration count=92,obj_fun=967084.990077
FCM:Iteration count=93,obj_fun=967084.989482
FCM:Iteration count=94,obj_fun=967084.988987
FCM:Iteration count=95,obj_fun=967084.988576
FCM:Iteration count=96,obj_fun=967084.988235
FCM:Iteration count=97,obj_fun=967084.987951
FCM:Iteration count=98,obj_fun=967084.987714
FCM:Iteration count=99,obj_fun=967084.987518
FCM:Iteration count=100,obj_fun=967084.987355
Cluster centers:


Run 5:
Clustering results for some of the samples:





Output: the objective function again decreases monotonically as the iterations proceed until it converges, following the same pattern as the previous runs.
...
Cluster centers:


Analysis:
Across the repeated runs, the cluster assignments differ noticeably from run to run, while the resulting cluster centers remain fairly close to one another, that is, the clusters overlap considerably. Fuzzy C-Means uses membership degrees to quantify how strongly each sample belongs to each cluster; compared with K-Means, which only makes hard 0/1 assignments, this gives it an advantage in classification accuracy.
IV. Comparison of K-Means and Fuzzy C-Means Results
| Clustering method | Time | Average accuracy (estimated) |
|---|---|---|
| K-Means | about 1 s | 50% |
| Fuzzy C-Means | about 2 s | 57% |
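The post does not show how the accuracy figures were estimated. One plausible way, assuming the cluster-to-quality mapping described in Section II (cluster 1 to quality 4, ..., cluster 5 to quality 8), is sketched below; the mapping and the variable names are assumptions, not the author's code:

```matlab
% Sketch (assumption): estimate accuracy by mapping cluster numbers 1..5 to
% quality scores 4..8 and counting how many samples match their true label.
% 'result' is the K-Means assignment vector and 'label' the quality column.
predicted_quality = result + 3;                 % cluster 1 -> 4, ..., cluster 5 -> 8
accuracy = mean(predicted_quality == label);    % fraction of matching samples
fprintf('Estimated accuracy: %.2f%%\n', 100 * accuracy);
```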
V. Algorithm Comparison
K-Means clustering: a hard-partitioning unsupervised learning algorithm that plays an important role in machine learning. It belongs to the family of prototype-based, objective-function clustering methods and partitions the data objects into disjoint groups (each membership is either 0 or 1). Its criterion is to minimize the sum of squared errors between the samples in each cluster and that cluster's center. The main advantages of K-Means are:
- The algorithm is efficient and intuitive;
- It handles large data sets efficiently and scales well;
- Its time complexity is close to linear, which makes it well suited to large-scale data mining; the complexity is O(nkt), where n is the number of objects in the data set, t the number of iterations, and k the number of clusters.
Disadvantages of K-Means:
- The partition depends on the initial cluster centers; a poor initialization can lead to a poor clustering result;
- The algorithm must repeatedly reassign samples and recompute the cluster centers, which is time-consuming on very large data sets;
- The number of clusters K must be specified in advance, and choosing an appropriate K is difficult in practice.
Fuzzy C-Means is a fuzzy clustering technique built on the k-means method. Its defining feature is that a sample's membership in a cluster can take any value in [0, 1] rather than belonging strictly to a single center; the algorithm is constructed around the principle of minimizing the membership-weighted sum of squared errors within the clusters. Its main advantages:
- It works well on data that roughly follow a normal distribution and produces good clustering results on such data;
- The method is simple and can be applied to a wide range of problems, performing well across many application domains.
Disadvantages of Fuzzy C-Means:
- The algorithm only achieves good clustering results when the number of clusters c and the fuzziness exponent m are chosen appropriately: c must be much smaller than the total number of samples and greater than 1, and the value of m also matters, since a value that is too large makes the algorithm converge to a poor solution, while a value that is too small reduces it to ordinary k-means;
- The algorithm easily gets stuck in a local optimum.
In summary, both approaches can be used to cluster the data, but neither guarantees a globally optimal solution, and both can become trapped in local minima.
