Cite this article: | PAN Jin-yan, GAO Peng, GAO Yun-long, XIE You-wei, XIONG Yu-hui. Reliability-based robust fuzzy clustering[J]. Control Theory & Applications, 2021, 38(4): 516~528. |
|
Reliability-based robust fuzzy clustering |
Received: 2020-07-23 Revised: 2020-11-12 |
DOI 10.7641/CTA.2020.00480 |
2021,38(4):516-528 |
Keywords fuzzy C-means (FCM); size imbalance; ensemble learning; k-nearest neighbor constraint; local information |
Funding Supported by the National Natural Science Foundation of China (61203176) and the Natural Science Foundation of Fujian Province (2013J05098, 2016J01756). |
|
Abstract |
Compared with the k-means algorithm, fuzzy C-means (FCM) introduces fuzzy membership degrees to account for the
interaction between different data clusters, and thereby avoids the problem of cluster centers coinciding. However,
fuzzy membership degrees exhibit trailing and upturned-tail ("warp-tail") behavior, which makes FCM very sensitive
to noise points and outliers. In addition, FCM tends to partition the data into clusters of equal size, so it is also
sensitive to cluster size and performs poorly on imbalanced data clusters. To address these problems, this paper
proposes a reliability-based robust fuzzy clustering algorithm (RRFCM). Starting from the current clustering result,
the algorithm analyzes the reliability of each sample point and uses this reliability together with local neighbor
information to emphasize the separability between different data clusters. This improves robustness to noise, reduces
sensitivity to imbalanced cluster sizes, and yields clustering results with better generalization performance. Among
the related algorithms compared, RRFCM achieves the best results on artificial data sets, UCI real-world data sets,
and image segmentation experiments. |
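
The abstract does not give the RRFCM update rules, so the following is only a minimal sketch of the standard FCM baseline that it criticizes, not the authors' method. It may help show where the noise sensitivity comes from: a point's memberships depend only on the ratios of its distances to the centers, so a point far from every center still receives roughly 1/c membership in each of the c clusters and keeps pulling on all of them. The function names (`fcm_memberships`, `fcm_update_centers`, `fcm`), the default fuzzifier `m = 2`, the fixed iteration count, and the sample coordinates are all illustrative assumptions.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard FCM memberships of one point in each cluster (fuzzifier m > 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with one or more centers belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    inv = [d ** -power for d in dists]
    total = sum(inv)
    # Memberships are normalized, so they always sum to 1 over the clusters.
    return [v / total for v in inv]


def fcm_update_centers(points, memberships, m=2.0):
    """New centers as weighted means of the points, with weights u_ik ** m."""
    n_clusters = len(memberships[0])
    dim = len(points[0])
    centers = []
    for k in range(n_clusters):
        weights = [u[k] ** m for u in memberships]
        total = sum(weights)
        centers.append([
            sum(w * p[j] for w, p in zip(weights, points)) / total
            for j in range(dim)
        ])
    return centers


def fcm(points, centers, m=2.0, n_iter=50):
    """Alternate the two updates for a fixed number of iterations (assumed)."""
    for _ in range(n_iter):
        memberships = [fcm_memberships(p, centers, m) for p in points]
        centers = fcm_update_centers(points, memberships, m)
    # Recompute memberships so they match the returned centers.
    memberships = [fcm_memberships(p, centers, m) for p in points]
    return centers, memberships


if __name__ == "__main__":
    centers = [[0.0, 0.0], [10.0, 0.0]]
    print(fcm_memberships([1.0, 0.0], centers))     # close to the first center
    print(fcm_memberships([1000.0, 0.0], centers))  # far from both centers
```

In this sketch the far-away point ends up with nearly equal membership in both clusters, which is the behavior the abstract attributes to the tail of the membership function. The reliability analysis and k-nearest-neighbor constraint of RRFCM are not reproduced here because the abstract does not define them.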
|
|
|
|
|