Density-sensitive fuzzy kernel maximum entropy clustering algorithm

LI Ye-tong; GUO Jie; QI Lin; LIU Xuan; RUAN Peng-yu; TAO Xin-min

Cite this article: LI Ye-tong, GUO Jie, QI Lin, LIU Xuan, RUAN Peng-yu, TAO Xin-min. Density-sensitive fuzzy kernel maximum entropy clustering algorithm[J]. Control Theory & Applications, 2022, 39(1): 67-82.

Received: 2021-02-26 Revised: 2021-04-26

DOI: 10.7641/CTA.2021.10168

2022,39(1):67-82

Keywords: clustering; relative density; maximum entropy clustering algorithm; robustness

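Funding: Supported by the National Natural Science Foundation of China (62176050), the Fundamental Research Funds for the Central Universities (2572017EB02), the Double First-Class Scientific Research Start-up Fund of Northeast Forestry University (411112438), the Innovative Talent Fund of the Harbin Science and Technology Bureau (2017RAXXJ018), and the College Students' Innovation and Entrepreneurship Training Program of Northeast Forestry University (202010225188).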

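Author	Affiliation	E-mail
LI Ye-tong	College of Engineering and Technology, Northeast Forestry University	liyetong8168@163.com
GUO Jie	College of Engineering and Technology, Northeast Forestry University
QI Lin	College of Engineering and Technology, Northeast Forestry University
LIU Xuan	College of Engineering and Technology, Northeast Forestry University
RUAN Peng-yu	College of Engineering and Technology, Northeast Forestry University
TAO Xin-min^*	College of Engineering and Technology, Northeast Forestry University	taoxinmin@nefu.edu.cn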

Abstract

To address the clustering of nonlinear, non-Gaussian datasets, a density-sensitive fuzzy kernel maximum entropy clustering algorithm is proposed. The algorithm first maps the original nonlinear, non-Gaussian dataset into a kernel space through a kernel function. It then uses the similarity measured by the kernel function to cancel the interference that samples not belonging to a cluster exert on the computation of that cluster's center. This eliminates the influence of the regularization coefficient on the clustering result and thereby suppresses the tendency of the cluster centers in the traditional maximum entropy clustering algorithm to coincide. Finally, a relative density term is introduced to correct the bias in the cluster centers caused by differences in how the samples are distributed in the feature space, which improves the accuracy of the clustering results. In the experimental part, the relationships among the algorithm's parameters and their influence on the clustering results are discussed. The algorithm is compared with the traditional fuzzy C-means clustering algorithm, the kernel fuzzy C-means clustering algorithm, the maximum entropy clustering algorithm, the maximum entropy normalized weight kernel fuzzy C-means clustering algorithm, and two other improved maximum entropy clustering algorithms. The results show that the proposed density-sensitive fuzzy kernel maximum entropy clustering algorithm clearly outperforms the other algorithms.
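The kernel step described in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes a Gaussian kernel, the textbook maximum entropy membership (a softmax over kernel-space distances), and a center update in which each sample is weighted by its membership times its kernel similarity to the center. The names `gaussian_kernel`, `kernel_mec`, `sigma`, and `gamma` are hypothetical, and the relative density term is omitted because the abstract does not define it.

```python
import math


def gaussian_kernel(x, v, sigma):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / (2 * sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq / (2.0 * sigma ** 2))


def kernel_mec(points, centers, sigma=1.0, gamma=0.5, iters=50):
    """Illustrative kernel maximum entropy clustering (assumed form).

    points  : list of coordinate tuples
    centers : initial cluster centers, list of coordinate tuples
    sigma   : Gaussian kernel width (assumed parameter)
    gamma   : entropy regularization coefficient (assumed parameter)
    Returns the final centers and the membership rows (one row per point).
    """
    centers = [list(c) for c in centers]
    dim = len(points[0])
    memberships = []
    for _ in range(iters):
        # Kernel similarity between every center i and every point j.
        k = [[gaussian_kernel(x, v, sigma) for x in points] for v in centers]

        # Membership: softmax of the negative kernel-space distance 2 * (1 - K).
        memberships = []
        for j in range(len(points)):
            w = [math.exp(-2.0 * (1.0 - k[i][j]) / gamma)
                 for i in range(len(centers))]
            total = sum(w)
            memberships.append([wi / total for wi in w])

        # Center update: weight = membership * kernel similarity, so samples
        # far from a center contribute almost nothing to it.
        for i in range(len(centers)):
            weights = [memberships[j][i] * k[i][j] for j in range(len(points))]
            total = sum(weights)
            centers[i] = [
                sum(wt * x[d] for wt, x in zip(weights, points)) / total
                for d in range(dim)
            ]
    return centers, memberships


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2),
            (5.0, 5.0), (5.2, 4.9), (4.9, 5.1), (5.1, 5.2)]
    final_centers, u = kernel_mec(data, [(0.5, 0.5), (4.5, 4.5)])
    print(final_centers)
```

Under these assumptions, the kernel factor in the center update is what keeps distant samples from pulling every center toward the global mean, which is one plausible reading of how kernel similarity counters the coincidence of cluster centers mentioned in the abstract.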