Cite this article: LV Li, ZHU Mei-zi, KANG Ping, HAN Long-zhe. Multiplex neighbor density peaks clustering for uneven density data sets[J]. Control Theory & Applications, 2024, 41(10): 1821-1830.
Multiplex neighbor density peaks clustering for uneven density data sets
Received: 2022-07-01  Revised: 2024-05-18
DOI  10.7641/CTA.2023.20584
  2024,41(10):1821-1830
Keywords  density peaks clustering  local density  natural neighbors  shared neighbors  sample similarity
Funding  Supported by the National Natural Science Foundation of China (62066030, 61962036).
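Author  Affiliation  E-mail
LV Li*  Nanchang Institute of Technology  lvli623@163.com
ZHU Mei-zi  Nanchang Institute of Technology  1411918938@qq.com
KANG Ping  Nanchang Institute of Technology
HAN Long-zhe  Nanchang Institute of Technology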
Abstract
      The local density used by the density peaks clustering (DPC) algorithm ignores the difference between sparse and dense regions in data with uneven density distribution, so the cluster centers it selects tend to concentrate in dense regions. In addition, when DPC assigns the remaining samples, its allocation strategy tends to assign samples from sparse regions to clusters in dense regions, which degrades the clustering result. To overcome these shortcomings, this paper proposes a multiplex neighbor density peaks clustering algorithm for data with uneven density distribution (MN-DPC). First, natural neighbor information is used to define the local density of each sample, which balances the density difference between samples in sparse and dense regions so that cluster centers in sparse regions are found correctly. Second, the similarity between samples is weighted using shared neighbor and natural neighbor information, which strengthens the similarity between samples of the same cluster and effectively prevents samples in sparse regions from being misassigned. MN-DPC is compared with the IDPC-FA, DPC-DBFN, DPCSA, FNDPC, FKNN-DPC and DPC algorithms. The experimental results show that MN-DPC can effectively cluster data sets with uneven density distribution as well as UCI data sets.
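The abstract does not give the MN-DPC formulas, so the following is only a minimal sketch of the building blocks it names, not the authors' implementation. It shows the baseline DPC quantities (Gaussian-kernel local density rho and distance delta to the nearest denser sample), one common formulation of natural neighbor search (grow the neighborhood size until the number of samples that nobody lists as a neighbor reaches zero or stops changing), and a shared neighbor count. All function names, the cutoff `dc`, and the toy data are assumptions made for illustration.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distance matrix as a list of lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def dpc_rho_delta(dist, dc):
    """Baseline DPC: Gaussian-kernel density rho, and delta, the distance
    to the nearest sample with higher density (max distance for the densest)."""
    n = len(dist)
    rho = [sum(math.exp(-(dist[i][j] / dc) ** 2) for j in range(n) if j != i)
           for i in range(n)]
    order = sorted(range(n), key=lambda i: -rho[i])  # densest sample first
    delta = [0.0] * n
    delta[order[0]] = max(dist[order[0]])
    for rank in range(1, n):
        i = order[rank]
        delta[i] = min(dist[i][j] for j in order[:rank])
    return rho, delta


def natural_neighbor_search(dist):
    """Grow the neighborhood size r until every sample is someone's neighbor,
    or the number of samples without a reverse neighbor stops changing.
    Assumes at least two samples. Returns (r, knn), where knn[i] is the set
    of the r nearest neighbors of sample i."""
    n = len(dist)
    ranked = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
              for i in range(n)]
    knn = [set() for _ in range(n)]
    has_reverse = [False] * n
    prev_orphans = None
    for r in range(1, n):
        for i in range(n):
            j = ranked[i][r - 1]  # r-th nearest neighbor of i
            knn[i].add(j)
            has_reverse[j] = True
        orphans = has_reverse.count(False)
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans
    return r, knn


def natural_neighbors(knn):
    """Mutual neighbors: j is a natural neighbor of i if each lists the other."""
    return [{j for j in knn[i] if i in knn[j]} for i in range(len(knn))]


def shared_neighbor_count(knn, i, j):
    """Number of neighbors that samples i and j have in common."""
    return len(knn[i] & knn[j])


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy data: one dense blob and one sparse blob.
    dense = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(60)]
    sparse = [(rng.gauss(6, 1.5), rng.gauss(0, 1.5)) for _ in range(20)]
    dist = pairwise_distances(dense + sparse)
    rho, delta = dpc_rho_delta(dist, dc=0.5)  # dc chosen arbitrarily here
    r, knn = natural_neighbor_search(dist)
    nn = natural_neighbors(knn)
    print("natural neighborhood size:", r)
    print("mean baseline rho, dense vs sparse:",
          sum(rho[:60]) / 60, sum(rho[60:]) / 20)
    print("mean natural neighbor count, dense vs sparse:",
          sum(len(s) for s in nn[:60]) / 60, sum(len(s) for s in nn[60:]) / 20)
```

On data of this kind the baseline rho is usually far larger in the dense blob, which is the imbalance the abstract describes. The neighborhood size found by the natural neighbor search adapts to the data instead of being fixed in advance, which is presumably what makes it useful for defining a density that is comparable across regions. How MN-DPC combines these quantities into its density and similarity weights is specified in the full paper, not here.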