Cite this article: | 梁少军,张世荣,孙澜琼.基于最优密度方向的等距映射降维算法[J].控制理论与应用,2021,38(4):467~478. |
LIANG Shao-jun, ZHANG Shi-rong, SUN Lan-qiong. Optimal density direction based isometric mapping dimensionality reduction algorithm[J]. Control Theory & Applications, 2021, 38(4): 467~478. |
|
基于最优密度方向的等距映射降维算法 |
Optimal density direction based isometric mapping dimensionality reduction algorithm |
Received: 2020-07-14 Revised: 2021-03-22 |
DOI 10.7641/CTA.2020.00454 |
2021,38(4):467-478 |
Chinese keywords 等距映射; 流形学习; 自然邻居; 最优密度 |
English keywords ISOMAP; manifold learning; natural neighbor; optimal density |
Funding: Supported by the National Natural Science Foundation of China (51475337) and Army internal scientific research projects (LJ20182B050054, LJ20191C040483, LJ20202C020416, LJ20202C050412). |
|
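Chinese abstract (translated) |
Isometric mapping (ISOMAP) is a typical nonlinear manifold dimensionality reduction algorithm. It reduces
dimensionality while preserving, as far as possible, the correspondence between geodesic distances in the
high-dimensional data and spatial distances in the low-dimensional data. However, ISOMAP is easily affected by
noise, so the reduced data may fail to preserve the high-dimensional topological structure. To address this
problem, an optimal density direction based isometric mapping (ODD–ISOMAP) algorithm is proposed. The algorithm
determines, for each data point, the optimal density direction along the manifold by screening the point's natural
neighbors. It then scales the local neighborhood distances according to the angle, direction, and length of the
projection, onto the optimal density direction, of the vectors formed with each neighboring point. This guides the
geodesic distances to be computed along the manifold and thereby reduces the algorithm's sensitivity to noise. To
verify the algorithm's effectiveness, 2 types of synthetic data and 5 types of measured data were selected as test
data sets. ISOMAP, LLE, HLLE, LTSA, LEIGS, PCA, and ODD–ISOMAP were each used to reduce the dimensionality of the
data sets, and K-medoids clustering was performed on the reduced data. The dimensionality reduction performance of
each algorithm was evaluated by comparing the clustering accuracy and the degree to which noise of different
amplitudes affects that accuracy. The results show that ODD–ISOMAP significantly improves dimensionality reduction
performance over the other 6 common algorithms and is more resistant to noise interference. |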
English abstract |
As a typical nonlinear dimensionality reduction algorithm, ISOMAP reduces dimensionality while preserving
the correspondence between the high-dimensional geodesic distance and the low-dimensional spatial
distance. However, the classical ISOMAP algorithm is heavily affected by noise, which makes it difficult to maintain the
high-dimensional topology in the low-dimensional space. To address this problem, an optimal density direction based
ISOMAP (ODD–ISOMAP) algorithm is proposed. First, the algorithm determines the optimal density direction of each data
point along the manifold direction by screening its natural neighbors, and forms a vector from the data point to each of
its neighbors. The local neighborhood distance is then scaled according to the angle, direction and length of the
projection of each vector onto the optimal density direction.
In this way, the geodesic distance is guided to be calculated along the manifold direction, which reduces the sensitivity
to noise. To verify the effectiveness of the algorithm, 2 types of artificial data sets and 5 types of measured data
sets are selected as test cases. The ISOMAP, LLE, HLLE, LTSA, LEIGS, PCA, and ODD–ISOMAP algorithms are each applied
to the data sets, and K-medoids clustering is performed on the dimension-reduced data. The dimensionality
reduction effect of each algorithm is evaluated by comparing the clustering accuracy and the degree to which
noise of different amplitudes influences that accuracy. Experimental results show that the ODD–ISOMAP algorithm
significantly improves the dimensionality reduction effect compared with the other 6 common algorithms, and that it is more resistant to noise. |
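For orientation, the classical ISOMAP baseline that the abstract builds on can be sketched as follows. This is a minimal, dependency-free illustration, not the authors' implementation: the function name `isomap`, its parameters, the use of a fixed k-nearest-neighbor graph, and the power-iteration eigen-solver are all assumptions made for this sketch. The ODD–ISOMAP rescaling of local neighborhood distances is not implemented, because the abstract does not give its formula; a comment only marks where such a step would apply.

```python
import math


def isomap(points, n_neighbors=2, n_components=1, n_iter=500):
    """Classical ISOMAP sketch: kNN graph -> geodesic distances -> classical MDS."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # 1. Symmetric k-nearest-neighbor graph weighted by Euclidean distance.
    #    (Per the abstract, ODD-ISOMAP would rescale these local edge lengths;
    #    that step is NOT implemented in this sketch.)
    inf = float("inf")
    geo = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        for j in others[:n_neighbors]:
            geo[i][j] = geo[j][i] = dist[i][j]

    # 2. Geodesic distances approximated by graph shortest paths (Floyd-Warshall).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if geo[i][k] + geo[k][j] < geo[i][j]:
                    geo[i][j] = geo[i][k] + geo[k][j]
    if any(inf in row for row in geo):
        raise ValueError("neighborhood graph is disconnected; increase n_neighbors")

    # 3. Classical MDS: double-center the squared geodesic distances.
    sq = [[d * d for d in row] for row in geo]
    row_mean = [sum(r) / n for r in sq]
    total_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]

    # 4. Leading eigenpairs of B by power iteration with deflation.
    embedding = [[0.0] * n_components for _ in range(n)]
    for c in range(n_components):
        v = [math.sin(i + 1.0 + c) for i in range(n)]  # deterministic start vector
        for _ in range(n_iter):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for the unit vector v.
        lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            embedding[i][c] = v[i] * scale
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]  # deflate the component just found
    return embedding


# Usage: 11 points on a unit semicircle, unrolled to one dimension.
arc = [(math.cos(k * math.pi / 10), math.sin(k * math.pi / 10)) for k in range(11)]
coords = [row[0] for row in isomap(arc, n_neighbors=2, n_components=1)]
```

In this toy example the one-dimensional coordinates should follow the order of the points along the arc, and their total span should be close to the arc length. A noisy point lying off the arc could add a shortcut edge to the neighborhood graph and distort the shortest paths, which is the sensitivity the abstract says ODD–ISOMAP targets.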
|
|
|
|
|