Cite this article: LU Si-jie, FAN Di, JIAN Ling, GAO Chuan-hou. Interpretable small sphere and large margin support vector machine with integrated data mining knowledge [J]. Control Theory & Applications, 2024, 41(3): 375-384.
Interpretable small sphere and large margin support vector machine with integrated data mining knowledge
Received: 2022-09-22 Revised: 2024-02-27
DOI: 10.7641/CTA.2023.20832
Issue: 2024, 41(3): 375-384
Keywords: black box model; interpretability; small sphere and large margin support vector machine; prior knowledge; unbalanced data
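Funding: Supported by the National Natural Science Foundation of China (12320101001, 12071428, 62111530247) and the Key Program of the Natural Science Foundation of Zhejiang Province (LZ20A010002).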
|
Abstract
The small sphere and large margin support vector machine (SSLM) is a typical black box model: it does not
need to examine the internal structure and mechanism of the object under study, and uses only the object's input
and output data to learn its function and mechanism of action. The SSLM therefore offers fast response and strong
real-time performance, but for the same reason it lacks interpretability and transparency. To address this, this
paper studies how to add prior knowledge at the input end of the SSLM black box model to enhance its
interpretability. We develop a data-based algorithm for mining nonlinear circular knowledge and an algorithm for
discretizing that knowledge; the discretized data points include not only the original data points that generated
the knowledge but also new data points. By integrating the mined circular knowledge into the SSLM in the form of
inequality constraints, we construct an interpretable SSLM model (i-SSLM). During training, the model must classify
the knowledge-constrained data points correctly, so its results are predictable to a certain degree, which indicates
that the model is interpretable. At the same time, because discretizing the knowledge adds new data information,
the model can achieve higher accuracy. The effectiveness of the i-SSLM model is verified on 10 public datasets
and 2 real blast furnace datasets.
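
The abstract does not spell out the discretization algorithm, so the following is only an illustrative sketch of the idea, not the authors' method. It assumes a piece of circular knowledge can be written as a 2-D circle (center, radius) carrying one class label, and that new points are drawn uniformly inside it. The names `discretize_circle` and `satisfies_knowledge` are hypothetical. In the paper the knowledge enters the SSLM optimization as inequality constraints; here the requirement is only checked after the fact against a toy classifier.

```python
import math
import random


def discretize_circle(center, radius, points, n_new=8, seed=0):
    """Turn a circular knowledge region into a finite set of points.

    Returns the original points that lie inside the circle, followed by
    n_new freshly sampled points (the uniform sampling is an assumption).
    """
    rng = random.Random(seed)
    cx, cy = center
    # Keep the original data points that fall inside the circle.
    inside = [(x, y) for (x, y) in points
              if math.hypot(x - cx, y - cy) <= radius]
    new_points = []
    for _ in range(n_new):
        # sqrt on the radial draw gives a uniform density over the disc.
        r = radius * math.sqrt(rng.random())
        theta = 2.0 * math.pi * rng.random()
        new_points.append((cx + r * math.cos(theta),
                           cy + r * math.sin(theta)))
    return inside + new_points


def satisfies_knowledge(classify, knowledge_points, label):
    """True if every discretized knowledge point receives the known label."""
    return all(classify(p) == label for p in knowledge_points)


# Toy usage: two of the three original points lie inside the circle.
original = [(1.0, 1.0), (1.2, 0.9), (3.0, 3.0)]
knowledge = discretize_circle((1.0, 1.0), 0.5, original, n_new=5)


def toy_classifier(p):
    # Hypothetical stand-in for a trained model: +1 near (1, 1), else -1.
    return 1 if math.hypot(p[0] - 1.0, p[1] - 1.0) <= 1.0 else -1


knowledge_ok = satisfies_knowledge(toy_classifier, knowledge, 1)
```

Because every discretized point must receive the known label, the model's output inside the knowledge region is known before training finishes, which is the sense in which the abstract calls the model interpretable.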
|
|
|
|
|