Cite this article: | WANG Zi-qiang, FENG Bo-qin. Mining frequent pattern tree in Web data[J]. Control Theory and Technology, 2005, 22(3): 429-433. |
|
Mining frequent pattern tree in Web data |
Received: 2003-09-26 Revised: 2004-06-07 |
DOI 10.7641/j.issn.1000-8152.2005.3.017 |
2005,22(3):429-433 |
Keywords data mining; Web data; frequent pattern tree; ordered tree |
Funding Supported by the National High Technology Research and Development Program of China ("863" Program) (2003AA1Z2610). |
|
Abstract |
To mine frequent pattern trees efficiently from semi-structured Web data, the data are modeled as a labeled ordered tree, and an algorithm is proposed that finds all frequent pattern trees in such a tree using the rightmost path expansion technique. The algorithm starts from pattern trees with a single node and generates new pattern trees only by attaching a new node to the rightmost path. It also maintains a list of the occurrences of the rightmost leaf, so that support is computed incrementally. Theoretical analysis and experimental results show that the algorithm is feasible and that its running time scales linearly in the total size of the maximal frequent patterns. |
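
The rightmost path expansion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `DataTree` and `mine_frequent_trees` are hypothetical, patterns are assumed to match as induced ordered subtrees, and support is assumed to be the number of distinct data nodes at which the pattern root can occur. Only the occurrence list of the rightmost leaf is kept for each pattern.

```python
from collections import defaultdict


class DataTree:
    """Labeled ordered tree; node ids are preorder indices (hypothetical helper)."""

    def __init__(self, nested):
        # nested is (label, [child, child, ...]) with children in sibling order
        self.label, self.parent, self.children = [], [], []
        self._add(nested, -1)

    def _add(self, node, parent):
        label, kids = node
        idx = len(self.label)
        self.label.append(label)
        self.parent.append(parent)
        self.children.append([])
        if parent >= 0:
            self.children[parent].append(idx)
        for kid in kids:
            self._add(kid, idx)

    def ancestor(self, x, steps):
        for _ in range(steps):
            x = self.parent[x]
        return x


def mine_frequent_trees(tree, min_support):
    """Return {pattern: support}; a pattern is a tuple of (depth, label) in preorder."""
    result = {}

    def support(pattern, occ):
        # Assumption: support = number of distinct root occurrences, recovered
        # by walking up from each rightmost-leaf occurrence.
        leaf_depth = pattern[-1][0]
        return len({tree.ancestor(x, leaf_depth) for x in occ})

    def expand(pattern, occ):
        result[pattern] = support(pattern, occ)
        leaf_depth = pattern[-1][0]
        candidates = defaultdict(set)
        for x in occ:
            # Case 1: the new node becomes a child of the rightmost leaf.
            for y in tree.children[x]:
                candidates[(leaf_depth + 1, tree.label[y])].add(y)
            # Case 2: the new node becomes a younger sibling of a node on the
            # rightmost path (preorder ids grow from left to right).
            node = x
            for depth in range(leaf_depth, 0, -1):
                par = tree.parent[node]
                for y in tree.children[par]:
                    if y > node:
                        candidates[(depth, tree.label[y])].add(y)
                node = par
        for ext, new_occ in candidates.items():
            new_pattern = pattern + (ext,)
            new_occ = sorted(new_occ)
            if support(new_pattern, new_occ) >= min_support:
                expand(new_pattern, new_occ)

    # Start from single-node patterns, one per label.
    singles = defaultdict(list)
    for x, lab in enumerate(tree.label):
        singles[lab].append(x)
    for lab, occ in singles.items():
        if len(occ) >= min_support:
            expand(((0, lab),), occ)
    return result


# Toy document tree: R(A(B, C), A(B, C))
doc = DataTree(("R", [("A", [("B", []), ("C", [])]),
                      ("A", [("B", []), ("C", [])])]))
patterns = mine_frequent_trees(doc, min_support=2)
```

On this toy tree with `min_support=2`, the sketch reports six patterns, the largest being `A(B, C)`, encoded as `((0, "A"), (1, "B"), (1, "C"))`. Because every pattern tree is reached from exactly one parent pattern (the one obtained by removing its rightmost leaf), no pattern is generated twice.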
|
|
|
|
|