Cite this article: WANG Bo, GUO Ji-chang, ZHANG Yan. Learnable receptive fields scheme in deep networks for image categorization[J]. Control Theory & Applications, 2015, 32(8): 1114-1119.
Learnable receptive fields scheme in deep networks for image categorization
Received: 2015-01-22  Revised: 2015-09-04
DOI  10.7641/CTA.2015.50063
  2015,32(8):1114-1119
Keywords  image categorization  hierarchical architecture  deep networks  receptive fields
Funding  Supported by the Specialized Research Fund for the Doctoral Program of Higher Education (20120032110034).
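Author  Affiliation  E-mail
WANG Bo*  School of Electronic Information Engineering, Tianjin University  neuwb@tju.edu.cn
GUO Ji-chang*  School of Electronic Information Engineering, Tianjin University
ZHANG Yan  School of Electronic Information Engineering, Tianjin University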
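Abstract (translated from Chinese)
      As a fundamental task for image retrieval, image organization and robotic vision, image categorization has received wide attention in computer vision and machine learning. A variety of deep-learning-based models for object recognition and image categorization have likewise attracted great interest in the field. This paper proposes an algorithm that replaces scale-invariant feature transform (SIFT) and histogram of oriented gradients (HOG) descriptors: it uses a deep hierarchical architecture to learn effective image representations layer by layer, learning features directly from raw pixels. The method uses K-singular value decomposition (K-SVD) for dictionary training and orthogonal matching pursuit (OMP) for encoding. In addition, the paper adopts a scheme that learns the classifier and the receptive fields used for pooling simultaneously. Experimental results show that the algorithm achieves better categorization performance on object (Oxford flowers) and event (UIUC-sports) image categorization benchmarks.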
English abstract
      Visual categorization has attracted increasing interest in computer vision and machine learning because it is a fundamental task for image retrieval, image organization and robotic vision. Over the past decade, various deep-learning-based models have been proposed and broadly applied to visual recognition and categorization. In this paper, the proposed approach learns features from scratch rather than employing hand-crafted descriptors such as SIFT and HOG. A deep hierarchical architecture for learning effective image representations is built up layer by layer. Specifically, K-SVD and OMP are used for the dictionary training and encoding phases, respectively, owing to their simplicity and efficiency. In addition, sum, average and max operators are the three commonly used pooling strategies in modern categorization models. Instead of a traditional pooling pattern, we apply an improved scheme that learns the receptive fields for pooling together with the classifier. We provide a detailed analysis of deep networks on event and object categorization tasks, and compare our method with several state-of-the-art algorithms, including kernel-based feature learning and saliency-weighted hierarchical sparse coding. Experimental results show that our algorithm achieves better categorization performance on the UIUC-sports and Oxford flowers datasets.
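The encoding and pooling steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `omp_encode` and `pool`, the random dictionary standing in for a K-SVD-trained one, the sparsity level, and the choice to pool absolute code values are all assumptions made for illustration. The sketch covers only greedy OMP encoding against a fixed dictionary and the three conventional pooling operators (sum, average, max); the learned receptive fields proposed in the paper are not reproduced here.

```python
import numpy as np


def omp_encode(D, x, n_nonzero):
    """Greedy orthogonal matching pursuit (illustrative sketch).

    D: (dim, n_atoms) dictionary, columns assumed to have unit norm.
    x: (dim,) signal, e.g. a flattened image patch.
    Returns a code of length n_atoms with at most n_nonzero non-zeros.
    """
    x = np.asarray(x, dtype=float)
    residual = x.copy()
    support = []
    coef = np.zeros(0)
    code = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        if np.linalg.norm(residual) < 1e-10:
            break  # signal already fully explained
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break  # no new atom improves the fit
        support.append(k)
        # Re-fit all selected coefficients jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    if support:
        code[support] = coef
    return code


def pool(codes, mode="max"):
    """Pool an (n_patches, n_atoms) array of codes over one region.

    Assumption: pooling acts on absolute code values.
    """
    a = np.abs(np.asarray(codes, dtype=float))
    if mode == "sum":
        return a.sum(axis=0)
    if mode == "average":
        return a.mean(axis=0)
    if mode == "max":
        return a.max(axis=0)
    raise ValueError(f"unknown pooling mode: {mode}")


# Toy usage: a random unit-norm dictionary stands in for one trained by K-SVD.
rng = np.random.default_rng(0)
D_demo = rng.standard_normal((16, 32))
D_demo /= np.linalg.norm(D_demo, axis=0)
patches = rng.standard_normal((10, 16))  # 10 hypothetical flattened patches
codes_demo = np.stack([omp_encode(D_demo, p, n_nonzero=3) for p in patches])
feature = pool(codes_demo, mode="max")  # one 32-dimensional region feature
```

In this sketch the pooling region is simply the whole set of patches. A receptive-field scheme would instead decide which patches contribute to each pooled feature, which is the part the paper learns jointly with the classifier.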