Cite this article: 邹月娴, 刘诗涵, 王迪松. 多重约束非负矩阵分解的非平稳噪声语音增强[J]. 控制理论与应用, 2017, 34(6): 761~768.
ZOU Yue-xian, LIU Shi-han, WANG Di-song. Enhancing speech corrupted by nonstationary noise using nonnegative matrix factorization with multiple constraints[J]. Control Theory and Technology, 2017, 34(6): 761~768.
多重约束非负矩阵分解的非平稳噪声语音增强
Enhancing speech corrupted by nonstationary noise using nonnegative matrix factorization with multiple constraints
Received: 2016-08-11  Revised: 2017-03-24
DOI  10.7641/CTA.2017.60600
  2017,34(6):761-768
Chinese keywords  语音增强  低秩约束  稀疏约束  非负矩阵分解  非稳态噪声
English keywords  speech enhancement  low-rank  sparsity  nonnegative matrix factorization  nonstationary noise
Funding  National Natural Science Foundation of China; other
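Author  Affiliation  E-mail
ZOU Yue-xian (邹月娴)*  Modern Signal and Data Processing Laboratory, School of Information Engineering, Peking University  cynthiazou@qq.com
LIU Shi-han (刘诗涵)  Modern Signal and Data Processing Laboratory, School of Information Engineering, Peking University
WANG Di-song (王迪松)  Modern Signal and Data Processing Laboratory, School of Information Engineering, Peking University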
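Chinese abstract (translated)
      Speech enhancement in low signal-to-noise ratio (SNR), nonstationary noise environments remains an open and challenging task. To improve the performance of conventional speech enhancement algorithms based on nonnegative matrix factorization (NMF), and taking into account both the time-frequency sparsity of speech signals and the low-rank property of nonstationary noise signals, this paper proposes a multi-constraint NMF speech enhancement algorithm (MC–NMFSE). In the training stage, a clean-speech training set and a noise training set are used to build the speech dictionary and the noise dictionary, respectively. In the enhancement stage, a sparsity constraint on the speech component and a low-rank constraint on the noise signal are added to the NMF objective function, so that MC–NMFSE obtains a better representation of the speech component from the noisy speech and thereby improves the enhancement result. Experiments show that, across a wide range of nonstationary noise types and SNRs, MC–NMFSE achieves lower speech distortion and better suppression of nonstationary noise than the conventional NMF-based speech enhancement method.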
English abstract
      Enhancing speech corrupted by nonstationary noise under low signal-to-noise ratio (SNR) conditions remains an open and challenging task. To improve traditional speech enhancement based on nonnegative matrix factorization (NMF), a method termed multi-constraint NMF speech enhancement (MC–NMFSE) is developed, which jointly exploits the sparsity of speech in the time-frequency domain and the low-rank property of nonstationary noise. In the training stage, the speech and noise dictionaries are constructed from speech and noise training sets, respectively. In the speech enhancement stage, a multi-constraint NMF is adopted in which the data matrix is factorized into two nonnegative sub-matrices under sparsity and low-rank constraints, so that the speech components are well represented even when corrupted by nonstationary noise. Extensive experiments under different nonstationary noise conditions and different SNRs were carried out to compare MC–NMFSE with the traditional NMF speech enhancement method (NMF–SpEnM). The experimental results demonstrate that MC–NMFSE yields lower speech distortion and suppresses nonstationary noise more effectively.
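The two-stage pipeline described in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' MC–NMFSE: the abstract does not give the objective function, so the code below assumes a Euclidean NMF cost, implements only an L1 sparsity penalty on the speech activations, and omits the low-rank noise constraint entirely. The function names (`train_dictionary`, `enhance`), the `sparsity` weight, and the soft-mask reconstruction are all hypothetical choices made for illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero in the multiplicative updates


def train_dictionary(V, rank, n_iter=200, seed=0):
    """Training stage (sketch): learn a nonnegative dictionary W with V ~= W @ H.

    V is a nonnegative magnitude spectrogram (frequency bins x frames).
    Uses standard multiplicative updates for the Euclidean cost (an assumption).
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    # Normalize each atom to unit L1 norm so atoms are comparable in scale.
    W /= W.sum(axis=0, keepdims=True) + EPS
    return W


def enhance(V_noisy, W_speech, W_noise, sparsity=0.1, n_iter=200, seed=0):
    """Enhancement stage (sketch): fix both dictionaries, estimate activations.

    Minimizes 0.5 * ||V - W H||_F^2 + sparsity * ||H_speech||_1 over H >= 0.
    The low-rank noise constraint from the paper is NOT implemented here.
    """
    W = np.hstack([W_speech, W_noise])
    k_s = W_speech.shape[1]
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V_noisy.shape[1])) + EPS
    # The L1 penalty applies to the speech rows of H only.
    penalty = np.zeros((W.shape[1], 1))
    penalty[:k_s] = sparsity
    for _ in range(n_iter):
        H *= (W.T @ V_noisy) / (W.T @ W @ H + penalty + EPS)
    S = W_speech @ H[:k_s]   # estimated speech magnitude
    N = W_noise @ H[k_s:]    # estimated noise magnitude
    mask = S / (S + N + EPS)  # soft mask with values in [0, 1)
    return mask * V_noisy
```

In use, `train_dictionary` would be called once on a clean-speech spectrogram and once on a noise spectrogram, and `enhance` would then be applied to the magnitude spectrogram of the noisy signal; the enhanced magnitude is typically combined with the noisy phase before inverting the transform.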