Cite this article: 刘建伟, 高悦. 单词嵌入表示学习综述[J]. 控制理论与应用, 2022, 39(7): 1171-1193.
LIU Jian-wei, GAO Yue. Survey of word embedding[J]. Control Theory and Technology, 2022, 39(7): 1171-1193.
|
单词嵌入表示学习综述 |
Survey of word embedding |
Received: 2021-07-27  Revised: 2021-12-23
DOI: 10.7641/CTA.2022.10678
2022, 39(7): 1171-1193
Keywords: word embedding; neural network; language model; cross-lingual; BERT (bidirectional encoder representations from transformers); information bottleneck
Funding: Supported by the Science Foundation of China University of Petroleum, Beijing (2462020YXZZ023)
|
Chinese abstract
Word embedding learning is one of the most basic yet important research topics in natural language processing (NLP), and it underlies all subsequent higher-level language processing tasks. Early one-hot representations of words ignored their semantic information and frequently ran into data sparsity in applications; with the advent of neural language models (NLM), words came to be represented as low-dimensional real-valued vectors, which effectively resolved the sparsity problem. Word-level embeddings were the original input representation of neural-network language models, and many variants have since been proposed from different perspectives. From the perspective of the number of languages a model involves, this survey divides word embedding models into two broad classes: monolingual and cross-lingual. Within the monolingual class, models are further divided by input granularity into character-level, word-level, and phrase-level-and-above embedding models; models at different granularities suit different application scenarios, each with its own strengths. These models are then classified again by whether they take context information into account. Word embedding is also frequently combined with models from other settings, introducing other modalities or correlated information to help learn the embeddings and improve model performance, so this survey also lists some joint applications of word embedding models with models from other fields. Through a study of the above models, the characteristics of each model are summarized and compared, and the survey concludes with future research directions and prospects for word embedding.
English abstract
Word embedding is the most basic yet very important research direction in natural language processing (NLP); it is the basis of all advanced language processing tasks, which rely on word vectors to complete a wide range of NLP problems. Early on, the one-hot representation ignored the semantic information of words and often led to data sparsity in applications. Later, with the development of neural language models (NLM), words were represented as dense, low-dimensional vectors, which effectively solved the problems of data sparsity and high dimensionality. The original input of models based on neural-network language models was word-level word embeddings, but a variety of models have since been proposed from different directions. In this survey, from the point of view of the number of languages used by a model, we divide word embedding models into single-language (monolingual) and cross-language (cross-lingual) word embeddings. For single-language models, according to the granularity of the model input, we further divide them into character-level, word-level, and phrase-level-and-above word embedding models. The application scenarios of models at different granularity levels differ, and each has its own strengths. These models are further classified according to whether context information is considered. At the same time, word embedding is often combined with models from other domains, which can help learn word embeddings by introducing other modalities or correlated information to improve model performance. Therefore, this survey also lists some joint applications of word embedding models and models from other domains. Through the study of the above models, the characteristics of each model are summarized and compared. Finally, future research directions and prospects for word embedding are given.
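The contrast the abstract draws between one-hot and dense representations can be made concrete: under a one-hot encoding every pair of distinct words is equally dissimilar (cosine similarity 0), while dense embeddings can place semantically related words close together. A minimal NumPy sketch follows; the dense vectors are invented purely for illustration and are not taken from any model discussed in the survey.

```python
import numpy as np

# Toy vocabulary: one-hot vectors are as long as the vocabulary and
# carry no semantic information about the words they index.
vocab = ["king", "queen", "apple"]
one_hot = np.eye(len(vocab))  # shape (3, 3), one row per word

# Hypothetical dense embeddings (values invented for illustration):
# related words get nearby vectors, unrelated words do not.
dense = {
    "king":  np.array([0.90, 0.80]),
    "queen": np.array([0.85, 0.82]),
    "apple": np.array([-0.70, 0.10]),
}

def cos(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# One-hot: every pair of distinct words has cosine similarity 0.
print(cos(one_hot[0], one_hot[1]))  # 0.0 for king/queen

# Dense: king is closer to queen than to apple.
print(cos(dense["king"], dense["queen"]) > cos(dense["king"], dense["apple"]))  # True
```

Note also that the one-hot rows grow with vocabulary size (hundreds of thousands of dimensions in practice), while the dense vectors stay at a fixed, small dimensionality, which is the sparsity and dimensionality point the abstract makes.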