Journal of Information Security Research ›› 2020, Vol. 6 ›› Issue (3): 220-227.

• Academic Paper •

Text Sentiment Analysis Based on BERT

刘思琴,冯胥睿瑞   

  1. School of Cyberspace Security, Sichuan University
  • Received: 2020-03-02  Online: 2020-03-10  Published: 2020-03-02
  • Corresponding author: 刘思琴
  • About the authors: 刘思琴, master's student; research interests include data analysis and network security. 1663919767@qq.com. 冯胥睿瑞, master's student; research interests include network data analysis and information security. 893924206@qq.com



Abstract: Most existing sentiment classification models use Word2Vec, GloVe (global vectors), or similar methods to obtain word vector representations of text, ignoring the contextual relationships between words. To address this problem, a neural network model for text sentiment analysis is proposed that combines the BERT (bidirectional encoder representations from transformers) pre-trained language model with a bidirectional long short-term memory network (BLSTM) and an attention mechanism. First, word vectors containing contextual semantic information are obtained from the BERT pre-trained model. Next, the BLSTM extracts context-dependent features from these vectors. Finally, an attention mechanism assigns weights to the extracted features, highlighting the key information for sentiment classification. The model achieves an accuracy of 88.91% on the SST (Stanford Sentiment Treebank) dataset, an improvement over the compared methods.
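The pipeline described in the abstract (BERT word vectors → BLSTM hidden states → attention-weighted pooling → softmax classifier) can be sketched as follows. This is a minimal NumPy illustration, not the paper's trained model: the BLSTM output `H` is a stand-in random matrix, and the attention and classifier weights (`W`, `v`, `Wc`) are random placeholders, so only the shapes and the attention-pooling computation are meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Stand-in for the BLSTM output over a sentence of T tokens:
# each time step concatenates forward and backward states (size 2*d).
T, d = 8, 64
H = rng.standard_normal((T, 2 * d))            # hidden states h_1..h_T

# Attention: score each time step, normalize the scores, then
# pool the hidden states with the resulting weights.
W = rng.standard_normal((2 * d, 2 * d)) * 0.1  # attention projection (placeholder)
v = rng.standard_normal(2 * d) * 0.1           # attention query vector (placeholder)

u = np.tanh(H @ W)        # (T, 2d) per-step representation
alpha = softmax(u @ v)    # (T,) attention weights, sum to 1
s = alpha @ H             # (2d,) sentence vector: weighted sum of states

# Final classifier: linear layer + softmax over {negative, positive}.
Wc = rng.standard_normal((2 * d, 2)) * 0.1     # classifier weights (placeholder)
p = softmax(s @ Wc)       # (2,) class probabilities
```

In the trained model, `H` would come from a BLSTM run over BERT's contextual word vectors, and `alpha` would concentrate on sentiment-bearing tokens rather than being near-uniform as it is with random weights.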

Key words: text sentiment analysis, BERT, BLSTM, attention mechanism, word vector