Journal of Information Security Research (信息安全研究) ›› 2017, Vol. 3 ›› Issue (9): 781-794.

• Academic Paper •

Analysis on Social Media Text Orientation Oriented on Public Opinion (面向舆情的社交媒体文本倾向性分析)

Yan Zhu (朱岩)

  1. University of Science and Technology Beijing
  • Received: 2017-09-07 Online: 2017-09-15 Published: 2017-09-06
  • Contact: Yan Zhu

Abstract: Social media in different domains draw different participants into public-opinion discussion, and the way those participants express emotion changes dynamically. To address text sentiment-orientation analysis under these conditions, this paper starts from a Static Emotional Dictionary (SED) and gives a new method for constructing a Dynamic Domain Sentiment Dictionary (DDSD) for each corpus. The method relies on the significant differences in each feature word's mutual information across the Pos/Neg emotional categories in a given domain, together with an experimental analysis of the selection threshold. Next, based on likelihood estimates of the statistical distributions of the Pos/Neg training corpora, we propose a Feature Selection Algorithm with Evolutionary Structural Optimization (FSA-ESO), which progressively selects optimal features from high-dimensional candidates such as part of speech, polar words, and statistical characteristics of terms. With this algorithm, the optimal feature combination for a given domain can be determined experimentally. Finally, to validate the approach, we compare common classification algorithms on corpora from several domains. The results indicate that the proposed method applies to multiple types of corpora and achieves good classification performance on each.

Key words: social media, emotional orientation, dynamic sentiment dictionary, feature selection, feature analysis
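
The dictionary-construction idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact mutual-information formula, the threshold value, or the rule for merging with the static dictionary, so everything below is an assumption made for illustration: the score is the difference of pointwise mutual information between a word and the Pos and Neg classes (which reduces to a log ratio of smoothed document frequencies), the function name `build_dynamic_dictionary` and the default `threshold=1.0` are hypothetical, and static entries are assumed to take precedence over learned ones.

```python
import math
from collections import Counter


def build_dynamic_dictionary(pos_docs, neg_docs, static_dict, threshold=1.0):
    """Extend a static sentiment dictionary with domain words.

    pos_docs / neg_docs: lists of tokenized documents (lists of strings).
    static_dict: mapping word -> +1 (positive) or -1 (negative).
    threshold: assumed cut-off on |PMI(w, Pos) - PMI(w, Neg)|.
    """
    # Document frequency of each word within each class.
    pos_df = Counter(w for doc in pos_docs for w in set(doc))
    neg_df = Counter(w for doc in neg_docs for w in set(doc))
    n_pos, n_neg = len(pos_docs), len(neg_docs)

    dynamic = dict(static_dict)  # assumption: static entries are kept as-is
    for word in set(pos_df) | set(neg_df):
        # Add-one smoothed estimates of P(word | class).
        p_w_pos = (pos_df[word] + 1) / (n_pos + 2)
        p_w_neg = (neg_df[word] + 1) / (n_neg + 2)
        # PMI(w, Pos) - PMI(w, Neg) = log(P(w|Pos) / P(w|Neg)); P(w) cancels.
        score = math.log(p_w_pos / p_w_neg)
        if abs(score) >= threshold:
            dynamic.setdefault(word, 1 if score > 0 else -1)
    return dynamic


# Toy corpus: "great" and "terrible" are class-specific, the nouns are not.
pos_docs = [["great", "phone"], ["great", "battery"], ["great", "screen"]]
neg_docs = [["terrible", "phone"], ["terrible", "battery"], ["terrible", "screen"]]
lexicon = build_dynamic_dictionary(pos_docs, neg_docs, {"good": 1})
```

On this toy corpus only the two class-specific words pass the threshold, while words that occur equally often in both classes score zero and are left out. In practice the threshold would be tuned per domain, which is the role the abstract assigns to its threshold experiments.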