Journal of Information Security Research ›› 2023, Vol. 9 ›› Issue (9): 868-.

• Academic Paper •

基于ELMoTextCNN的网络欺凌检测模型

Ye Shuihuan1, Ge Yinhui1, Chen Bo1, Yu Ling2

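  1. School of Computer and Electronic Information / School of Artificial Intelligence, Nanjing Normal University, Nanjing 210023
  2. Jiangsu Key Laboratory for Numerical Simulation of Large Scale Complex Systems (Nanjing Normal University), Nanjing 210023
  • Online: 2023-09-17 Published: 2023-10-04
  • Corresponding author: Chen Bo, Ph.D., professor. Main research interests: information security and smart education. bchen@njnu.edu.cn
  • About the authors: Ye Shuihuan, master's student. Main research interest: artificial intelligence security. shuihuanye@qq.com; Ge Yinhui, master's student. Main research interest: artificial intelligence security. yinhuigyh@qq.com; Chen Bo, Ph.D., professor. Main research interests: information security and smart education. bchen@njnu.edu.cn; Yu Ling, Ph.D., associate professor. Main research interest: information security. lyu@njnu.edu.cn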

Abstract: Cyberbullying detection is an important research topic in cyberspace information content security and bears directly on the online safety of young people. Existing cyberbullying detection schemes suffer from scarce training samples, difficulty handling polysemous words, and unsatisfactory classification performance. To address these problems, an ELMoTextCNN detection model is proposed. Following the idea of transfer learning, the model first uses pretrained ELMo (embeddings from language models) to generate dynamic word vectors. This addresses the small size of cyberbullying sample sets, and because ELMo is built on a bidirectional long short-term memory (BiLSTM) network, it infers each word's vector from its context and can therefore interpret polysemous words according to context. The model then extracts text features with TextCNN (text convolutional neural network), which is well suited to short text, and finally outputs the classification result through a fully connected layer. Experimental results show that the proposed ELMoTextCNN detection method can handle polysemy and achieves better classification performance.

Keywords: cyberbullying detection; deep learning; transfer learning; ELMo model; TextCNN model
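
The abstract describes a pipeline of contextual word vectors, TextCNN feature extraction, and a fully connected output layer. The following is a minimal sketch of that forward pass in plain Python, not the authors' implementation. The embedding size, kernel widths, filter count, and random weights are all assumptions chosen for illustration, and the random `tokens` matrix merely stands in for the vectors a pretrained ELMo model would produce.

```python
import math
import random


def conv1d_relu(seq, kernel, bias):
    """Slide one filter of width len(kernel) along the token axis, then apply ReLU."""
    width = len(kernel)
    out = []
    for start in range(len(seq) - width + 1):
        total = bias
        for k in range(width):
            total += sum(w * x for w, x in zip(kernel[k], seq[start + k]))
        out.append(max(0.0, total))
    return out


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def textcnn_forward(seq, filters, fc_weights, fc_bias):
    """seq: list of token vectors (stand-in for ELMo output).
    filters: list of (kernel, bias) pairs; each kernel is width x emb_dim.
    The sequence must be at least as long as the widest kernel (pad otherwise).
    Returns class probabilities."""
    # Max-over-time pooling keeps the strongest response of each filter.
    pooled = [max(conv1d_relu(seq, kernel, bias)) for kernel, bias in filters]
    # Fully connected layer followed by softmax.
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(fc_weights, fc_bias)]
    return softmax(logits)


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
EMB_DIM = 8                # toy size (assumption), far smaller than real ELMo vectors
KERNEL_WIDTHS = (2, 3, 4)  # assumed filter widths
FILTERS_PER_WIDTH = 2      # assumed number of filters per width
NUM_CLASSES = 2            # bullying vs. non-bullying

tokens = random_matrix(rng, 6, EMB_DIM)  # six "words" with placeholder embeddings
filters = [(random_matrix(rng, w, EMB_DIM), 0.0)
           for w in KERNEL_WIDTHS for _ in range(FILTERS_PER_WIDTH)]
fc_weights = random_matrix(rng, NUM_CLASSES, len(filters))
fc_bias = [0.0] * NUM_CLASSES

probs = textcnn_forward(tokens, filters, fc_weights, fc_bias)
print(probs)
```

With untrained random weights the printed probabilities carry no meaning; the sketch only shows how the data flows. In practice the filters and the output layer would be trained on labeled samples in a deep learning framework, with the embeddings supplied by the pretrained language model.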
