Research on a Data-Enhanced Multimodal False Information Detection Framework

Abstract

With the development of multimedia technology, spreaders of rumors tend to create false information with multimodal content to attract the attention of news readers. However, it is challenging to extract features from a small amount of annotated multimodal data and to effectively fuse the implicit clues in that data into a vector representation of false information. To address this problem, we propose a data-enhanced multimodal false information detection framework (DEMF). DEMF leverages the strengths of pre-trained models together with data augmentation techniques to reduce its reliance on annotated data, and it uses multi-level modal feature extraction and fusion to capture both fine-grained element-level relationships and coarse-grained modality-level relationships, so as to fully extract multimodal clues. Experiments on real-world datasets show that DEMF significantly outperforms state-of-the-art baseline models.

Key words: false information detection, multimodal, deep learning, data augmentation, pre-training
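The two fusion levels named in the abstract can be illustrated with a toy sketch. This is not the DEMF implementation, whose details the abstract does not give. It assumes scaled dot-product cross-attention as a stand-in for the fine-grained element-level step and mean pooling with concatenation as a stand-in for the coarse-grained modality-level step. All function names and the toy feature vectors are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def mean_pool(rows):
    """Average a list of equal-length vectors into one vector."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def element_level_fusion(text_tokens, image_regions):
    """Assumed fine-grained step: each text token attends over all image regions."""
    d = len(text_tokens[0])
    fused = []
    for tok in text_tokens:
        weights = softmax([dot(tok, reg) / math.sqrt(d) for reg in image_regions])
        attended = [sum(w * reg[i] for w, reg in zip(weights, image_regions))
                    for i in range(d)]
        # Residual-style sum keeps the token's own content next to its visual context.
        fused.append([t + a for t, a in zip(tok, attended)])
    return fused

def modality_level_fusion(text_tokens, image_regions):
    """Assumed coarse-grained step: pool each modality, then concatenate
    the two pooled vectors and their elementwise product."""
    t = mean_pool(text_tokens)
    v = mean_pool(image_regions)
    return t + v + [a * b for a, b in zip(t, v)]

def joint_representation(text_tokens, image_regions):
    """Concatenate the pooled fine-grained features with the coarse-grained features."""
    fine = mean_pool(element_level_fusion(text_tokens, image_regions))
    coarse = modality_level_fusion(text_tokens, image_regions)
    return fine + coarse

# Hypothetical 2-dimensional features: three text tokens, two image regions.
text_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image_regions = [[0.5, 0.5], [1.0, 0.0]]
rep = joint_representation(text_tokens, image_regions)
print(len(rep))  # d fine-grained + 3d coarse-grained dimensions = 8 for d = 2
```

In a real system the hand-written vectors would come from pre-trained text and image encoders, and the joint vector would feed a classifier. The sketch only shows how element-level and modality-level signals can sit side by side in one representation.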

Chinese Library Classification number: TP387

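Liu Yudong, Huang Qianli, Wang Heng, Fan Jie. Research on a Data-Enhanced Multimodal False Information Detection Framework[J]. Journal of Information Security Research, 2025, 11(4): 377-.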

