A Fake Face Detection Method Based on ConvNeXt

Abstract

Abstract: Fake images generated by deep generative models are becoming increasingly realistic and now exceed the human eye's ability to tell them apart from real ones. Such models have become a new tool for illegal activities such as fabricating lies and manipulating public opinion. Although researchers have proposed many methods for detecting forged images, their generalization ability is generally limited. To address this, we propose a fake face detection method based on ConvNeXt. First, a polarized self-attention (PSA) module is added after the second and third downsampling modules of ConvNeXt, giving the network both spatial and channel attention. Second, a rich information block (RIB) is designed at the tail of ConvNeXt to enrich the information learned by the network; features are processed by this block before the final classification. In addition, the loss function used for training combines cross-entropy loss with Kullback-Leibler (KL) divergence. Extensive experiments on current mainstream fake face datasets show that the method surpasses all compared methods in both accuracy and generalization on the FF++ high-quality (C23) dataset.

Key words: neural network, deep learning, fake face, feature extraction, fake image detection
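
The abstract states only that training combines cross-entropy loss with KL divergence; it does not say which two distributions the KL term compares or how the terms are weighted. The sketch below is therefore a minimal illustration under stated assumptions, not the authors' formulation: `reference_probs` and `kl_weight` are hypothetical names standing in for the unspecified second distribution and weighting coefficient.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def combined_loss(logits, label, reference_probs, kl_weight=1.0):
    """Cross-entropy plus a weighted KL term.

    ASSUMPTION: `reference_probs` is whatever distribution the KL term is
    measured against, and `kl_weight` balances the two terms. Neither is
    specified in the abstract.
    """
    probs = softmax(logits)
    ce = cross_entropy(probs, label)
    kl = kl_divergence(reference_probs, probs)
    return ce + kl_weight * kl


if __name__ == "__main__":
    # Two classes (real = 0, fake = 1); the values are made up for illustration.
    print(combined_loss([2.0, -1.0], label=0, reference_probs=[0.9, 0.1]))
```

When the prediction already matches `reference_probs`, the KL term vanishes and the function reduces to plain cross-entropy, which makes the effect of the extra term easy to isolate.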

CLC number: TP391

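He Defen, Jiang Qian, Jin Xin, Feng Ming, Miao Shengfa, Yi Huasong. A fake face detection method based on ConvNeXt[J]. Journal of Information Security Research, 2025, 11(3): 231-.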

