Face Spoofing Detection Model with Fusion of Convolutional Neural Network and Transformer

Journal of Information Security Research ›› 2024, Vol. 10 ›› Issue (1): 25-.

Huang Ling1, He Xiping1,2, He Dan1, Yang Chutian1, and Kuang Qixian1

1(School of Artificial Intelligence, Chongqing Technology and Business University, Chongqing 400067)
2(Chongqing Engineering Laboratory for Detection, Control and Integrated System (Chongqing Technology and Business University), Chongqing 400067)

Online: 2024-01-10 Published: 2024-01-21
Corresponding author: Huang Ling, master's student. Main research interests: computer vision, deep learning, and liveness detection. 2207346808@qq.com
About the authors:
Huang Ling, master's student. Main research interests: computer vision, deep learning, and liveness detection. 2207346808@qq.com
He Xiping, PhD, professor. Main research interests: computer vision, machine learning, big data, and pattern recognition. jsjhxp@ctbu.edu.cn
He Dan, master's student. Main research interests: computer vision, deep learning, and liveness detection. 2020613001@email.ctbu.edu.cn
Yang Chutian, master's student. Main research interests: computer vision, deep learning, and face attribute editing. chutian_yang@163.com
Kuang Qixian, master's student. Main research interests: computer vision, deep learning, and face attribute editing. 1363164655@qq.com

Abstract

In the field of face anti-spoofing, most existing detection models are based on convolutional neural networks (CNNs). These methods can learn feature representations with relatively few parameters, but their receptive fields are local. Transformer-based methods, in contrast, offer global perception but require so many parameters and computations that they cannot be widely deployed on mobile or edge devices. To address these problems, this paper proposes a face spoofing detection model that integrates CNN and Transformer, aiming to balance parameter count and accuracy while retaining the ability to extract both global and local facial features. First, local face images are cropped and selected as input to effectively avoid overfitting. Second, a feature extraction module based on coordinate attention is designed. Finally, a module that fuses CNN and Transformer is designed to extract local and global image features through local-global-local information exchange. Experimental results show that the model achieves an accuracy of 99.31% and an average error rate of 0.54% on the CASIA-SURF (Depth modality) dataset. It also achieves a zero error rate on the CASIA-FASD and Replay-Attack datasets, and its parameters occupy only 0.59 MB, far less than Transformer-series models.

Key words: face spoofing detection, CNN, Transformer, model fusion, attention mechanism
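The abstract reports an accuracy and an "average error rate" but does not define them. As a rough illustration only, the sketch below assumes the average error rate is the ACER commonly reported on CASIA-SURF, that is, the mean of APCER (attacks accepted as live) and BPCER (live faces rejected as attacks). The function name `spoof_metrics` and the label convention (1 = live, 0 = attack) are hypothetical choices for this example, not taken from the paper.

```python
from typing import Dict, Sequence


def spoof_metrics(labels: Sequence[int], preds: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, APCER, BPCER and ACER for binary liveness decisions.

    Assumed convention (not from the paper): 1 = bona fide (live), 0 = attack.
    """
    if len(labels) != len(preds):
        raise ValueError("labels and preds must have the same length")

    # Split predictions by ground-truth class.
    attack_preds = [p for y, p in zip(labels, preds) if y == 0]
    live_preds = [p for y, p in zip(labels, preds) if y == 1]
    if not attack_preds or not live_preds:
        raise ValueError("need at least one attack and one bona fide sample")

    # APCER: fraction of attack samples wrongly accepted as live.
    apcer = sum(1 for p in attack_preds if p == 1) / len(attack_preds)
    # BPCER: fraction of live samples wrongly rejected as attacks.
    bpcer = sum(1 for p in live_preds if p == 0) / len(live_preds)
    # Accuracy: fraction of all samples classified correctly.
    accuracy = sum(1 for y, p in zip(labels, preds) if y == p) / len(labels)

    return {
        "accuracy": accuracy,
        "apcer": apcer,
        "bpcer": bpcer,
        "acer": (apcer + bpcer) / 2,  # assumed meaning of "average error rate"
    }


if __name__ == "__main__":
    # Toy data: 4 live and 4 attack samples, one error in each class.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
    print(spoof_metrics(y_true, y_pred))
```

Averaging the two class-wise error rates keeps the score from being dominated by whichever class has more samples, which is one reason a paper might report it alongside plain accuracy.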

CLC number: TP183

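Huang Ling, He Xiping, He Dan, Yang Chutian, Kuang Qixian. Face Spoofing Detection Model with Fusion of Convolutional Neural Network and Transformer[J]. Journal of Information Security Research, 2024, 10(1): 25-.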

