Neural Network Backdoor Detection Method Based on Multilevel Measurement Difference

Journal of Information Security Research, 2023, Vol. 9, Issue (6): 587-.

Special Topic: Security Risks and Privacy Protection of Artificial Intelligence

Liu Yichun (刘亦纯), Zhang Guanghua (张光华), Su Jingfang (宿景芳)

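(School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang 050018)
(Hebei Intelligent Internet of Things Technology Innovation Center (Hebei University of Science and Technology), Shijiazhuang 050018)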

Publication date: 2023-06-04  Release date: 2023-06-03

Neural Network Backdoor Detection Method Based on Multilevel Measurement Difference

Online:2023-06-04 Published:2023-06-03

Abstract

Deep neural networks achieve strong performance across a wide range of tasks. However, because deep learning models lack transparency and interpretability, a backdoor planted by a malicious attacker can be triggered at inference time, causing the model to behave abnormally and its performance to degrade. To address this problem, this paper proposes a backdoor detection scheme based on multilevel measurement difference (MultMeasure). First, test cases are generated by adversarial attacks against the source model and the authorized model into which a backdoor has been maliciously injected. Next, two kinds of metrics, white-box and black-box, are computed on these test cases. Finally, the difference between the metrics is computed and judged against a statistical threshold to determine whether the model has been injected with a backdoor. Experiments show that MultMeasure performs well in backdoor attack scenarios involving Trojan-implanted models, including under multiple triggers and invisible triggers. Compared with existing detection schemes from recent years, MultMeasure is more effective and more stable.

Key words: neural network, deep learning, multilevel measurement, Trojan horse, backdoor attack
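
The abstract names three steps (generate test cases, compute white-box and black-box metrics, compare a difference against a statistical threshold) but this page does not give the formulas. The following is only a minimal sketch of that general shape, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the function names, the choice of an L1 gap on output probabilities for the black-box level, a Euclidean gap on hidden activations for the white-box level, and a mean-plus-k-standard-deviations threshold estimated from gaps measured on known-clean models. The toy numbers are invented.

```python
import math
import statistics


def black_box_gap(source_probs, suspect_probs):
    """Black-box level (assumed metric): mean L1 gap between output probabilities."""
    gaps = [
        sum(abs(p - q) for p, q in zip(src, sus))
        for src, sus in zip(source_probs, suspect_probs)
    ]
    return statistics.mean(gaps)


def white_box_gap(source_acts, suspect_acts):
    """White-box level (assumed metric): mean Euclidean gap between activations."""
    gaps = [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(src, sus)))
        for src, sus in zip(source_acts, suspect_acts)
    ]
    return statistics.mean(gaps)


def threshold(reference_gaps, k=3.0):
    """Assumed rule: mean + k * sample standard deviation of clean-model gaps."""
    return statistics.mean(reference_gaps) + k * statistics.stdev(reference_gaps)


def flag_backdoor(gap, reference_gaps, k=3.0):
    """Flag the suspect model when its gap exceeds the statistical threshold."""
    return gap > threshold(reference_gaps, k)


# Toy data (invented): per-test-case output probabilities of the two models,
# and gaps previously measured between the source model and clean copies.
source_probs = [[0.9, 0.1], [0.2, 0.8]]
suspect_probs = [[0.1, 0.9], [0.2, 0.8]]
clean_gaps = [0.02, 0.03, 0.05, 0.04]

gap = black_box_gap(source_probs, suspect_probs)
print(flag_backdoor(gap, clean_gaps))
```

In this sketch the suspect model disagrees sharply with the source model on one test case, so its black-box gap lands far above the clean-model threshold and the model is flagged. A real detector would take its test cases from an adversarial attack and its activations from actual network layers, both of which are outside what this sketch covers.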

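Cite this article: Liu Yichun, Zhang Guanghua, Su Jingfang. Neural Network Backdoor Detection Method Based on Multilevel Measurement Difference[J]. Journal of Information Security Research, 2023, 9(6): 587-.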

