Journal of Information Security Research ›› 2023, Vol. 9 ›› Issue (6): 587-.

• Special Topic: Security Risks and Privacy Protection of Artificial Intelligence •

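Neural Network Backdoor Detection Method Based on Multilevel Measurement Difference

Liu Yichun (刘亦纯), Zhang Guanghua (张光华), Su Jingfang (宿景芳)

  1. (School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang 050018)
    (Hebei Provincial Technology Innovation Center for Intelligent Internet of Things (Hebei University of Science and Technology), Shijiazhuang 050018)
  • Online: 2023-06-04 Published: 2023-06-03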

Abstract: Deep neural networks perform well across a wide range of tasks. However, because deep learning models lack transparency and interpretability, a model behaves abnormally and its performance degrades when a backdoor planted by a malicious attacker is triggered at inference time. To address this problem, this paper proposes a backdoor detection scheme based on multilevel measurement difference (MultMeasure). First, test cases are generated by adversarial attacks on the source model and on the authorized model that has been maliciously injected with a backdoor. Next, two kinds of metrics, white-box and black-box, are computed on these test cases. Finally, the difference between the metrics is compared against a statistical threshold to decide whether the model has been injected with a backdoor. Experiments show that MultMeasure performs well in backdoor attack scenarios involving Trojaned models, including under multiple triggers and invisible triggers, and that it is more effective and more stable than detection schemes proposed in recent years.

Key words: neural network, deep learning, multilevel measurement, Trojan horse, backdoor attack
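
The final step described in the abstract, comparing metric differences against a statistical threshold, can be illustrated with a minimal sketch. The abstract does not specify the actual white-box and black-box metrics or the threshold rule, so everything below is an assumption made for illustration: the L1 distance used as the per-test-case difference, the "mean plus k standard deviations" threshold, and the names `mean_abs_difference`, `statistical_threshold`, and `is_backdoored` are all hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev
from typing import Sequence

Vector = Sequence[float]


def mean_abs_difference(source: Sequence[Vector], suspect: Sequence[Vector]) -> float:
    """Average per-test-case L1 distance between two models' vectors.

    Assumption: the vectors are output probabilities (black-box view) or
    internal activations (white-box view) recorded on the same test cases.
    """
    if len(source) != len(suspect) or not source:
        raise ValueError("need the same non-zero number of test cases")
    per_case = [
        sum(abs(x - y) for x, y in zip(u, v))
        for u, v in zip(source, suspect)
    ]
    return mean(per_case)


def statistical_threshold(clean_differences: Sequence[float], k: float = 3.0) -> float:
    """Hypothetical rule: mean + k * sample std of differences between clean models."""
    return mean(clean_differences) + k * stdev(clean_differences)


def is_backdoored(
    white_box_diff: float,
    black_box_diff: float,
    white_threshold: float,
    black_threshold: float,
) -> bool:
    """Flag the suspect model if either metric difference exceeds its threshold."""
    return white_box_diff > white_threshold or black_box_diff > black_threshold


# Toy data (invented): model outputs on three test cases.
source_outputs = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]
suspect_outputs = [[0.1, 0.9], [0.2, 0.8], [0.1, 0.9]]

# Toy data (invented): differences previously observed between clean models.
clean_reference_diffs = [0.05, 0.08, 0.06, 0.07]

black_diff = mean_abs_difference(source_outputs, suspect_outputs)
threshold = statistical_threshold(clean_reference_diffs)

# For brevity the same difference and threshold stand in for both views.
print(is_backdoored(black_diff, black_diff, threshold, threshold))
```

In this sketch the threshold is calibrated on differences between models known to be clean, so a suspect model is flagged only when it diverges from the source model by more than that reference variation. A real detector would compute the white-box difference from internal activations rather than reuse the black-box value.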