Journal of Information Security Research ›› 2026, Vol. 12 ›› Issue (4): 359-.


Research on Adaptive Hierarchical Neural Network Backdoor Defense Method

Xu Yuanping, Ma Weifeng, and Zhang Yulai   

  1. (School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023)
  • Online: 2026-04-07  Published: 2026-04-07

  • Corresponding author: Zhang Yulai, PhD, associate professor, master's supervisor. Main research interest: artificial intelligence. zhangyulai@zust.edu.cn
  • About the authors: Xu Yuanping, master's student, main research interest: artificial intelligence security, xyp_xxx_00@163.com; Ma Weifeng, PhD, associate professor, master's supervisor, main research interest: computer applications, mawf@zust.edu.cn; Zhang Yulai, PhD, associate professor, master's supervisor, main research interest: artificial intelligence, zhangyulai@zust.edu.cn
  • Funding: Youth Science Fund of the National Natural Science Foundation of China (61803337)

Abstract: Backdoor attacks implant covert trigger patterns into the training data to force a deep learning model to output a preset result on specific inputs, seriously threatening model security. Traditional defense methods, such as pruning and fine-tuning, struggle to balance defense effectiveness against model performance because backdoor neurons partially overlap with normal neurons. To address this challenge, an adaptive hierarchical neural network backdoor defense (AHBD) method is proposed: it locates the backdoor through gradient direction consistency analysis and designs adaptive defense strategies based on the functional characteristics of different layers of the neural network, while adversarial training is further introduced to disrupt backdoor activation paths and improve model generalization. Experiments show that AHBD significantly reduces the attack success rate on the CIFAR10 and GTSRB datasets (average ASR drops to 2.63% and 1.71%, respectively) while maintaining the model's original classification accuracy (average ACC drops by less than 1%), outperforming existing mainstream defense methods.
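The abstract describes locating backdoor neurons via gradient direction consistency analysis. The paper's actual algorithm is not given here, but a minimal NumPy sketch of one plausible interpretation follows: for each neuron, measure how consistently the sign (direction) of its per-sample loss gradient agrees across inputs, then flag neurons whose gradients are highly aligned on trigger-stamped inputs but not on clean ones. The function names, the sign-agreement score, and the threshold are all hypothetical illustrations, not the authors' method.

```python
import numpy as np

def direction_consistency(grads: np.ndarray) -> np.ndarray:
    """Per-neuron gradient direction consistency.

    grads: (n_samples, n_neurons) array of per-sample loss gradients
    w.r.t. each neuron's activation. Returns one score per neuron in
    [0, 1]: 1.0 means every sample's gradient points the same way,
    values near 0 mean the directions are effectively random.
    """
    signs = np.sign(grads)            # direction of each sample's gradient
    return np.abs(signs.mean(axis=0)) # fraction of sign agreement

def flag_backdoor_neurons(clean_grads: np.ndarray,
                          trigger_grads: np.ndarray,
                          threshold: float = 0.9) -> np.ndarray:
    """Hypothetical proxy for backdoor-dedicated units: neurons whose
    gradient direction is highly consistent on triggered inputs but
    inconsistent on clean inputs. Returns their column indices."""
    clean_c = direction_consistency(clean_grads)
    trig_c = direction_consistency(trigger_grads)
    return np.where((trig_c > threshold) & (clean_c < threshold))[0]
```

In a defense pipeline, the flagged indices would drive a layer-aware response (e.g. pruning or targeted fine-tuning of those units), with the threshold adapted per layer as the abstract's "adaptive hierarchical" framing suggests.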

Key words: deep learning, deep neural network, backdoor attack, backdoor defense, artificial intelligence security

