Traffic Anomaly Detection Method Fusing Secondary Feature Extraction and Self-Distillation

Journal of Information Security Research, 2024, Vol. 10, Issue (12): 1082-.

• Special Topic: Comprehensive Security Defense Systems •

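Corresponding author: Chen Wanzhi (chenwanzhi@lntu.edu.cn)
About the authors: Chen Wanzhi, PhD, associate professor. Main research interests: artificial intelligence and intelligent information processing, network and information security, and industrial control software and data analysis. chenwanzhi@lntu.edu.cn. Zhao Lin, master's student. Main research interest: network security. 18242194878@163.com. Wang Tianyuan, engineer. Main research interests: electric power security and auditing. 654771112@qq.com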

Traffic Anomaly Detection Method by Second-order Feature

Chen Wanzhi1, Zhao Lin1, and Wang Tianyuan2

1(College of Software, Liaoning Technical University, Huludao, Liaoning 125105)
2(State Grid Liaoning Electric Power Supply Co., Ltd., Yingkou, Liaoning 115002)

Online: 2024-12-25    Published: 2024-12-25

Abstract


Abstract: To address the low detection rate of deep learning models for minority-class attack traffic in imbalanced, massive, high-dimensional network traffic data, a traffic anomaly detection method that fuses two rounds of feature extraction with self-distillation is proposed. First, isolation forest (iForest) removes outliers from the normal-class samples, and the cleaned samples are used to train an improved convolutional denoising autoencoder (CDAE); this reduces the influence of noise and outliers on training and yields a low-dimensional enhanced representation of the original features. Second, ADASYN synthesizes minority-class attack samples on the outlier-free dataset to resolve the class imbalance. Then iForest is applied again to remove outliers among the newly generated samples, producing a new dataset; the trained CDAE performs the first round of feature extraction on this dataset, and the extracted features are fed to a self-distillation-based ResNet model, which performs the second round of feature extraction. Finally, the trained CDAE and ResNet models are combined to accurately identify anomalous traffic. On the NSL-KDD dataset, the method reaches a best five-class accuracy of 91.52% and a best F1 score of 92.05%. Experimental results show that, compared with existing methods, it effectively improves the detection rate for minority-class attack traffic.
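The ADASYN step mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `knn_indices` and `adasyn`, the parameters `k`, `beta`, and `seed`, and the choice to interpolate only between minority-class neighbours are assumptions made for illustration. The idea shown is that minority samples surrounded by more samples of other classes receive a larger share of the synthetic samples.

```python
import math
import random


def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i] (Euclidean), excluding i."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def adasyn(X, y, minority, k=3, beta=1.0, seed=0):
    """Return synthetic minority samples (hypothetical helper, ADASYN-style)."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(min_idx)
    n_other = len(y) - n_min
    total_new = int((n_other - n_min) * beta)  # number of samples to synthesize

    # r_i: fraction of the k nearest neighbours that are NOT minority samples.
    ratios = []
    for i in min_idx:
        nbrs = knn_indices(X, i, k)
        ratios.append(sum(y[j] != minority for j in nbrs) / k)
    ratio_sum = sum(ratios)
    if ratio_sum == 0 or total_new <= 0:
        return []  # nothing to balance, or no minority sample near other classes

    min_points = [X[i] for i in min_idx]
    synthetic = []
    for pos, r in enumerate(ratios):
        g = round(r / ratio_sum * total_new)  # samples allotted to this point
        # Assumption: interpolate towards neighbours inside the minority class.
        nbrs = knn_indices(min_points, pos, min(k, n_min - 1))
        if not nbrs:
            continue
        for _ in range(g):
            z = min_points[rng.choice(nbrs)]
            lam = rng.random()  # interpolation weight in [0, 1)
            synthetic.append(
                [a + lam * (b - a) for a, b in zip(min_points[pos], z)]
            )
    return synthetic
```

Because each synthetic point is a convex combination of two minority samples, it always lies on the segment between them; in the paper's pipeline a second iForest pass then filters outliers among these generated samples.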

Key words: traffic anomaly detection, convolutional denoising autoencoder, self-distillation, isolation forest, adaptive synthetic sampling
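The self-distillation keyword can be made concrete with a small sketch of one common loss formulation, in which a shallow classifier head is trained on the hard label and also on the softened output of the deepest head of the same network. The abstract does not state the exact loss used by the paper's ResNet, so the function `self_distillation_loss`, the weight `alpha`, and the temperature `T` are illustrative assumptions; feature-level hint losses and temperature-squared scaling, which some formulations add, are omitted here.

```python
import math


def softmax(logits, T=1.0):
    """Temperature-scaled softmax, shifted by the max logit for stability."""
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def self_distillation_loss(shallow_logits, deep_logits, label, alpha=0.3, T=3.0):
    """Loss for one shallow head (assumed form): hard-label CE plus KL to the deepest head."""
    hard = cross_entropy(softmax(shallow_logits), label)
    soft = kl_divergence(softmax(deep_logits, T), softmax(shallow_logits, T))
    return (1 - alpha) * hard + alpha * soft


# Five logits, matching the five NSL-KDD classes (Normal, DoS, Probe, R2L, U2R).
deep = [4.0, 1.0, 0.5, 0.2, 0.1]
agreeing = list(deep)
disagreeing = [0.1, 0.2, 0.5, 1.0, 4.0]
print(self_distillation_loss(agreeing, deep, label=0))
print(self_distillation_loss(disagreeing, deep, label=0))
```

When the shallow head already matches the deepest head, the KL term vanishes and only the weighted cross-entropy remains; a head that disagrees is penalized by both terms.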

CLC number: TP393

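Chen Wanzhi, Zhao Lin, Wang Tianyuan. Traffic Anomaly Detection Method Fusing Secondary Feature Extraction and Self-Distillation [J]. Journal of Information Security Research, 2024, 10(12): 1082-.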

