Intrusion Detection Model Incorporating Contrastive Learning and Feature Selection

Journal of Information Security Research ›› 2024, Vol. 10 ›› Issue (5): 453-.

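Chen Hong1, Cheng Mingjia1, Jin Haibo1, Wu Cong2, Jiang Chaoyi1

1(College of Software, Liaoning Technical University, Huludao, Liaoning 125105)
2(Institute of Science and Technology, Liaoning Technical University, Fuxin, Liaoning 123099)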

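Online: 2024-05-20 Published: 2024-05-15
Corresponding author: Chen Hong, M.S., associate professor. Main research interests: information security and network security. chh3188@163.com
About the authors:
Chen Hong, M.S., associate professor. Main research interests: information security and network security. chh3188@163.com
Cheng Mingjia, master's student. Main research interest: network security. chengmingjia1999@163.com
Jin Haibo, Ph.D., associate professor. Main research interests: optimal maintenance of complex systems and system reliability. jinhaibo@lntu.edu.cn
Wu Cong, Ph.D., lecturer. Main research interests: data analysis and intelligent decision-making. fxwucong@163.com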

Abstract: Intrusion detection systems actively identify malicious traffic and are an important tool for safeguarding network security. To address redundant features in network traffic and the shortcomings of existing intrusion detection algorithms in feature selection, we propose CLFS (contrastive learning and feature selection), an intrusion detection model that combines the two techniques. The model first uses the Pearson correlation coefficient (PCCs) to analyze correlations in the preprocessed network traffic and filter out similar features. An autoencoder (AE) then performs deep feature extraction, with contrastive learning integrated into the extraction stage to reduce inter-class similarity. The newly extracted features are fused with the filtered features to obtain a feature set with stronger representational power. Finally, an improved pigeon-inspired optimizer performs wrapper feature selection, choosing the optimal feature subset according to the performance of a Bayesian classifier in order to raise classification accuracy. Experiments on the NSL-KDD and UNSW-NB15 datasets show that CLFS improves classification accuracy and reduces processing time: binary classification accuracy is 90.45% and 88.52% on the two datasets, respectively, and classification processing time is roughly halved.

Key words: contrastive learning, Pearson correlation coefficient, pigeon-inspired optimizer, feature extraction, feature selection

CLC number: TP393

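Chen Hong, Cheng Mingjia, Jin Haibo, Wu Cong, Jiang Chaoyi. Intrusion Detection Model Incorporating Contrastive Learning and Feature Selection[J]. Journal of Information Security Research, 2024, 10(5): 453-.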
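
The abstract's first stage, filtering out similar features by Pearson correlation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `pearson` and `filter_correlated`, the greedy keep-first strategy, and the 0.9 threshold are all assumptions chosen for illustration, since the abstract does not state how the filter is applied or which cutoff is used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation; treat it as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def filter_correlated(columns, threshold=0.9):
    """Greedily keep columns whose |r| with every kept column is <= threshold.

    columns: list of feature columns, each a list of numbers.
    threshold: assumed cutoff (hypothetical value, not taken from the paper).
    Returns the indices of the columns that survive the filter.
    """
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) <= threshold for k in kept):
            kept.append(j)
    return kept


# Toy data: feature 1 is exactly 2 * feature 0; feature 2 is only weakly related.
features = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 1.0, 4.0, 2.0, 3.0],
]
kept = filter_correlated(features)
print(kept)
```

On this toy input the second column is dropped because it is perfectly correlated with the first, while the third column is retained. In the pipeline the abstract describes, the surviving columns would then be fused with the autoencoder's extracted features before wrapper feature selection.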

