Research on Intrusion Detection Model Based on Multiple Feature Selection Strategies

Abstract

Intrusion detection is an effective way to protect hosts and networks from attack. Intrusion detection systems make up for the security shortcomings of traditional firewall, signature authentication, and access control technologies. However, redundancy among the features of intrusion detection data samples reduces both the accuracy and the efficiency of attack detection. Feature selection can effectively reduce the dimensionality of the data, eliminate redundant features, select the optimal feature subset, and improve the accuracy of network traffic anomaly detection. This article therefore first uses the K-means clustering algorithm to extract typical data from the real traffic data set UNSW-NB15, producing a data set with typical data characteristics for use in feature extraction. It then runs network intrusion detection experiments on this data set with intrusion detection models built on 9 different strategies. The experimental results show that the method detects and classifies traffic effectively: the binary classification accuracy for normal versus malicious traffic is 88.27%, which is higher than that of other machine learning algorithms. In addition, in the multi-class classification study, the detection rate improves for attack types with few samples. The results verify that the method is effective and easy to use.

Keywords: intrusion detection, feature selection, UNSW-NB15, recursive feature elimination (RFE), logistic regression (LR)
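
The abstract names the building blocks (K-means to pick typical records, then recursive feature elimination with logistic regression) but not their configuration. The following is only a minimal sketch in plain Python under stated assumptions: "typical" is taken to mean the records closest to each K-means centroid, and RFE is taken to drop the feature with the smallest absolute LR weight in each round. The function names (`kmeans`, `typical_indices`, `fit_logistic`, `rfe`) and every parameter value are hypothetical and are not taken from the paper.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def typical_indices(points, k, per_cluster, seed=0):
    """Assumed notion of 'typical data': the per_cluster points nearest
    to each centroid. Returns the sorted indices of the kept points."""
    centroids, labels = kmeans(points, k, seed=seed)
    keep = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        members.sort(key=lambda i: math.dist(points[i], centroids[c]))
        keep.extend(members[:per_cluster])
    return sorted(keep)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Logistic regression by batch gradient descent. Returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def rfe(X, y, n_keep):
    """Recursive feature elimination: refit LR and drop the feature with the
    smallest |weight| until n_keep features remain. Assumes the features are
    on comparable scales. Returns the surviving column indices."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        Xa = [[row[j] for j in active] for row in X]
        w, _ = fit_logistic(Xa, y)
        weakest = min(range(len(active)), key=lambda j: abs(w[j]))
        del active[weakest]
    return active
```

In practice one would first call `typical_indices` on the (standardized) traffic records to build the reduced data set, then call `rfe` on that subset. A library implementation of K-means and RFE would be the usual choice for data of UNSW-NB15's size; the pure-Python version here only makes the steps explicit.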

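He Hongyan, Huang Guoyan, Zhang Bing, Chen Yu. Research on Intrusion Detection Model Based on Multiple Feature Selection Strategies[J]. Journal of Information Security Research, 2021, 7(3): 225-232.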
