Online Detection Method of Malicious Login Behavior Based on Hidden Variable Model

Journal of Information Security Research, 2023, Vol. 9, Issue (1): 22-.

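Chen Xue 1, Peng Yanbing 2, Chen Qian 2, Liu Zezheng 3

1 (Wuhan Research Institute of Posts and Telecommunications, Wuhan 430074)
2 (Nanjing FiberHome Tiandi Communication Technology Co., Ltd., Nanjing 210019)
3 (Nanjing University of Information Science and Technology, Nanjing 210044)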

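Publication date: 2023-01-01. Release date: 2022-12-30.
Corresponding author: Chen Xue, master's student. Main research interest: network security. chenxue16876@163.com
About the authors:
Chen Xue, master's student. Main research interest: network security. chenxue16876@163.com
Peng Yanbing, PhD, professor-level senior engineer, master's supervisor. Main research interests: network behavior analysis and massive data mining. 12012016@qq.com
Chen Qian, master's degree, engineer. Main research interests: data mining and machine learning. qianchen929@163.com
Liu Zezheng. Main research interest: information security. 2471206927@qq.com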

Abstract

Classifying and detecting malicious login behavior is important for operators supervising network security, but existing detection techniques commonly suffer from heavy model computation, a lack of real-time performance, and an inability to process high-dimensional data efficiently. This paper therefore proposes an online detection method for malicious login behavior based on hidden variables. By analyzing the principle of brute-force cracking in depth, the method extracts features that closely match the traffic characteristics and constructs feature vectors to achieve feature enhancement. A lightweight expectation maximization (EM) algorithm replaces conventional, more complex machine learning and deep learning algorithms to detect malicious login traffic. On this basis, an EM algorithm built on a hidden-variable mechanism is introduced to strengthen the model's ability to extract key features, thereby improving the detection accuracy for malicious login traffic. Experimental results on the public dataset CICIDS2017 show that the method reaches a precision of 98.7% with a false alarm rate as low as 2.38%. Compared with a multilayer perceptron, precision improves by 23.7%; compared with a CDF-based threshold segmentation algorithm, recall increases by 12.8% and the false alarm rate decreases by 4.19%.

Key words: malicious login, brute-force cracking, EM algorithm, hidden variables, Gaussian distribution
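
The abstract describes the detector only at a high level, so the following is a minimal sketch of the underlying idea rather than the authors' implementation. It fits a two-component Gaussian mixture with EM, where the hidden variable is the unobserved class (benign or malicious) of each sample. The single feature (failed login attempts per time window), the synthetic data, the initialisation, and all function names are assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def malicious_probability(x, weights, means, variances):
    """Posterior probability that x was generated by component 1."""
    p0 = weights[0] * gaussian_pdf(x, means[0], variances[0])
    p1 = weights[1] * gaussian_pdf(x, means[1], variances[1])
    return p1 / max(p0 + p1, 1e-300)  # guard against numerical underflow


def fit_two_gaussians(xs, iterations=50):
    """EM for a two-component 1D Gaussian mixture.

    The hidden variable is the component each sample came from.
    Returns (weights, means, variances), each a list of length 2.
    """
    n = len(xs)
    overall_mean = sum(xs) / n
    overall_var = sum((x - overall_mean) ** 2 for x in xs) / n
    weights = [0.5, 0.5]
    means = [min(xs), max(xs)]  # assumed initialisation: the two extremes
    variances = [overall_var, overall_var]

    for _ in range(iterations):
        # E-step: responsibility of component 1 for every sample.
        resp = [malicious_probability(x, weights, means, variances) for x in xs]
        # M-step: re-estimate parameters from responsibility-weighted samples.
        n1 = sum(resp)
        n0 = n - n1
        means = [
            sum((1.0 - r) * x for r, x in zip(resp, xs)) / n0,
            sum(r * x for r, x in zip(resp, xs)) / n1,
        ]
        variances = [
            max(sum((1.0 - r) * (x - means[0]) ** 2 for r, x in zip(resp, xs)) / n0, 1e-6),
            max(sum(r * (x - means[1]) ** 2 for r, x in zip(resp, xs)) / n1, 1e-6),
        ]
        weights = [n0 / n, n1 / n]
    return weights, means, variances


# Synthetic, unlabeled feature values (hypothetical): failed logins per window.
random.seed(0)
samples = [random.gauss(2.0, 1.0) for _ in range(300)]    # benign-like traffic
samples += [random.gauss(20.0, 2.0) for _ in range(100)]  # brute-force-like traffic
random.shuffle(samples)

weights, means, variances = fit_two_gaussians(samples)
```

Once the mixture is fitted, scoring a new observation needs only two density evaluations, for example `malicious_probability(18.0, weights, means, variances)`, which is the kind of low per-sample cost that makes such a model suitable for online use. A real detector would work on multi-dimensional feature vectors extracted from traffic and would need a decision threshold tuned on labeled data.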

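Chen Xue, Peng Yanbing, Chen Qian, Liu Zezheng. Online Detection Method of Malicious Login Behavior Based on Hidden Variable Model[J]. Journal of Information Security Research, 2023, 9(1): 22-.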
