Journal of Information Security Research ›› 2023, Vol. 9 ›› Issue (1): 22-.

• Academic Papers

Online Detection Method of Malicious Login Behavior Based on Hidden Variable Model

Chen Xue 1, Peng Yanbing 2, Chen Qian 2, Liu Zezheng 3
  

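  1 (Wuhan Research Institute of Posts and Telecommunications, Wuhan 430074)
  2 (Nanjing FiberHome Tiandi Communication Technology Co., Ltd., Nanjing 210019)
  3 (Nanjing University of Information Science and Technology, Nanjing 210044)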
  • Online: 2023-01-01 Published: 2022-12-30
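  • Corresponding author: Chen Xue, master's student. Main research interest: network security. chenxue16876@163.com
  • About the authors: Chen Xue, master's student. Main research interest: network security. chenxue16876@163.com. Peng Yanbing, Ph.D., professor-level senior engineer, master's supervisor. Main research interests: network behavior analysis and massive data mining. 12012016@qq.com. Chen Qian, master's degree, engineer. Main research interests: data mining and machine learning. qianchen929@163.com. Liu Zezheng. Main research interest: information security. 2471206927@qq.com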

Online Detection Method of Malicious Login Behavior Based on Hidden Variable Model

  • Online:2023-01-01 Published:2022-12-30

Abstract: Classifying and detecting malicious login behavior is important for operators supervising network security. However, existing detection techniques commonly suffer from heavy model computation, a lack of real-time performance, and an inability to process high-dimensional data efficiently. To address these problems, this paper proposes an online detection method for malicious login behavior based on hidden variables. By analyzing the principle of brute-force cracking, the method extracts features that closely match the traffic characteristics and constructs feature vectors to achieve feature enhancement. A lightweight expectation maximization (EM) algorithm replaces conventional, more complex machine learning and deep learning algorithms for detecting malicious login traffic. On this basis, an EM algorithm based on a hidden variable mechanism is introduced to strengthen the model's ability to extract key features, thereby improving the detection accuracy for malicious login traffic. Experimental results on the public dataset CICIDS2017 show that the method achieves a precision of 98.7% and a false alarm rate as low as 2.38%. Compared with the multilayer perceptron algorithm, precision improves by 23.7%; compared with the CDF-based threshold segmentation algorithm, recall improves by 12.8% and the false alarm rate falls by 4.19%.

Key words: malicious login, brute-force cracking, EM algorithm, hidden variables, Gaussian distribution
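
The abstract names the EM algorithm, hidden variables, and Gaussian distributions but does not spell out the model. As a rough illustration only, the sketch below fits a generic two-component one-dimensional Gaussian mixture with EM, where the hidden variable is the unobserved class (benign or attack) of each sample. The feature (login attempts per minute), the synthetic data, and the function names `fit_two_gaussians` and `responsibilities` are assumptions made for this example; this is not the authors' implementation or feature set.

```python
import numpy as np


def responsibilities(x, weights, means, variances):
    """E-step: posterior probability of each hidden component for each sample."""
    x = np.asarray(x, dtype=float)
    dens = (weights * np.exp(-0.5 * (x[:, None] - means) ** 2 / variances)
            / np.sqrt(2.0 * np.pi * variances))
    return dens / dens.sum(axis=1, keepdims=True)


def fit_two_gaussians(x, n_iter=100, tol=1e-6):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    x = np.asarray(x, dtype=float)
    # Assumed initialisation: one component at each extreme of the data.
    means = np.array([x.min(), x.max()])
    variances = np.full(2, x.var() + 1e-6)
    weights = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: weighted component densities and responsibilities.
        dens = (weights * np.exp(-0.5 * (x[:, None] - means) ** 2 / variances)
                / np.sqrt(2.0 * np.pi * variances))
        total = dens.sum(axis=1, keepdims=True) + 1e-300
        resp = dens / total
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
        # Stop when the log-likelihood no longer changes noticeably.
        ll = float(np.log(total).sum())
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, means, variances


# Synthetic, hypothetical feature: login attempts per minute per source.
rng = np.random.default_rng(0)
rates = np.concatenate([rng.normal(2.0, 1.0, 500),    # benign-like traffic
                        rng.normal(20.0, 3.0, 100)])  # brute-force-like traffic
weights, means, variances = fit_two_gaussians(rates)
attack = int(np.argmax(means))  # treat the higher-rate component as "attack"
p_attack = responsibilities(np.array([2.0, 25.0]), weights, means, variances)[:, attack]
print(p_attack)
```

Because scoring a new sample needs only the fitted weights, means, and variances, a model of this kind is cheap to evaluate on streaming traffic, which is in line with the lightweight, online setting the abstract describes.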