Journal of Information Security Research (信息安全研究) ›› 2023, Vol. 9 ›› Issue (8): 739-.

• Academic Papers •

Comparison Research on Intrusion Detection Model Based on Machine Learning

Zhang Pengfei (张鹏飞)

  1. School of Cyberspace Security, Shanghai Jiao Tong University
  • Online: 2023-08-01 Published: 2023-09-04
  • Corresponding author: Zhang Pengfei, master's student. Main research interests: information security and machine learning. 837692980@163.com

Abstract: Network threats are constantly evolving and becoming increasingly stealthy. Studying the intrusion detection performance and characteristics of multiple machine learning models on modern traffic data is therefore of considerable significance for improving the timeliness of intrusion detection systems. This paper explores the use of efficient machine learning models from recent years, including ensemble learning models (Random Forest, LightGBM, XGBoost) and deep learning models (CNN, GRU, LSTM, etc.), for intrusion detection on the public dataset UNSW-NB15. We describe the task flow and experimental configuration in detail, compare the evaluation metrics of the different models, and summarize the characteristics of each model in the intrusion detection task. The experimental results show that, on a 10% sample of the dataset, the best model among those tested for binary classification in terms of performance and efficiency is LightGBM, with an F1 score of 0.897, an accuracy of 89.86%, a training time of 1.98 s, and a prediction time of 0.11 s. For multiclass classification, the most comprehensive detection model among those tested is XGBoost, with an overall F1 score of 0.7907, an accuracy of 75.96%, a training time of 144.79 s, and a prediction time of 0.21 s.

Key words: intrusion detection, machine learning, ensemble learning, deep learning, binary classification, multiclass classification, UNSW-NB15
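
The abstract reports four quantities per model: F1 score, accuracy, training time, and prediction time, all measured on a 10% sample of the dataset. The sketch below illustrates one way such a measurement could be set up. It is not the authors' code. The function names, the per-class (stratified) sampling, and the choice of label 1 as the attack class are assumptions made for illustration, and `MajorityClassifier` is a hypothetical stand-in for a real model such as LightGBM or XGBoost.

```python
import random
import time
from collections import Counter, defaultdict


def stratified_sample(rows, labels, fraction, seed=0):
    """Keep roughly `fraction` of each class (at least one row per class).

    Assumption: the source does not say how its 10% sample was drawn.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    keep = []
    for indices in by_class.values():
        k = max(1, round(len(indices) * fraction))
        keep.extend(rng.sample(indices, k))
    keep.sort()  # preserve the original row order
    return [rows[i] for i in keep], [labels[i] for i in keep]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


class MajorityClassifier:
    """Hypothetical stand-in: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def evaluate(model, X_train, y_train, X_test, y_test):
    """Time fit and predict separately, then score the predictions."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    train_s = time.perf_counter() - start

    start = time.perf_counter()
    y_pred = model.predict(X_test)
    predict_s = time.perf_counter() - start

    return {
        "f1": f1_binary(y_test, y_pred),
        "accuracy": accuracy(y_test, y_pred),
        "train_s": train_s,
        "predict_s": predict_s,
    }
```

Any object exposing `fit` and `predict` can be passed to `evaluate`, so a gradient-boosting or neural model could be swapped in for the stand-in without changing the measurement code. For the multiclass task, the binary F1 would need to be replaced by an averaged variant; the abstract does not state which averaging was used.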