Journal of Information Security Research ›› 2023, Vol. 9 ›› Issue (10): 1023-.

• Technical Applications •

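A Noisy Label Detection Method for Encrypted Malicious Traffic

Tong Jiacheng, Chen Wei, Ni Jiayi, and Li Pin

  1. (School of Computer Science, School of Software, and School of Cyberspace Security, Nanjing University of Posts and Telecommunications, Nanjing 210023)
  • Online: 2023-10-17 Published: 2023-10-28
  • Corresponding author: Chen Wei, PhD, professor. Main research interests include wireless network security and mobile Internet security. chenwei@njupt.edu.cn
  • About the authors: Tong Jiacheng, master's student. Main research interests: network security and encrypted malicious traffic detection. Oc34nus@outlook.com. Chen Wei, PhD, professor. Main research interests include wireless network security and mobile Internet security. chenwei@njupt.edu.cn. Ni Jiayi, master's student. Main research interests: network security and network intrusion detection. njiay@outlook.com. Li Pin, associate professor. Main research interest: network and information security. lipin7421@163.com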

Abstract: Handling noisy datasets remains a challenge when training and evaluating data-driven models for encrypted malicious traffic detection. This paper proposes a noisy label detection method based on KRPDDT. Following the idea of differential training, it trains two identical models simultaneously, extracts each sample's training loss in both models, and detects noisy samples from the difference in training behavior between clean samples and noisy samples. To amplify the loss differences between samples, it also proposes a relative noise weight estimation method based on KLIEPRPD, which estimates the relative probability density of each sample and uses it as the weight of that sample's loss behavior. After the CICDoHBrw2020 dataset was cleaned with this method, the performance of the malicious DoH traffic detection model was effectively restored. Experiments show that the method has good stability and outperforms several other noise detection methods.

Key words: noisy label detection, noise weight, encrypted malicious traffic, DoH traffic, differential training

CLC number:
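
The loss-based detection idea summarized in the abstract can be illustrated with a toy sketch. This is not the KRPDDT algorithm itself, whose details are not given on this page. It assumes two identically configured logistic-regression models that differ only in their random seed, a synthetic two-class dataset with 10% flipped labels, and a simple rule that flags the samples with the highest mean loss. The function names and the optional `weights` argument (a stand-in for a per-sample relative noise weight, which is not estimated here) are hypothetical.

```python
import numpy as np

def make_toy_data(n=400, noise_rate=0.1, seed=0):
    """Synthetic two-class data with a known set of flipped labels (assumption: not real traffic)."""
    rng = np.random.default_rng(seed)
    y_true = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 2)) + np.where(y_true[:, None] == 1, 2.0, -2.0)
    flipped = np.zeros(n, dtype=bool)
    flipped[rng.choice(n, size=int(round(noise_rate * n)), replace=False)] = True
    y_noisy = np.where(flipped, 1 - y_true, y_true).astype(float)
    return X, y_noisy, flipped

def per_sample_loss(w, b, X, y):
    """Binary cross-entropy of a logistic model, one value per sample."""
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def train_and_track(X, y, seed, epochs=30, lr=0.1, batch=32):
    """Train one model with mini-batch SGD and return its (epochs, n) loss history."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(scale=0.01, size=d)
    b = 0.0
    history = np.empty((epochs, n))
    for e in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch):
            idx = order[start:start + batch]
            p = 1.0 / (1.0 + np.exp(-(X[idx] @ w + b)))
            grad = p - y[idx]  # gradient of the loss with respect to the logit
            w -= lr * (X[idx].T @ grad) / len(idx)
            b -= lr * grad.mean()
        history[e] = per_sample_loss(w, b, X, y)
    return history

def noise_scores(X, y, seeds=(0, 1), weights=None, **train_kwargs):
    """Mean loss of each sample over all epochs and both models; higher means more suspicious."""
    hist = np.stack([train_and_track(X, y, s, **train_kwargs) for s in seeds])
    scores = hist.mean(axis=(0, 1))
    if weights is not None:  # hypothetical per-sample weight, not estimated in this sketch
        scores = scores * weights
    return scores

def flag_noisy(scores, noise_rate):
    """Indices of the highest-scoring samples, assuming the noise rate is roughly known."""
    k = int(round(noise_rate * len(scores)))
    return np.argsort(scores)[::-1][:k]

X, y_noisy, flipped = make_toy_data()
flagged = flag_noisy(noise_scores(X, y_noisy), noise_rate=0.1)
print("fraction of flagged samples that were truly flipped:", flipped[flagged].mean())
```

On data this well separated, mislabeled samples keep a high loss throughout training while clean samples are fitted quickly, which is the behavioral gap the abstract refers to. Real encrypted-traffic features are unlikely to separate this cleanly, which is presumably why the paper adds a weighting step to amplify the loss difference.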