Journal of Information Security Research ›› 2025, Vol. 11 ›› Issue (10): 924-.


Robust Malicious Encrypted Traffic Detection Method Based on Dual Confidence Sample Selection

Wang Yitong, Wu Lifa, and Zhang Bolei
  

  1. (School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023)
  • Online: 2025-10-15 Published: 2025-10-17

基于双置信度样本选择的鲁棒恶意加密流量检测方法

王一彤, 吴礼发, 张伯雷
  

  1. (南京邮电大学计算机学院、软件学院、网络空间安全学院 南京 210023)
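  • Corresponding author: Wu Lifa, PhD, professor, doctoral supervisor. Main research interests: network security and software security. wulifa@njupt.edu.cn
  • About the authors: Wang Yitong, master's student. Main research interest: network security. 1222046003@njupt.edu.cn. Wu Lifa, PhD, professor, doctoral supervisor. Main research interests: network security and software security. wulifa@njupt.edu.cn. Zhang Bolei, PhD, associate professor. Main research interests: reinforcement learning and data mining. bolei.zhang@njupt.edu.cn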

Abstract: In malicious encrypted traffic detection, noisy labels seriously degrade a model's generalization ability and detection accuracy. To address this problem, a noisy-label learning method based on dual-confidence adaptive sample selection (DCASS) is proposed to achieve robust malicious encrypted traffic detection. First, an autoencoder extracts low-dimensional features of the samples, from which the feature confidence of each sample is constructed. Then, the label confidence of each sample is evaluated according to its performance during classification training. Finally, an adaptive selection threshold is proposed: samples are selected based on the dual confidence of the feature space and the label space, and noisy samples are filtered dynamically to improve model robustness. Experiments on the CIRA-CIC-DoHBrw-2020 dataset show that the proposed method performs well and stably under noisy labels, reaching F1 scores of 86.686%, 86.749%, and 83.199% at noise rates of 20%, 30%, and 40%, respectively. Compared with three existing methods, the proposed method achieves the best performance at every noise rate, with average performance improvements of 18.89%, 37.34%, and 6.32%, respectively.
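
The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' DCASS implementation: the abstract does not say how either confidence or the threshold is computed, so everything below is an assumption made for illustration. Here the feature confidence is taken to be the share of a sample's k nearest neighbours (in an already-computed low-dimensional feature space, e.g. autoencoder outputs) that carry the same label, the label confidence is the classifier's predicted probability for the given label, the two are combined by multiplication, and the "adaptive" threshold is simply the per-class mean of the combined score. The function names `feature_confidence`, `label_confidence`, and `select_clean` are hypothetical.

```python
import numpy as np


def feature_confidence(z, y, k=5):
    """Share of each sample's k nearest neighbours that carry the same label.

    z: (n, d) low-dimensional features; y: (n,) integer labels (possibly noisy).
    Assumed definition, chosen only for illustration.
    """
    dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)            # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    return (y[neighbours] == y[:, None]).mean(axis=1)


def label_confidence(probs, y):
    """Classifier probability assigned to each sample's given label.

    probs: (n, num_classes) predicted class probabilities.
    """
    return probs[np.arange(len(y)), y]


def select_clean(z, y, probs, k=5):
    """Keep samples whose combined confidence reaches their class mean.

    The product of the two confidences and the per-class mean threshold
    are both assumptions, not the rule used in the paper.
    """
    conf = feature_confidence(z, y, k) * label_confidence(probs, y)
    keep = np.zeros(len(y), dtype=bool)
    for c in np.unique(y):
        members = y == c
        keep[members] = conf[members] >= conf[members].mean()
    return keep, conf
```

A sample whose label disagrees both with its neighbours in feature space and with the classifier's prediction receives a low combined score and is filtered out. Note that a mean threshold also discards some clean samples in a class that contains little noise, which is one reason a real method would need a more careful threshold.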

Key words: noise label learning, malicious encrypted traffic detection, sample selection, deep learning, autoencoder

摘要: 在恶意加密流量的检测任务中,噪声标签的存在严重影响了模型的泛化能力和检测精度.针对以上问题,提出一种基于双置信度自适应样本选择(dual-confidence adaptive sample selection, DCASS)的噪声标签学习方法,以实现鲁棒恶意加密流量检测.首先通过自编码器提取样本低维特征,并构建样本的特征置信度;然后根据样本在分类训练中的表现评估样本的标签置信度;最后,提出自适应选择阈值,基于特征空间和标签空间的双置信度进行样本选择,动态过滤噪声样本以提升模型鲁棒性.在CIRA-CIC-DoHBrw-2020数据集上的实验表明,该方法在应对噪声标签时具有良好的性能及稳定性,在噪声率为20%,30%,40%的情况下,该方法的F1分数分别达到86.686%,86.749%,83.199%.与现有的3种方法相比,该方法在不同的噪声率下均表现出最优的性能,平均性能提升分别达到18.89%,37.34%,6.32%.

关键词: 噪声标签学习, 恶意加密流量检测, 样本选择, 深度学习, 自编码器
