Robust Malicious Encrypted Traffic Detection Method Based on Dual Confidence Sample Selection

Journal of Information Security Research, 2025, Vol. 11, Issue (10): 924-.

Wang Yitong, Wu Lifa, and Zhang Bolei

(School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023)

Online: 2025-10-15  Published: 2025-10-17


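Corresponding author: Wu Lifa, wulifa@njupt.edu.cn
About the authors:
Wang Yitong, master's student. Main research interest: network security. 1222046003@njupt.edu.cn
Wu Lifa, Ph.D., professor and doctoral supervisor. Main research interests: network security and software security. wulifa@njupt.edu.cn
Zhang Bolei, Ph.D., associate professor. Main research interests: reinforcement learning and data mining. bolei.zhang@njupt.edu.cn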

Abstract

Noisy labels seriously degrade the generalization ability and detection accuracy of models for malicious encrypted traffic detection. To address this problem, a noise label learning method based on dual-confidence adaptive sample selection (DCASS) is proposed to achieve robust malicious encrypted traffic detection. First, an autoencoder extracts low-dimensional features of the samples, from which the feature confidence of each sample is constructed. Then, the label confidence of each sample is evaluated according to its behavior during classification training. Finally, an adaptive selection threshold is proposed: samples are selected according to their dual confidence in the feature space and the label space, so that noisy samples are filtered out dynamically and the robustness of the model improves. Experiments on the CIRA-CIC-DoHBrw-2020 dataset show that the proposed method performs well and remains stable under label noise. Its F1 scores reach 86.686%, 86.749%, and 83.199% at noise rates of 20%, 30%, and 40%, respectively. Compared with three existing methods, the proposed method achieves the best performance at every noise rate, with average performance improvements of 18.89%, 37.34%, and 6.32%, respectively.

Key words: noise label learning, malicious encrypted traffic detection, sample selection, deep learning, autoencoder
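
The three steps summarized in the abstract (feature confidence, label confidence, adaptive dual-confidence selection) can be illustrated with a minimal sketch. The abstract does not give the DCASS formulas, so every concrete choice below is an assumption made for illustration: the low-dimensional features are taken as already produced by an autoencoder, feature confidence is a softmax over negative distances to class centroids, label confidence is a classifier probability supplied as input, and the adaptive threshold is simply the batch mean. All function names and the toy data are hypothetical.

```python
import math

def class_centroids(features, labels):
    """Mean feature vector for each (possibly noisy) label."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def feature_confidence(features, labels):
    """Assumed rule: softmax over negative centroid distances,
    read out at the sample's given label."""
    cents = class_centroids(features, labels)
    conf = []
    for x, y in zip(features, labels):
        scores = {c: math.exp(-math.dist(x, m)) for c, m in cents.items()}
        conf.append(scores[y] / sum(scores.values()))
    return conf

def adaptive_threshold(conf):
    """Assumed rule: the threshold adapts to the batch by using its mean."""
    return sum(conf) / len(conf)

def select_samples(feat_conf, label_conf):
    """Keep samples whose confidence clears the threshold in both spaces."""
    t_f = adaptive_threshold(feat_conf)
    t_l = adaptive_threshold(label_conf)
    return [i for i, (f, l) in enumerate(zip(feat_conf, label_conf))
            if f >= t_f and l >= t_l]

# Toy low-dimensional features: benign (label 0) near (0, 0),
# malicious (label 1) near (5, 5). The last sample lies in the benign
# cluster but carries a flipped (noisy) malicious label.
features = [(0.0, 0.0), (0.2, 0.1), (0.1, -0.1),
            (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
            (0.1, 0.0)]
labels = [0, 0, 0, 1, 1, 1, 1]
# Hypothetical classifier probabilities for each sample's given label.
label_conf = [0.95, 0.90, 0.92, 0.93, 0.90, 0.94, 0.20]

feat_conf = feature_confidence(features, labels)
selected = select_samples(feat_conf, label_conf)
print(selected)  # the noisy sample (index 6) is filtered out
```

In this toy batch the mislabeled sample scores low in both spaces, so either criterion alone would remove it; requiring both is a conservative reading of "dual confidence" and not necessarily how the paper combines the two scores.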


CLC Number: TP393.08

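Wang Yitong, Wu Lifa, Zhang Bolei. Robust Malicious Encrypted Traffic Detection Method Based on Dual Confidence Sample Selection[J]. Journal of Information Security Research, 2025, 11(10): 924-.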

