Journal of Information Security Research ›› 2025, Vol. 11 ›› Issue (2): 130-.


A Malicious TLS Traffic Detection Method with Multimodal Features

Zeng Qingpeng, He Shuming, and Chai Jiangli   

  1. (School of Mathematics and Computer Sciences, Nanchang University, Nanchang 330031)
  • Online:2025-02-20 Published:2025-02-20

 融合多模态特征的恶意TLS流量检测方法

曾庆鹏, 贺述明, 柴江力

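  • Corresponding author: Zeng Qingpeng (曾庆鹏), master's degree, associate professor and master's supervisor. Main research interests: network and information security, information retrieval, and data mining. zengqingpeng@ncu.edu.cn
  • About the authors: Zeng Qingpeng (曾庆鹏), master's degree, associate professor and master's supervisor. Main research interests: network and information security, information retrieval, and data mining. zengqingpeng@ncu.edu.cn He Shuming (贺述明), master's student. Main research interests: network and information security, deep learning. 905008677@qq.com Chai Jiangli (柴江力), master's student. Main research interests: network and information security, federated learning. 785691921@qq.com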

Abstract: Malicious TLS traffic detection aims to identify network traffic that carries malicious activity over the TLS protocol. Because TLS encrypts its payload, traditional text-based traffic analysis methods have limited effectiveness on encrypted traffic. To address this issue, a malicious TLS traffic detection method called Multimodal Feature Fusion for TLS Traffic Detection (MTBRL) is proposed, which extracts and fuses features from different modalities. First, expert knowledge guides feature engineering to extract key features from encrypted traffic, including protocol versions, cipher suites, and certificate information. These features are processed and converted into two-dimensional image representations, which ResNet then encodes to extract image features. Second, a BERT model pre-trained on encrypted traffic encodes TLS flows to learn their contextual and semantic features. In addition, an LSTM model encodes the packet length distribution sequence of the encrypted traffic to capture temporal characteristics. Finally, feature fusion integrates the features of the different modalities, and the model's weight parameters are learned and optimized automatically through backpropagation to accurately predict malicious TLS traffic. Experimental results show that, on the DataCon2020 dataset, the method achieves accuracy, precision, recall, and F1 score of 94.94%, 94.85%, 94.15%, and 94.45%, respectively, significantly outperforming traditional machine learning and deep learning methods.
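The data flow described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not state the image size, the fusion operator, or the classifier head, so the 3×3 grid, the concatenation-based fusion, and the logistic output layer below are all assumptions, and the function names (`to_image`, `fuse`, `predict_malicious`) are hypothetical. The three encoders (ResNet, BERT, LSTM) are replaced by toy placeholder vectors.

```python
import math
from typing import List, Sequence


def to_image(features: Sequence[float], side: int) -> List[List[float]]:
    """Zero-pad a flat feature vector and reshape it into a side x side grid.

    Assumption: the abstract only says features become 2D images; this
    padding-and-reshape layout is one simple way to do that.
    """
    if len(features) > side * side:
        raise ValueError("feature vector does not fit in the image")
    padded = list(features) + [0.0] * (side * side - len(features))
    return [padded[r * side:(r + 1) * side] for r in range(side)]


def fuse(*embeddings: Sequence[float]) -> List[float]:
    """Fuse modality embeddings by concatenation (an assumed fusion operator)."""
    fused: List[float] = []
    for embedding in embeddings:
        fused.extend(embedding)
    return fused


def predict_malicious(fused: Sequence[float],
                      weights: Sequence[float],
                      bias: float) -> float:
    """Logistic output layer: returns a score in (0, 1) for the malicious class."""
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy handcrafted features (e.g. encoded protocol version, cipher suite, ...).
image = to_image([0.3, 0.7, 1.0, 0.0, 0.5], side=3)

# Placeholders standing in for the three encoders named in the abstract.
image_emb = [sum(row) / len(row) for row in image]  # stand-in for ResNet
flow_emb = [0.2, -0.1]                              # stand-in for BERT
length_emb = [0.4]                                  # stand-in for LSTM

fused = fuse(image_emb, flow_emb, length_emb)
score = predict_malicious(fused, weights=[0.1] * len(fused), bias=0.0)
```

In a trained model the weights of the output layer and of all three encoders would be learned jointly by backpropagation, as the abstract states; here they are fixed constants purely to make the sketch runnable.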

Key words: encrypted traffic, network security, intrusion detection, multimodal, deep learning

