Journal of Information Security Research (信息安全研究), 2025, Vol. 11, Issue (4): 304-.

• Academic Paper •

Deep Learning-based Method for Encrypted Website Fingerprinting

Chi Yaping1,2, Peng Wenlong1, Xu Zihan1, and Chen Ying1

  1 (Department of Network Space Security, Beijing Electronics Science & Technology Institute, Beijing 100070)
  2 (Key Laboratory of Network Assessment Technology (Institute of Information Engineering, Chinese Academy of Sciences), Beijing 100093)
  • Online: 2025-04-30 Published: 2025-04-30
  • Corresponding author: Chi Yaping, M.S., professor. Main research interests: network security protection and cloud computing security. chiyp_besti@163.com
  • About the authors:
    Chi Yaping, M.S., professor. Main research interests: network security protection and cloud computing security. chiyp_besti@163.com
    Peng Wenlong, master's student. Main research interest: website fingerprinting. 1531740985@qq.com
    Xu Zihan, master's student. Main research interest: network threat intelligence. 13767067665@163.com
    Chen Ying, Ph.D., professor. Main research interests: data mining, artificial intelligence, and image processing. ychen@besti.edu.cn

Abstract: Website fingerprinting is an important research direction in network security and privacy protection. Its goal is to identify the websites that users visit in an encrypted network environment by analyzing network traffic characteristics. To address the limited application scenarios, insufficient applicability, and reliance on a single type of feature in current mainstream methods, this paper proposes a deep learning-based method for encrypted website fingerprinting. First, a new preprocessing method for raw packets is designed, which takes directly captured raw packet files and produces a hierarchically structured feature sequence carrying both spatial and temporal features. Then, a hybrid deep learning model combining convolutional neural networks (CNN) and long short-term memory (LSTM) networks is designed to fully learn the spatial and temporal features contained in the data. On this basis, different activation functions, model parameters, and optimization algorithms are explored to improve the model's recognition accuracy and generalization capability. Experimental results show that, in The Onion Router (Tor) anonymous network environment, the method achieves higher website fingerprinting accuracy without relying on Tor data units (cells), and that in virtual private network (VPN) scenarios it also achieves higher accuracy than current mainstream machine learning methods.

Key words: deep learning, encrypted traffic, website fingerprinting, the onion router, virtual private network
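
The abstract does not give the details of the preprocessing step, so the following is only a minimal sketch of one way a captured trace could be turned into a sequence that carries both a spatial feature (signed packet size) and a temporal feature (inter-arrival time), with bursts as a second level of hierarchy. The packet tuple layout, the function names `to_feature_sequence` and `split_bursts`, the `seq_len` and `mtu` parameters, and the burst grouping are all assumptions made for illustration; they are not the authors' method.

```python
from typing import List, Tuple

# Assumed packet layout: (timestamp in seconds, signed length in bytes),
# positive length for outgoing packets, negative length for incoming ones.
Packet = Tuple[float, int]


def to_feature_sequence(packets: List[Packet], seq_len: int = 8,
                        mtu: int = 1500) -> List[List[float]]:
    """Map a trace to a fixed-length list of [spatial, temporal] rows.

    spatial  = signed packet length scaled into [-1, 1]
    temporal = seconds elapsed since the previous packet
    """
    packets = sorted(packets, key=lambda p: p[0])
    features: List[List[float]] = []
    prev_ts = packets[0][0] if packets else 0.0
    for ts, signed_len in packets[:seq_len]:  # truncate long traces
        spatial = max(-1.0, min(1.0, signed_len / mtu))
        temporal = ts - prev_ts
        features.append([spatial, temporal])
        prev_ts = ts
    while len(features) < seq_len:  # zero-pad short traces
        features.append([0.0, 0.0])
    return features


def split_bursts(features: List[List[float]]) -> List[List[List[float]]]:
    """Group consecutive rows with the same direction into bursts."""
    bursts: List[List[List[float]]] = []
    for row in features:
        if row[0] == 0.0:
            continue  # skip padding rows
        same_direction = bool(bursts) and (bursts[-1][-1][0] > 0) == (row[0] > 0)
        if same_direction:
            bursts[-1].append(row)
        else:
            bursts.append([row])
    return bursts


# Hypothetical four-packet trace used only to exercise the functions.
trace = [(0.00, 600), (0.05, -1500), (0.07, -1500), (0.20, 300)]
sequence = to_feature_sequence(trace, seq_len=6)
bursts = split_bursts(sequence)
```

A fixed-length sequence of this kind is the sort of input a CNN-LSTM model could consume, with convolutional layers reading local size and direction patterns and the LSTM reading their order over time; the feature layout actually used in the paper may differ.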

CLC number: