Dual-Branch Malicious Code Homology Analysis Model Based on Feature Fusion

Journal of Information Security Research, 2025, Vol. 11, Issue (7): 594-.

Liu Fengchun1,2,3,5,6, Zhang Zhifeng1, Xue Tao1,3,5, Yang Guanghui1,3,4,6, and Wei Qun1

1(College of Science, North China University of Science and Technology, Tangshan, Hebei 063210)
2(School of Light Industry, North China University of Science and Technology, Tangshan, Hebei 063210)
3(Hebei Engineering Research Center of Iron Ore Optimization and Intelligent Iron Preprocess (North China University of Science and Technology), Tangshan, Hebei 063210)
4(Hebei Key Laboratory of Data Science and Application (North China University of Science and Technology), Tangshan, Hebei 063210)
5(The Key Laboratory of Engineering Computing in Tangshan City (North China University of Science and Technology), Tangshan, Hebei 063210)
6(Tangshan Intelligent Industry and Image Processing Technology Innovation Center (North China University of Science and Technology), Tangshan, Hebei 063210)

Online: 2025-07-29 Published: 2025-07-29
Corresponding author: Liu Fengchun, PhD, professor. Main research interests: network security, data mining, and machine learning. lnobliu@ncst.edu.cn
About the authors: Liu Fengchun, PhD, professor. Main research interests: network security, data mining, and machine learning. lnobliu@ncst.edu.cn Zhang Zhifeng, master's. Main research interest: malicious code classification. Zhangzf@stu.ncst.edu.cn Xue Tao, master's, associate professor. Main research interests: deep learning, multimedia technology, and machine learning. xuetao@ncst.edu.cn Yang Guanghui, PhD, lecturer. Main research interests: network and cyberspace security, data mining, and deep learning. yangguanghui@ncst.edu.cn Wei Qun, PhD, associate professor. Main research interests: databases and intelligent information retrieval. ts_weiqun@163.com

Abstract

Abstract: In malicious code homology analysis, techniques such as encryption, obfuscation, and packing produce large numbers of malicious code variants, which leaves deep learning models with insufficient ability to extract malicious code features. To address this problem, a dual-branch malicious code homology analysis model built from multibranch convolution and Transformer, MCATNet (multibranch convolution and TransformerNet), is proposed. First, the MCATNet dual-branch network is constructed. One branch is a CNN branch built from multibranch convolution (MBC) modules, with the CBAM hybrid attention mechanism introduced so that the network attends more closely to core features while still capturing local features. The other branch is a Transformer module with ViT as its backbone, which extracts global feature information from malicious code images; a downsampling module is proposed for this branch to preserve global features finely while aligning the Transformer and CNN feature maps in spatial scale. Second, a cascading (concatenation) strategy fuses the local features of the CNN branch with the global features of the Transformer branch, addressing the problem that a network attends to only a single type of feature. Finally, a Softmax classifier performs homology analysis over malicious code families. Experimental results show that the dual-branch model based on feature fusion reaches a classification accuracy of 99.24%, which is 0.11% and 0.65% higher than the single-branch CNN and single-branch Transformer models, respectively.

Key words: dual-branch, feature fusion, multibranch convolution, attention mechanism, downsampling

CLC number: TP309

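Liu Fengchun, Zhang Zhifeng, Xue Tao, Yang Guanghui, Wei Qun. Dual-Branch Malicious Code Homology Analysis Model Based on Feature Fusion[J]. Journal of Information Security Research, 2025, 11(7): 594-.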

