A Static Tagging Method of Malicious Code Family Based on Multi-Feature

Journal of Information Security Research, 2018, Vol. 4, Issue 4: 322-328.

Received: 2018-04-18 Online: 2018-04-15 Published: 2018-04-20

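Liu Liang¹, Liu Luping², He Shuai³, Liu Jiayong⁴

1. College of Cyberspace, Sichuan University
2. College of Electronics and Information Engineering, Sichuan University
3. Sichuan University
4. College of Cybersecurity, Sichuan University

Corresponding author: Liu Liang
About the authors: Liu Liang, M.Eng., engineer; research interests: information system security and malicious code detection. Liu Luping, M.S., Ph.D. candidate; research interests: software and system security, binary program analysis, and vulnerability discovery. He Shuai, master's student; research interests: intrusion detection and driver development. Liu Jiayong, Ph.D., professor; research interests: network information security, network information processing, and big data analysis.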

Abstract

This paper presents a static method for tagging malicious code families based on multiple features. To overcome the limitation of existing approaches, which extract features from a single source, the method uses malicious code visualization to render each sample as an image and then extracts features from multiple sources and levels: the image source and the text source, and the byte code layer and the operation code (opcode) layer. To make better use of the features extracted at these levels, the paper designs a three-layer multi-classifier joint framework for feature learning, consisting of a feature combination layer, a classification layer, and a union layer. The learned model can then tag malicious code automatically. To validate the method, we ran a family tagging experiment on the nine malicious code families in the Microsoft data set. The results show that accuracy, precision, recall, and F1-score all exceed 90% for every family except Simda. The experiments demonstrate the validity and reliability of the method.

Key words: malicious code family, malicious code image, machine learning, multi-feature, multi-classifier joint framework
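As a rough illustration of the pipeline the abstract outlines, the Python sketch below shows three of its ingredients in miniature: rendering a sample's bytes as a grayscale image, counting opcode n-grams as a text-level feature, and merging the outputs of several classifiers by majority vote. The abstract does not give the image width, the exact features, or the rule used in the union layer, so the function names (`bytes_to_image`, `opcode_ngrams`, `majority_vote`), the default width of 16, the choice of 2-grams, and the voting rule are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from typing import List, Sequence


def bytes_to_image(data: bytes, width: int = 16) -> List[List[int]]:
    """Lay the raw bytes out row by row as grayscale pixels (0-255).

    The final row is zero-padded so that every row has `width` pixels.
    The width of 16 is an arbitrary illustrative default.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padded = data + bytes(-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def opcode_ngrams(opcodes: Sequence[str], n: int = 2) -> Counter:
    """Count contiguous opcode n-grams taken from a disassembly listing."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


def majority_vote(predictions: Sequence[str]) -> str:
    """Hypothetical stand-in for the union layer: pick the most common label."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return Counter(predictions).most_common(1)[0][0]
```

For example, `bytes_to_image(b"MZ\x90\x00\x03", width=4)` yields two rows of four pixels, the second one zero-padded, and `majority_vote(["Ramnit", "Simda", "Ramnit"])` returns `"Ramnit"` (the family labels here are purely illustrative inputs). In a fuller system, each classifier in the classification layer would presumably be trained on one feature combination, with the vote taken over their per-sample predictions.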

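Liu Liang, Liu Luping, He Shuai, Liu Jiayong. A Static Tagging Method of Malicious Code Family Based on Multi-Feature [J]. Journal of Information Security Research, 2018, 4(4): 322-328.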
