信息安全研究 (Journal of Information Security Research), 2024, Vol. 10, Issue (5): 440-.

• Academic Paper •

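Source Code Vulnerability Detection Based on Few-Shot Learning

Chen Hongsen1, Fang Yong1, Hao Chengling1, Yang Yuntao1, Zhang Qi2

  1 (School of Cyber Science and Engineering, Sichuan University, Chengdu 610207)
  2 (Chengdu Internet Information Center, Chengdu 610041)

  • Published: 2024-05-20 Online: 2024-05-20
  • Corresponding author: Zhang Qi, M.S. Main research interests: network data security policy and data security management. sczhangxqi@126.com
  • About the authors:
    Chen Hongsen, M.S. Main research interest: vulnerability detection. modengxian@protonmail.com
    Fang Yong, Ph.D., professor, doctoral supervisor. Main research interest: network attack and defense technology. yfang@scu.edu.cn
    Hao Chengling, M.S. Main research interests: intrusion detection and graph neural networks. 1612170458@qq.com
    Yang Yuntao, M.S. Main research interests: graph neural networks and APT traceability detection. ttmonica111@163.com
    Zhang Qi, M.S. Main research interests: network data security policy and data security management. sczhangxqi@126.com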

Source Code Vulnerability Detection Based on Fewshot Learning#br#
#br#

Chen Hongsen1, Fang Yong1, Hao Chengling1, Yang Yuntao1, and Zhang Qi2#br#

#br#
  

  1. 1(School of Cyber Science and Engineering, Sichuan University, Chengdu 610207)
    2(Chengdu Internet Information Center, Chengdu 610041)

  • Online:2024-05-20 Published:2024-05-20


Abstract: Source code vulnerability detection is an important means of discovering and localizing threats to critical systems. Applying deep learning to source code vulnerability detection has become a research hotspot. However, because source code vulnerability samples are scarce, existing detection methods perform poorly in few-shot scenarios. This paper proposes a source code vulnerability detection method based on few-shot learning, which aims to provide a solution for detection scenarios with a limited number of samples. The method consists of four key components: source code slicing and encoding, meta-learning-based dataset processing, vulnerability class vector generation based on a dynamic routing algorithm, and vulnerability class vector matching based on a neural tensor network. The method is compared with a convolutional neural network, a prototypical network, and a relation network. The experimental results show that it outperforms these methods in accuracy and can effectively cope with the sparsity of source code vulnerability samples. In the 2-way 5-shot and 2-way 10-shot settings, the method achieves accuracies of 93.92% and 95.08%, respectively.

Key words: fewshot learning, vulnerability detection, induction network, code slicing, metalearning
