[1] Ji Ziwei, Lee N, Frieske R, et al. Survey of hallucination in natural language generation[J]. ACM Computing Surveys, 2023, 55(12): 1-38
[2] Jabbour S, Fouhey D, Shepard S, et al. Measuring the impact of AI in the diagnosis of hospitalized patients: A randomized clinical vignette survey study[J]. JAMA, 2023, 330(23): 2275-2284
[3] Ant Group. Introduction to OpenSPG[EB/OL]. [2024-11-11]. https://spg.openkg.cn
[4] Ant Group. Native security paradigm framework v1.0[EB/OL]. [2024-11-11]. https://www.163.com/dy/article/IE7DV487051180F7.html
[5] Sander S. Injection attacks[EB/OL]. [2024-11-11]. https://learnprompting.org/docs/prompt_hacking/injection
[6] Wei J, Wang Xuezhi, Schuurmans D, et al. Chain-of-thought prompting elicits reasoning in large language models[C] //Proc of Advances in Neural Information Processing Systems 35. Cambridge, MA: MIT Press, 2022: 24824-24837
[7] Wang Lei, Xu Wanyu, Lan Yihuai, et al. Plan-and-solve prompting: Improving zero-shot chain-of-thought reasoning by large language models[C] //Proc of the 61st Annual Meeting of the Association for Computational Linguistics. Cambridge, MA: MIT Press, 2023: 2609-2634
[8] Nakano R, Hilton J, Balaji S, et al. WebGPT: Browser-assisted question-answering with human feedback[J]. arXiv preprint, arXiv:2112.09332, 2021
[9] Fan A, Jernite Y, Perez E, et al. ELI5: Long form question answering[C] //Proc of the 57th Annual Meeting of the Association for Computational Linguistics. Cambridge, MA: MIT Press, 2019: 3558-3567
[10] Yao Shunyu, Zhao J, Yu Dian, et al. ReAct: Synergizing reasoning and acting in language models[J]. arXiv preprint, arXiv:2210.03629, 2023
[11] Kim S, Moon S, Tabrizi R, et al. An LLM compiler for parallel function calling[J]. arXiv preprint, arXiv:2312.04511, 2023
[12] Shinn N, Cassano F, Gopinath A, et al. Reflexion: Language agents with verbal reinforcement learning[C] //Proc of Advances in Neural Information Processing Systems 36. New York: ACM, 2023: 8634-8652
[13] Chu Zhixuan, Wang Yan, Zhu Feng, et al. Professional agents: Evolving large language models into autonomous experts with human-level competencies[J]. arXiv preprint, arXiv:2402.03628, 2024
[14] Chu Zhixuan, Hu Mengxuan, Cui Qing, et al. Task-driven causal feature distillation: Towards trustworthy risk prediction[C] //Proc of the 38th AAAI Conf on Artificial Intelligence. Menlo Park, CA: AAAI, 2024: 11642-11650
[15] Feng Yu, Zhou Ben, Lin Weidong, et al. BIRD: A trustworthy Bayesian inference framework for large language models[J]. arXiv preprint, arXiv:2404.12494, 2024
[16] Deng Gelei, Liu Yi, Li Yuekang, et al. MASTERKEY: Automated jailbreaking of large language model chatbots[J]. arXiv preprint, arXiv:2307.08715, 2023
[17] Wen Yuxin, Jain N, Kirchenbauer J, et al. Hard prompts made easy: Gradient-based discrete optimization for prompt tuning and discovery[C] //Proc of Advances in Neural Information Processing Systems. New York: ACM, 2023: 51008-51025
[18] Sandoval G, Pearce H, Nys T, et al. Lost at C: A user study on the security implications of large language model code assistants[C] //Proc of the 32nd USENIX Security Symposium (USENIX Security 23). Berkeley, CA: USENIX Association, 2023: 2205-2222