Journal of Information Security Research ›› 2024, Vol. 10 ›› Issue (7): 616-.


Federated Foundation Model Finetuning Based on Differential Privacy#br#
#br#

Zeng Hui1,2, Xiong Shiyu1,2, Di Yongzheng1,2, and Shi Hongzhou1   

  1. Beijing Key Laboratory of Mobile Computing and Pervasive Device (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190
  2. College of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190

  • Online: 2024-07-14  Published: 2024-07-18

  • Corresponding author: Shi Hongzhou, PhD, senior engineer. Main research interests: IoT security, big data technology, and information security. hzshi@ict.ac.cn
  • About the authors: Zeng Hui, master's candidate. Main research interests: federated learning and parameter fine-tuning of large models. hunk.tsang@gmail.com
    Xiong Shiyu, master's candidate. Main research interests: object detection and federated learning. 18325056070@163.com
    Di Yongzheng, master's candidate. Main research interests: federated learning and parameter fine-tuning of large models. diyongzheng23@mails.ucas.ac.cn
    Shi Hongzhou, PhD, senior engineer. Main research interests: IoT security, big data technology, and information security. hzshi@ict.ac.cn

Abstract: As the availability of private data decreases, fine-tuning large models with federated learning has become a research area of great interest. Although federated learning itself offers a degree of privacy protection, privacy threats such as gradient leakage attacks and embedding inversion attacks against large models still endanger participants' sensitive information. Against the backdrop of growing privacy awareness, these potential risks have significantly hindered the adoption of federated large-model fine-tuning in practical applications. This paper therefore proposes a federated large-model embedding differential privacy control algorithm: through a dual global-and-local privacy control mechanism, it adds controllable random noise to the embedding layer of the large model during parameter-efficient fine-tuning, strengthening the privacy protection of federated large-model fine-tuning. In addition, experiments across different federated settings demonstrate the algorithm's privacy protection effect in large-model fine-tuning, and performance comparisons between centralized and federated training verify its feasibility.
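To illustrate the core idea only (this is not the authors' implementation, and all names here are hypothetical): a common way to add controllable noise to embedding outputs is the Gaussian mechanism, where each embedding vector is first clipped to bound its sensitivity and then perturbed with noise whose scale is governed by a noise multiplier, which in the paper's setting would be set by the global/local privacy control mechanism.

```python
import numpy as np

def dp_noise_embedding(embeddings, clip_norm=1.0, noise_multiplier=0.5, rng=None):
    """Clip each embedding vector to clip_norm in L2, then add Gaussian noise.

    The noise_multiplier stands in for the value a global/local privacy
    control mechanism would choose; this function is an illustrative sketch.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    # Per-vector L2 clipping bounds each row's sensitivity to clip_norm.
    norms = np.linalg.norm(embeddings, axis=-1, keepdims=True)
    clipped = embeddings * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Gaussian mechanism: noise standard deviation scales with the clip bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=embeddings.shape)
    return clipped + noise

# Example: 4 token embeddings of dimension 8
emb = np.random.default_rng(1).normal(size=(4, 8))
noisy = dp_noise_embedding(emb, clip_norm=1.0, noise_multiplier=0.5)
```

A larger noise multiplier gives stronger privacy at the cost of utility, which is the trade-off the dual control mechanism in the paper is designed to manage.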

Key words: federated learning, foundation model, parameter-efficient fine-tuning, differential privacy, data privacy leakage

