Journal of Information Security Research ›› 2025, Vol. 11 ›› Issue (2): 181-.


Exploring Effective Factors Leading to Data Leakage in Pretrained Language Models

Qian Hanwei1,2, Peng Jitian1, Yuan Ming1, Gao Guangliang1, Liu Xiaoqian1, Wang Qun1, and Zhu Jingyu1   

  1 (Department of Computer Information and Cyber Security, Jiangsu Police Institute, Nanjing 210013)
  2 (The State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210093)
  • Online: 2025-02-20    Published: 2025-02-21

  • Corresponding author: Qian Hanwei, PhD candidate, senior engineer. Research interests: deep learning and information security. qianhanwei@jspi.cn
  • About the authors: Qian Hanwei, PhD candidate, senior engineer; research interests: deep learning and information security; qianhanwei@jspi.cn. Peng Jitian; research interest: information security; 308631202@qq.com. Yuan Ming, PhD candidate, lecturer; research interests: deep learning and natural language processing; yuanming@jspi.cn. Gao Guangliang, PhD, lecturer; research interests: complex networks and natural language processing; gaoguangliang@jspi.cn. Liu Xiaoqian, PhD, lecturer; research interests: data mining and privacy protection; lxqlara@163.com. Wang Qun, PhD, professor; research interests: information security and blockchain; wangqun@jspi.cn. Zhu Jingyu; research interest: information security; 2830547419@qq.com.

Abstract: Pretrained language models are widely used to learn general language representations from massive training corpora. Downstream natural language processing tasks improve significantly when built on a pretrained language model, but overfitting in deep neural networks means a pretrained model may leak private information from its training corpus. This paper selects widely used pretrained language models, including T5, GPT-2, and OPT, as research objects and uses model inversion attacks to explore the factors that influence data leakage from these models. In the experiments, each pretrained language model generates a large number of samples, and metrics such as perplexity are used to select the candidates most likely to reveal training data for verification. The results show that models such as T5 all exhibit data leakage to varying degrees; that, within the same model family, the larger the model, the greater the likelihood of data leakage; and that adding a specific prefix makes it easier to extract leaked data. The paper concludes with an outlook on the data leakage problem and its defenses.
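The generate-then-rank procedure described in the abstract can be sketched in a few lines. The following is a minimal illustration, assuming the Hugging Face transformers library; the model name, prefix, sample count, and decoding parameters are illustrative assumptions, not the authors' exact experimental setup.

```python
# Minimal sketch of the sampling-and-ranking step: generate many samples from a
# pretrained causal LM, then keep the lowest-perplexity candidates, which are
# the most likely to reproduce memorized training text.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model_name = "gpt2"  # assumption: any causal LM under study (e.g., GPT-2, OPT)
tokenizer = GPT2TokenizerFast.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
model.eval()

prefix = "My email address is"  # assumption: a hand-picked prefix to steer generation

def perplexity(text: str) -> float:
    """Perplexity of `text` under the model: exp of the mean token NLL."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy over tokens
    return torch.exp(loss).item()

# Sample many continuations of the prefix.
inputs = tokenizer(prefix, return_tensors="pt").input_ids
outputs = model.generate(
    inputs,
    do_sample=True,
    top_k=40,
    max_new_tokens=64,
    num_return_sequences=20,
    pad_token_id=tokenizer.eos_token_id,
)
samples = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

# Rank by perplexity; the lowest-perplexity samples are flagged for verification.
candidates = sorted(samples, key=perplexity)[:5]
for text in candidates:
    print(text)
```

In the setting the abstract describes, low perplexity only flags the candidates worth checking; whether a flagged sample actually comes from the training corpus must then be verified against the corpus itself.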

Key words: natural language processing, pretrained language models, private data leakage, model inversion attack, model architecture



CLC Number: