[1] Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18(7): 1527-1554
[2] Sanders J, Kandrot E. GPU高性能编程CUDA实战[M]. 聂雪军, 译. 北京: 机械工业出版社, 2011
[3] 沈传年, 徐彦婷, 陈滢霞. 隐私计算关键技术及研究展望[J]. 信息安全研究, 2023, 9(8): 714-721
[4] Yao A C. Protocols for secure computations[C] //Proc of the 23rd Annual Symp on Foundations of Computer Science (SFCS 1982). Piscataway, NJ: IEEE, 1982: 160-164
[5] Beaver D. Efficient multiparty protocols using circuit randomization[C] //Advances in Cryptology—CRYPTO'91: Proceedings 11. Berlin: Springer, 1992: 420-432
[6] Zhang Feng, Chen Zheng, Zhang Chenyang, et al. An efficient parallel secure machine learning framework on GPUs[J]. IEEE Trans on Parallel and Distributed Systems, 2021, 32(9): 2262-2276
[7] Rivest R L, Adleman L, Dertouzos M L. On data banks and privacy homomorphisms[J]. Foundations of Secure Computation, 1978, 4(11): 169-180
[8] Fan J, Vercauteren F. Somewhat practical fully homomorphic encryption[J]. Cryptology ePrint Archive, 2012: 144
[9] Brakerski Z, Gentry C, Vaikuntanathan V. (Leveled) fully homomorphic encryption without bootstrapping[J]. ACM Trans on Computation Theory, 2014, 6(3): 1-36
[10] Cheon J H, Kim A, Kim M, et al. Homomorphic encryption for arithmetic of approximate numbers[C] //Advances in Cryptology—ASIACRYPT 2017. Berlin: Springer, 2017: 409-437
[11] 边松, 毛苒, 朱永清, 等. 全同态加密软硬件加速研究进展[J]. 电子与信息学报, 2024, 46(5): 1-16
[12] 卢凯, 赖志权, 李笙维, 等. 并行智能训练技术: 挑战与发展[J]. 中国科学: 信息科学, 2023, 53(8): 1441-1468
[13] 杜海舟, 黄晟. 分布式机器学习中的通信机制研究综述[J]. 上海电力大学学报, 2021, 37(5): 496-500, 511
[14] Chen Tianqi, Xu Bing, Zhang Chiyuan, et al. Training deep nets with sublinear memory cost[J]. arXiv preprint, arXiv: 1604.06174, 2016
[15] 马玮良, 彭轩, 熊倩, 等. 深度学习中的内存管理问题研究综述[J]. 大数据, 2020, 6(4): 56-68
[16] Korthikanti V A, Casper J, Lym S, et al. Reducing activation recomputation in large transformer models[J]. arXiv preprint, arXiv: 2205.05198, 2022
[17] Guo Jinrong, Liu Wantao, Wang Wang, et al. AccUDNN: A GPU memory efficient accelerator for training ultra-deep neural networks[C] //Proc of the 37th IEEE Int Conf on Computer Design (ICCD). Piscataway, NJ: IEEE, 2019: 65-72
[18] Shi Shaohuai, Wang Qiang, Chu Xiaowen, et al. Communication-efficient distributed deep learning with merged gradient sparsification on GPUs[C] //Proc of IEEE Conf on Computer Communications (IEEE INFOCOM 2020). Piscataway, NJ: IEEE, 2020: 406-415
[19] Tan S, Knott B, Tian Y, et al. CryptGPU: Fast privacy-preserving machine learning on the GPU[C] //Proc of 2021 IEEE Symp on Security and Privacy (SP). Piscataway, NJ: IEEE, 2021: 1021-1038
[20] Watson J L, Wagh S, Popa R A. Piranha: A GPU platform for secure computation[C] //Proc of the 31st USENIX Security Symp (USENIX Security 22). Berkeley, CA: USENIX Association, 2022: 827-844
[21] Jiang Wuxuan, Song Xiangjun, Hong Shenbai, et al. Spin: An efficient secure computation framework with GPU acceleration[J]. arXiv preprint, arXiv: 2402.02320, 2024
[22] Thakkar V, Ramani P, Cecka C, et al. CUTLASS[CP/OL]. [2024-05-25]. https://github.com/NVIDIA/cutlass
[23] Shi R, Potluri S, Hamidouche K, et al. Designing efficient small message transfer mechanism for inter-node MPI communication on InfiniBand GPU clusters[C] //Proc of the 21st Int Conf on High Performance Computing (HiPC). Piscataway, NJ: IEEE, 2014: 1-10
[24] Bell N, Garland M. Efficient sparse matrix-vector multiplication on CUDA, NVR-2008-004[R]. Nvidia Technical Report, Santa Clara: Nvidia Corporation, 2008
[25] Fan Shengyu, Wang Zhiwei, Xu Weizhi, et al. TensorFHE: Achieving practical computation on encrypted data using GPGPU[C] //Proc of 2023 IEEE Int Symp on High-Performance Computer Architecture (HPCA). Piscataway, NJ: IEEE, 2023: 922-934
[26] Wang Zhiwei, Li Peinan, Hou Rui, et al. HE-Booster: An efficient polynomial arithmetic acceleration on GPUs for fully homomorphic encryption[J]. IEEE Trans on Parallel and Distributed Systems, 2023, 34(4): 1067-1081
[27] Wang Guibin, Lin Yisong, Yi Wei. Kernel fusion: An effective method for better power efficiency on multithreaded GPU[C] //Proc of 2010 IEEE/ACM Int Conf on Green Computing and Communications & Int Conf on Cyber, Physical and Social Computing. Piscataway, NJ: IEEE, 2010: 344-350
[28] Al Badawi A, Jin C, Lin J, et al. Towards the AlexNet moment for homomorphic encryption: HCNN, the first homomorphic CNN on encrypted data with GPUs[J]. IEEE Trans on Emerging Topics in Computing, 2020, 9(3): 1330-1343
[29] Deng Li. The MNIST database of handwritten digit images for machine learning research [Best of the Web][J]. IEEE Signal Processing Magazine, 2012, 29(6): 141-142
[30] Krizhevsky A, Hinton G. Learning multiple layers of features from tiny images[R]. Toronto: University of Toronto, 2009
[31] Al Badawi A, Veeravalli B, Lin J, et al. Multi-GPU design and performance evaluation of homomorphic encryption on GPU clusters[J]. IEEE Trans on Parallel and Distributed Systems, 2020, 32(2): 379-391