[1] Liu Quan, Zhai Jianwei, Zhang Zongzhang, et al. A survey on deep reinforcement learning [J]. Chinese Journal of Computers, 2018, 41(1): 1-27. (in Chinese)
[2] Zhu Kai, Zhang Tao. Deep reinforcement learning based mobile robot navigation: a review [J]. Tsinghua Science and Technology, 2021, 26(5): 674-691.
[3] Xu Jincai, Ren Min, Li Qi, et al. A survey on the security of image adversarial examples [J]. Journal of Information Security Research, 2021, 7(4): 294-309. (in Chinese)
[4] Chen Yuefeng, Mao Xiaofeng, Li Yuhong, et al. AI security: a survey and applications of adversarial example techniques [J]. Journal of Information Security Research, 2019, 5(11): 1000-1007. (in Chinese)
[5] Zhang Qiang, Yang Jibin, Zhang Xiongwei, et al. Audio object classification adversarial attacks based on generative adversarial networks [J]. Journal of Nanjing University (Natural Science), 2021, 57(5): 793-800. (in Chinese)
[6] Huang S, Papernot N, Goodfellow I, et al. Adversarial attacks on neural network policies [C] //Proc of the 5th Int Conf on Learning Representations. La Jolla, CA: ICLR, 2017.
[7] Schulman J, Wolski F, Dhariwal P, et al. Proximal policy optimization algorithms [J]. arXiv preprint arXiv:1707.06347, 2017.
[8] Papernot N, McDaniel P, Jha S, et al. The limitations of deep learning in adversarial settings [C] //Proc of the 2016 IEEE European Symposium on Security and Privacy. Piscataway, NJ: IEEE, 2016.
[9] Rone W, Ben-Tzvi P. Mapping, localization and motion planning in mobile multi-robotic systems [J]. Robotica, 2013, 31(1): 1-23.
[10] Zhao Xingyu, Ding Shifei. Research overview of deep reinforcement learning [J]. Computer Science, 2018, 45(7): 1-6. (in Chinese)
[11] Tai Lei, Paolo G, Liu Ming. Virtual-to-real deep reinforcement learning: continuous control of mobile robots for mapless navigation [C] //Proc of the IEEE/RSJ Int Conf on Intelligent Robots and Systems. Piscataway, NJ: IEEE, 2017: 31-36.
[12] Szegedy C, Zaremba W, Sutskever I, et al. Intriguing properties of neural networks [C] //Proc of the 2nd Int Conf on Learning Representations. La Jolla, CA: ICLR, 2014.
[13] Behzadan V, Munir A. Vulnerability of deep reinforcement learning to policy induction attacks [G] //LNAI 10358: Proc of the 13th Int Conf on Machine Learning and Data Mining in Pattern Recognition. Berlin: Springer, 2017: 262-275.
[14] Lin Yenchen, Hong Zhangwei, Liao Yuanhong, et al. Tactics of adversarial attack on deep reinforcement learning agents [C] //Proc of the 26th Int Joint Conf on Artificial Intelligence. San Francisco, CA: Morgan Kaufmann, 2017: 3756-3762.
[15] Kos J, Song D. Delving into adversarial attacks on deep policies [C] //Proc of the 5th Int Conf on Learning Representations. La Jolla, CA: ICLR, 2017.
[16] Hussenot L, Geist M, Pietquin O. CopyCAT: taking control of neural policies with constant attacks [C] //Proc of the 19th Int Conf on Autonomous Agents and Multiagent Systems. Richland, SC: IFAAMAS, 2020: 548-556.
[17] Chen Tong, Niu Wenjia, Xiang Yingxiao, et al. Gradient band-based adversarial training for generalized attack immunity of A3C path finding [J]. arXiv preprint arXiv:1807.06752, 2018.
[18] Bai Xiaoxuan, Niu Wenjia, Liu Jiqiang, et al. Adversarial examples construction towards white-box Q-table variation in DQN pathfinding training [C] //Proc of the 3rd IEEE Int Conf on Data Science in Cyberspace (DSC). Piscataway, NJ: IEEE, 2018: 781-787.
[19] Qian Yaguan, Zhang Ximin, Wang Bin, et al. Adversarial training defense based on second-order adversarial examples [J]. Journal of Electronics & Information Technology, 2021, 43(11): 3367-3373. (in Chinese)