Journal of Information Security Research, 2026, Vol. 12, Issue (1): 68-.

Copyright Open Licensing Rules and Their Implementation Paths in Data Training

Li Qian and Shen Lisu   

  1. (Law School, Kunming University of Science and Technology, Kunming 650500)
  • Online:2026-01-10 Published:2026-01-10

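  • Corresponding author: Li Qian (李倩), PhD, associate professor. Main research interests: civil and commercial law, and science and technology law. lqxnzfdx@163.com
  • About the authors: Li Qian (李倩), PhD, associate professor. Main research interests: civil and commercial law, and science and technology law. lqxnzfdx@163.com. Shen Lisu (沈立苏), master's student. Main research interests: civil and commercial law, and science and technology law. 3107168369@qq.com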

Abstract: Generative artificial intelligence training relies on massive volumes of copyrighted works, which creates increasingly significant risks of copyright infringement. Jurisdictions such as the European Union, the United States, and Japan have responded with regulatory measures, including new rules on text and data mining exceptions. Although allowing the use of copyrighted works for data training has become a general theoretical consensus in China, the specific pathways to compliance remain highly contested. This article argues for establishing a copyright open licensing mechanism for data training that replaces individualized, work-by-work authorization with voluntary public declarations and encourages right holders to participate through fair benefit allocation and transparent regulatory safeguards. The aim is to strike a dynamic balance between technological innovation and copyright protection. Because copyrighted works are protected automatically and exist in vast quantities, the legal effect of publicity of open licensing declarations should be expressly recognized so as to protect the reliance interests of bona fide third parties. In addition, right holders should be permitted to grant collective licenses covering series or sets of their works, to better accommodate the data-intensive uses of the artificial intelligence era.

Key words: data training, text and data mining, copyright, open licensing, legal effect of publicity
