A Compressionrobust Video Watermarking Method Based on Multiscale Convolutional Attention and Dualbranch Adversarial Training#br#


Journal of Information Security Research ›› 2026, Vol. 12 ›› Issue (5): 463-.


Zhu Shunzhe1, Niu Ke1,2, Lu Yihang1, Xu Qianhui1, and Li Jun1,2

1(School of Cryptography Engineering, Engineering University of PAP, Xi’an 710016)
2(Key Laboratory of Information Security of People’s Armed Police (Engineering University of PAP), Xi’an 710016)

Online:2026-05-23 Published:2026-05-23


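Corresponding author: Niu Ke, PhD, professor. Main research interests: information hiding and multimedia security. niuke@163.com

About the authors:
Zhu Shunzhe, master's student. Main research interests: video information hiding and artificial intelligence. 18700909363@163.com
Niu Ke, PhD, professor. Main research interests: information hiding and multimedia security. niuke@163.com
Lu Yihang, master's student. Main research interests: video information hiding and artificial intelligence. luyihang135@163.com
Xu Qianhui, master's student. Main research interests: video information hiding and artificial intelligence. xu2000qianhui@163.com
Li Jun, PhD, lecturer. Main research interests: information hiding. lijun9250lj@163.com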

Abstract

Abstract: Current deep learning-based video watermarking methods commonly rely on single-scale feature extraction, use adversarial training mechanisms of limited function, and lack robustness against compression. To address these limitations, this paper proposes MSCAGAN (multiscale convolutional attention generative adversarial network), a robust video watermarking model that integrates a multiscale convolutional attention mechanism with a dual-branch adversarial training framework. A lightweight multiscale attention module extracts key features from video frames at scales ranging from local to global. Combined with depthwise separable convolution, which reduces computational complexity, the module precisely localizes the watermark embedding regions and controls the embedding strength, thereby improving invisibility. The dual-branch adversarial training structure introduces a learnable adversary network that simulates real-world attacks, strengthening the model's robustness against common attacks such as compression, cropping, and scaling. Experimental results show that the watermarked videos generated by MSCAGAN achieve an average PSNR of 44.61 dB and an SSIM of 0.964, significantly outperforming existing methods. Under H.264 compression, the average decoding accuracy reaches 94.01%. The model also remains robust under severe cropping and scaling attacks. In summary, MSCAGAN provides an efficient and reliable solution for multimedia content copyright protection. Future work could extend it to emerging coding standards such as H.265, further improving its robustness in complex application scenarios.

Key words: video watermarking, compression resistance, robustness, multiscale convolutional attention, dual-branch adversarial training
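
The abstract reports invisibility as PSNR and robustness as decoding accuracy. The following is a minimal sketch of how these two figures are conventionally computed, not the authors' evaluation code. It assumes 8-bit pixel values (peak value 255) and assumes that "decoding accuracy" means the fraction of watermark bits recovered correctly; the function names and the toy data are hypothetical and do not come from the paper. SSIM is omitted because it requires windowed local statistics.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel positions.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def bit_accuracy(embedded_bits, decoded_bits):
    """Fraction of watermark bits recovered correctly (assumed meaning of decoding accuracy)."""
    if len(embedded_bits) != len(decoded_bits) or not embedded_bits:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(embedded_bits, decoded_bits) if a == b)
    return matches / len(embedded_bits)


# Toy data for illustration only (hypothetical, not taken from the paper).
cover_pixels = [52, 55, 61, 66, 70, 61, 64, 73]
marked_pixels = [53, 55, 60, 66, 71, 61, 64, 72]
message = [1, 0, 1, 1, 0, 0, 1, 0]
decoded = [1, 0, 1, 0, 0, 0, 1, 0]

print(f"PSNR: {psnr(cover_pixels, marked_pixels):.2f} dB")
print(f"Bit accuracy: {bit_accuracy(message, decoded):.3f}")
```

For video, a figure such as the reported average PSNR would typically be obtained by computing the metric per frame and averaging over frames and test clips; whether the paper averages in exactly this way is not stated in the abstract.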


CLC Number: TP309

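Cite this article: Zhu Shunzhe, Niu Ke, Lu Yihang, Xu Qianhui, Li Jun. A Compression-robust Video Watermarking Method Based on Multiscale Convolutional Attention and Dual-branch Adversarial Training [J]. Journal of Information Security Research, 2026, 12(5): 463-.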



A Compressionrobust Video Watermarking Method Based on Multiscale Convolutional Attention and Dualbranch Adversarial Training#br#
#br#

基于多尺度卷积注意力和双分支对抗训练的抗压缩鲁棒视频水印

PDF

Knowledge

Abstract

Cite this article

share this article

References

Related Articles 5

Recommended Articles

Metrics

A Compressionrobust Video Watermarking Method Based on Multiscale Convolutional Attention and Dualbranch Adversarial Training#br# #br#

基于多尺度卷积注意力和双分支对抗训练的抗压缩鲁棒视频水印

PDF

Knowledge

Abstract

Cite this article

share this article

References

Related Articles 5

Recommended Articles

Metrics

A Compressionrobust Video Watermarking Method Based on Multiscale Convolutional Attention and Dualbranch Adversarial Training#br#
#br#