Journal of Information Security Research ›› 2026, Vol. 12 ›› Issue (5): 420-.

Research on Harmful Website Detection Based on Graph Neural Network and Multifeature Fusion

Qu Miaozhang1, Shi Zhibin1, Chang Zhaoyu1, and Zhang Wei2   

  1. School of Computer Science and Technology, North University of China, Taiyuan 030051
  2. School of Computer Information Engineering, Shanxi Technology and Business University, Taiyuan 030032
  • Online: 2026-05-23  Published: 2026-05-23

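  • Corresponding author: Shi Zhibin, PhD, associate professor. Main research interest: network security. 1637350520@qq.com
  • About the authors: Qu Miaozhang, master's student. Main research interest: network security. qumz8109@163.com; Shi Zhibin, PhD, associate professor. Main research interest: network security. 1637350520@qq.com; Chang Zhaoyu, master's student. Main research interest: network security. czy2514881241@163.com; Zhang Wei, master's degree. Main research interests: network security and data mining. 254315045@qq.com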

Abstract: To address the limitations of current harmful website detection methods in deep semantic mining of text and in multimodal feature co-perception, this study proposes a multifeature fusion detection model based on graph attention networks (GAT) and ConvNeXt. The framework uses GloVe word embeddings to build semantic representations of website text and maps the text into a graph structure based on word co-occurrence relationships. The adaptive attention mechanism in GAT dynamically captures contextual dependencies between non-contiguous words, while ConvNeXt extracts both local details and global contextual features from website images. A cross-attention-based fusion module performs dynamic text-image feature alignment and interactive integration. Experimental results show that the proposed model achieves 99.10% accuracy on a four-category website classification task, significantly improving detection performance. This work offers a useful reference for identifying harmful online content and for cybersecurity governance.
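
The mapping from text to a word co-occurrence graph mentioned in the abstract can be illustrated with a minimal sketch. It assumes a sliding-window definition of co-occurrence and count-valued undirected edges; the function name `build_cooccurrence_graph`, the `window` parameter, and the sample tokens are hypothetical and are not taken from the paper, which may weight or prune edges differently.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Build an undirected word graph from a token list.

    Nodes are the unique words; an edge (u, v) counts how often the two
    words appear within `window` positions of each other.
    NOTE: the window rule and count weights are assumptions for illustration.
    """
    nodes = sorted(set(tokens))
    edges = defaultdict(int)
    for i, word in enumerate(tokens):
        # Pair the current word with each token up to `window` positions ahead.
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            other = tokens[j]
            if other == word:
                continue  # skip self-loops
            u, v = sorted((word, other))  # canonical order for an undirected edge
            edges[(u, v)] += 1
    return nodes, dict(edges)


# Hypothetical sample text, already tokenized.
tokens = "free casino bonus free spins casino".split()
nodes, edges = build_cooccurrence_graph(tokens, window=2)
print(nodes)
print(edges)
```

In a full pipeline, each node would carry its GloVe vector as the initial feature, and the edge set would define the neighborhoods over which a graph attention layer computes its attention weights.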

Key words: harmful website detection, graph neural network, multifeature fusion, GAT, ConvNeXt, cross-attention
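
The cross-attention fusion named in the abstract can be sketched generically as scaled dot-product attention in which text features act as queries and image features act as keys and values. This is an assumption-laden illustration, not the authors' module: it omits learned projection matrices, multiple heads, and any residual or normalization layers, and the names `cross_attention`, `text_feats`, and `image_feats` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: list of text feature vectors (length d each)
    keys, values: lists of image feature vectors
    Returns (fused, weights): one fused vector and one weight row per query.
    """
    scale = math.sqrt(len(queries[0]))
    fused, weights = [], []
    for q in queries:
        # Similarity of this text vector to every image vector.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of image value vectors, column by column.
        fused.append([sum(wj * v[c] for wj, v in zip(w, values))
                      for c in range(len(values[0]))])
    return fused, weights


# Hypothetical toy features: 2 text tokens and 3 image patches, dimension 2.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = cross_attention(text_feats, image_feats, image_feats)
print(weights[0])
print(fused[0])
```

Each text vector ends up as a weighted mixture of the image vectors it is most similar to, which is the sense in which text and image features are aligned before classification.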

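Abstract: To address the shortcomings of current harmful website detection methods in deep semantic mining of text and in collaborative perception of multiple features, a multifeature fusion detection model based on graph attention networks and ConvNeXt, named GATConvNeXt, is proposed. GloVe (global vectors for word representation) word embeddings are used to build semantic representations of website text, and the text is mapped into a graph structure based on word co-occurrence relationships. The adaptive attention mechanism of the graph attention network dynamically captures latent associations between non-contiguous words, ConvNeXt extracts local details and global contextual information from website images, and a cross-attention-based multifeature fusion module performs dynamic alignment and interaction between text and image features. Experimental results show that the model reaches 99.10% accuracy on a four-class website classification task, significantly improving detection accuracy, and that it provides an important reference for harmful online content identification and security governance.

Key words: harmful website detection, graph neural network, multifeature fusion, graph attention network, ConvNeXt, cross-attention mechanism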
