Journal of Information Security Research ›› 2023, Vol. 9 ›› Issue (5): 440-.


Research on Automatic Recognition Technology of Gambling Website

  

  • Online: 2023-05-01  Published: 2023-04-29


Yang Zhe; Chen Yinghu

  1. (Yunnan Branch, National Computer Network and Information Security Management Center, Kunming 650228)

Abstract: Online gambling poses serious information security risks, and effectively discovering and identifying gambling websites is of great significance to maintaining national financial stability. To address the difficulty of discovering gambling websites, this paper proposes an automatic recognition scheme that uses autonomous system (AS) information to obtain the IP network segments owned by a cloud platform, traverses those IP addresses and reverse-resolves them to domain names, and then runs a distributed crawler to capture website screenshots. To address the difficulty of recognizing gambling websites (for example, some consist of nothing more than a single image containing a link to download a gambling app), the scheme uses the dHash algorithm to clean the positive samples and trains a convolutional neural network (CNN) to perform binary classification of websites. Experimental results show that the scheme generalizes well, requires little human involvement, and can, to a certain extent, solve the problems of discovering and recognizing gambling websites.
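
The abstract does not say how dHash is used to clean the positive samples. One plausible reading is near-duplicate removal among screenshots, and the sketch below illustrates that reading only. It assumes each screenshot is already available as a grid of grayscale values. The helper names (`resize_nearest`, `drop_near_duplicates`) and the Hamming-distance threshold of 5 are hypothetical choices for illustration, not details taken from the paper.

```python
from typing import List, Sequence

Gray = Sequence[Sequence[int]]  # rows of grayscale pixel values, 0-255


def resize_nearest(img: Gray, width: int, height: int) -> List[List[int]]:
    """Nearest-neighbour downscale; a real pipeline would use an image library."""
    src_h, src_w = len(img), len(img[0])
    return [
        [img[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def dhash(img: Gray, hash_size: int = 8) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair."""
    # Shrink to (hash_size + 1) x hash_size so each row yields hash_size bits.
    small = resize_nearest(img, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def drop_near_duplicates(images: Sequence[Gray], max_distance: int = 5) -> List[int]:
    """Return indices of images kept after dropping near-duplicates.

    max_distance is an assumed threshold, not a value from the paper.
    """
    kept: List[int] = []
    kept_hashes: List[int] = []
    for i, img in enumerate(images):
        h = dhash(img)
        if all(hamming(h, k) > max_distance for k in kept_hashes):
            kept.append(i)
            kept_hashes.append(h)
    return kept
```

Because dHash encodes only the sign of brightness differences between neighbouring pixels, two screenshots that differ by a uniform brightness shift hash identically. This makes it a cheap filter for repeated landing pages before CNN training.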

Key words: online gambling, convolutional neural network, dHash algorithm, distributed crawler, cloud platform, automatic recognition
