Research on a fast verification method for malicious domain names
Shang Qiuming, Wang Lijun, Deng Guiying, Zhao Tong, Zhang Likun
Technological Research and Development Department, China Internet Network Information Center (CNNIC), Beijing 100190, China
Abstract: Web pages served under malicious domain names, such as online gambling and pornography sites, are often highly similar, because operators register large numbers of domain names and deploy the same website code on all of them to evade domain blocking. Exploiting these characteristics, this paper proposes a fast verification method for malicious domain names based on image similarity clustering and similarity search. In the experiment, 10 000 malicious domain name samples (5 000 pornography and 5 000 online gambling domain names) were sampled and judged manually. The method achieved an overall accuracy of 99.67%, with 99.66% for pornography and 99.68% for online gambling, which greatly improves the efficiency of manual review of malicious domain names.
Key words: domain names; malicious domain names; malicious information monitoring; similarity search; clustering analysis
CLC number: TN91
Document code: A
Citation: Shang Qiuming, Wang Lijun, Deng Guiying, et al. Research on a fast verification method for malicious domain names[J]. Application of Electronic Technique, 2022, 48(10): 72-77.

0 Introduction


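    Existing methods for detecting malicious domain names mostly rely on domain-related information, including registration data, DNS resolution servers, and the geographic attribution of website IP addresses. Combined with blacklists and whitelists of malicious domain names, they use machine-learning prediction models to judge how malicious a domain name is. This approach presupposes that malicious domain names are correlated with one another in some way. Because domain registration is cheap and more than 1 000 top-level domains are available, registrants can use the many domain hosting providers and cloud service providers to break the associations among their malicious domain names and thereby evade such detection algorithms. In addition, the verdicts produced by this approach still require a large amount of manual verification before any handling action can be taken.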
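To make this limitation concrete, the following minimal sketch links domain names that share any attribute value (registrar, name server, or IP address) and lists the resulting groups. It is an illustration only: the domain names, attribute values, and the helper `group_by_shared_attributes` are hypothetical and do not come from the paper, and the detection methods discussed above use machine-learning models rather than this simple linkage.

```python
from collections import defaultdict


def group_by_shared_attributes(records):
    """Group domains that share at least one (attribute, value) pair.

    `records` maps a domain name to a dict of attributes. This is a plain
    union-find linkage, used only to illustrate the correlation assumption.
    """
    parent = {domain: domain for domain in records}

    def find(d):
        # Follow parent links to the group representative, compressing the path.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    first_seen = {}  # (attribute, value) -> first domain seen with that pair
    for domain, attrs in records.items():
        for pair in attrs.items():
            if pair in first_seen:
                # Merge this domain's group with the earlier domain's group.
                parent[find(domain)] = find(first_seen[pair])
            else:
                first_seen[pair] = domain

    groups = defaultdict(list)
    for domain in records:
        groups[find(domain)].append(domain)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical records: the first two domains share a name server,
# while the third uses a different registrar, name server, and IP address.
records = {
    "a.example": {"registrar": "R1", "ns": "ns1.host-x.example", "ip": "192.0.2.10"},
    "b.example": {"registrar": "R2", "ns": "ns1.host-x.example", "ip": "192.0.2.20"},
    "c.example": {"registrar": "R3", "ns": "ns9.host-y.example", "ip": "198.51.100.7"},
}
groups = group_by_shared_attributes(records)
print(groups)
```

In this toy data `c.example` shares nothing with the other two, so linkage on registration and hosting data leaves it isolated even if it serves the same site. This is the kind of evasion that motivates comparing page content instead.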




