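Knowledge base error detection with relation sensitive embedding
Information Technology and Network Security, 2020, Issue 10
Miao Qi, Yang Xinyue
School of Electronic and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125105, China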
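Abstract: Accuracy and quality are especially important for a knowledge base. Although there has been much research on knowledge base incompleteness, few researchers have considered detecting the errors that a knowledge base contains, and traditional methods usually fail to capture the intrinsic correlations among erroneous facts. This paper proposes NSIL, a relation-sensitive embedding method for knowledge bases, which captures the correlations among the relations in a knowledge base in order to detect its errors and thereby improve its accuracy and quality. The method has two stages: correlation processing and error detection. In the correlation processing stage, NSIL's correlation function produces a score for the degree of correlation between relations; in the error detection stage, errors are detected based on these correlation scores, and the missing component is predicted for triples that lack a subject or an object. Finally, the method was extensively validated on "FB15K", a benchmark dataset generated from the knowledge base Freebase, which showed that it achieves high performance in detecting erroneous knowledge in knowledge bases.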
CLC number: TP183
Document code: A
DOI: 10.19358/j.issn.2096-5133.2020.10.005
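Citation: Miao Qi, Yang Xinyue. Knowledge base error detection with relation sensitive embedding[J]. Information Technology and Network Security, 2020, 39(10): 23-27, 37.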
Knowledge base error detection with relation sensitive embedding
Miao Qi, Yang Xinyue
School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105, China
Abstract: Accuracy and quality are very important for a knowledge base. Although there has been much research on the incompleteness of knowledge bases, few researchers have considered detecting the errors in them, and traditional methods usually cannot capture the internal correlation of erroneous facts effectively. This paper proposes NSIL, a relation-sensitive embedding method for knowledge bases, which obtains the correlation among relations in order to find errors in the knowledge base and thus improve its accuracy and quality. The method is divided into two stages: correlation processing and error detection. In the correlation processing stage, the correlation function of NSIL yields the degree of correlation between relations as a score; in the error detection stage, errors are detected based on the correlation scores, and the missing component is predicted for triples lacking a subject or an object. Finally, the method is verified on the benchmark dataset "FB15K", which is generated from Freebase, one of the largest knowledge bases. The results show that the method achieves high performance in knowledge base error detection.
Key words: knowledge base; embedding model; error detection
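0 Introduction
Today, knowledge bases have become an increasingly important and commonly used data source for a wide range of research and applications, such as semantic search, entity linking, question answering systems and natural language processing. To make such huge databases easier to operate on, researchers have proposed a new research direction: knowledge base embedding. The key idea is to embed the components of a KB (Knowledge Base), that is, to map entities and relations into a continuous vector space, which simplifies operations while preserving the original structure of the KB. The entity and relation embeddings can then be applied to various tasks, such as KB completion, relation extraction, entity classification and entity resolution. Although large knowledge bases contain hundreds of millions of facts, this is far from enough in an era of information explosion. Most research focuses on extending knowledge bases with missing edges, and few researchers consider the outdated or incorrect information they contain[1-3]. Many knowledge base expansion studies project facts into a k-dimensional vector space and find the correlations between relations by clustering, which makes efficient and effective processing hard to achieve.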
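As a rough illustration of the embedding idea described above, the sketch below scores triples in a small vector space and uses the scores both to flag suspicious facts and to predict a missing object. It is not the NSIL method, whose correlation function is not given in this excerpt. It assumes a generic translation-style score (the distance between head + relation and tail), and the entity names, the vectors, the threshold and all function names are hypothetical values chosen by hand for the example.

```python
import math

# Toy 2-dimensional embeddings, hand-picked for illustration only.
# They are NOT learned and NOT taken from the paper.
ENTITY_VECS = {
    "Paris": (1.0, 0.0),
    "France": (1.0, 1.0),
    "Berlin": (3.0, 0.0),
    "Germany": (3.0, 1.0),
}
RELATION_VECS = {
    "capital_of": (0.0, 1.0),
}


def triple_distance(head, relation, tail):
    """Euclidean distance ||h + r - t||; smaller means more plausible.

    Assumption: a translation-style score, used here as a stand-in for
    whatever scoring function a real embedding model defines.
    """
    h = ENTITY_VECS[head]
    r = RELATION_VECS[relation]
    t = ENTITY_VECS[tail]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def flag_suspect_triples(triples, threshold=0.5):
    """Return the triples whose distance exceeds a hypothetical threshold."""
    return [t for t in triples if triple_distance(*t) > threshold]


def predict_tail(head, relation):
    """For a triple with a missing object, return the closest candidate entity."""
    return min(ENTITY_VECS, key=lambda e: triple_distance(head, relation, e))


if __name__ == "__main__":
    kb = [
        ("Paris", "capital_of", "France"),
        ("Paris", "capital_of", "Germany"),  # deliberately wrong fact
    ]
    print(flag_suspect_triples(kb))
    print(predict_tail("Berlin", "capital_of"))
```

In practice the vectors would be learned from the knowledge base rather than written by hand, and the threshold would be tuned on held-out data; both are fixed here only to keep the example deterministic.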
For the full text of this paper, please download: http://www.chinaaet.com/resource/share/2000003133