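Abstract:
A question classification algorithm based on a semantic kernel function is proposed; it builds the support vector machine (SVM) kernel function from the syntactic structure of the question. First, the given question is parsed into a syntax tree and represented by the subtrees of that tree. Next, features are extracted at the lexical, syntactic, and semantic levels to form a richer feature space. A kernel function is then constructed on the question's syntax tree. Finally, using latent semantic indexing combined with the lexical, syntactic, and semantic features of the question, the semantic kernel function maps the feature space into a more effective space in which the question is classified. Experimental results on the TREC dataset show that a question feature space enhanced with lexical, syntactic, and semantic features improves classification accuracy.
Keywords: question answering system; supervised learning; support vector machine; question classification; semantic kernel function; feature space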
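DOI: 10.3969/J.ISSN.1000-5137.2018.01.009
CLC number:
Funding: National Natural Science Foundation of China (61572326, 61702333); Shanghai Educational Science Planning Project (C160049); Shanghai Science and Technology Commission Capacity Building Project for Local Universities (17070502800)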
A semantic kernel function based question classification algorithm in question answering system |
Jiang Longquan1, Zhang Bo1, Hu Zhipeng1, Ding Junhong2, Liu Bo2
1. The College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 200234, China; 2. Shanghai Super Computing Technology Co. Ltd., Shanghai 201203, China
Abstract:
A question classification algorithm based on a semantic kernel function is proposed. The algorithm constructs the Support Vector Machine (SVM) kernel function from the syntactic structure of the question. First, the given question is parsed into a syntax tree, and the subtrees of that tree are used to represent the question. Second, features are extracted at three levels of the question (lexical, syntactic, and semantic) to form a richer feature space. Third, the kernel function is constructed on the syntax tree of the question. Finally, using latent semantic indexing together with the lexical, syntactic, and semantic features of the question, the semantic kernel maps the feature space into a more effective space for question classification. Experimental results on the TREC dataset show that enhancing the question feature space with lexical, syntactic, and semantic features improves classification accuracy.
Key words: question answering; supervised learning; SVM; question classification; semantic kernel function; feature space
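
As an illustration of the idea of representing a question by the subtrees of its syntax tree and comparing questions through a kernel, the following is a minimal sketch and not the kernel defined in the paper. It assumes parse trees are given as nested tuples `(label, child, ...)` with words as plain strings, and it counts only identical complete subtrees; the function names `subtrees` and `subtree_kernel` and the three hand-written parses are hypothetical, and the lexical, semantic, and latent semantic indexing components described in the abstract are omitted.

```python
from collections import Counter


def subtrees(tree):
    """Yield every complete subtree of a nested-tuple parse tree.

    A tree is either a string (a word) or a tuple (label, child, ...).
    Words on their own are not counted as subtrees.
    """
    if isinstance(tree, str):
        return
    yield tree
    for child in tree[1:]:
        yield from subtrees(child)


def subtree_kernel(t1, t2):
    """Count pairs of identical complete subtrees shared by t1 and t2.

    This equals the dot product of the two subtree-count vectors,
    so it is symmetric and positive semidefinite.
    """
    c1 = Counter(subtrees(t1))
    c2 = Counter(subtrees(t2))
    return sum(n * c2[s] for s, n in c1.items())


# Hypothetical hand-written parses (assumption: not taken from TREC).
q1 = ("SBARQ", ("WHNP", ("WP", "What")),
      ("SQ", ("VBZ", "is"), ("NP", ("DT", "the"), ("NN", "capital"))))
q2 = ("SBARQ", ("WHNP", ("WP", "What")),
      ("SQ", ("VBZ", "is"), ("NP", ("DT", "the"), ("NN", "population"))))
q3 = ("SBARQ", ("WHNP", ("WP", "Who")),
      ("SQ", ("VBD", "wrote"), ("NP", ("NNP", "Hamlet"))))

print(subtree_kernel(q1, q2))  # 4: WHNP, WP, VBZ, and DT subtrees match
print(subtree_kernel(q1, q3))  # 0: no identical subtrees
```

A Gram matrix filled with such kernel values could then be handed to any SVM implementation that accepts a precomputed kernel; which parser, kernel variant, and SVM settings the authors used is not stated in the abstract.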