Modeling Global Semantics for Question Answering over Knowledge Bases
Peiyun Wu, Yunjie Wu, Linjuan Wu, Xiaowang Zhang, Zhiyong Feng
College of Intelligence and Computing, Tianjin University, China

Abstract
Semantic parsing, as an important approach to question answering over knowledge bases (KBQA), transforms a question into the complete query graph for further generating the correct logical query. Existing semantic parsing approaches mainly focus on relation matching while paying less attention to the underlying internal structure of questions (e.g., the dependencies and relations between all entities in a question) when selecting the query graph. In this paper, we present a relational graph convolutional network (RGCN)-based model, gRGCN, for semantic parsing in KBQA. gRGCN extracts the global semantics of questions and their corresponding query graphs, including structure semantics via RGCN and relational semantics (label representations of relations between entities) via a hierarchical relation attention mechanism. Experiments on benchmarks show that our model outperforms off-the-shelf models.
Introduction

Semantic parsing [BC+13; YC+15] constructs a semantic parsing tree or an equivalent query structure (called a query graph [LL+18]) that represents the semantics of a question. Semantic parsing based approaches effectively transform questions into logical queries, where the reliability of logical querying can ensure the correctness of the answers. The success of semantic parsing lies in representing the semantics of questions in a syntactic way so that it can better capture the intention of users [HZZ18].

Most existing semantic parsing approaches in question answering over knowledge bases (KBQA) focus on capturing the semantics of questions and query graphs. Some "common" relations occurring in both questions and query graphs are taken as core relations for measuring similarity, together with some manual features [YC+15; BD+16]. Another approach, presented in [YY+17], detects individual relations to improve the performance of matching questions with query graphs, where each relation is represented by integrating its word-level and relation-level representations. [LL+18] extends [YY+17] by improving the representation of questions to answer more complex questions. [SG18] enriches the semantics of all entities via all relations during the learning process with a Gated Graph Neural Network (GGNN) [LT+18].

However, the state-of-the-art semantic parsing approaches utilize the relational semantics of query graphs while paying little attention to the structure semantics of questions. Structure semantics is an important part of the whole semantics of a question (e.g., Figure 1), especially for complex questions, where the complexity of a question often lies in its complicated structure. As a result, existing works that only consider relational semantics cannot always handle complex questions well. It is therefore necessary to pay attention to the structure semantics of questions together with their relational semantics in semantic parsing for KBQA.
However, modeling multi-relational directed graphs with edge features remains an open problem. Therefore, it is not trivial to combine the two semantics of a question.
Figure 1: (a) is the correct query graph, which can find the correct answers in the knowledge base; (b) is a wrong query graph with the same relations as (a) but a different structure. (Both panels involve the entity "Taylor Swift", the answer variable ?q, a MIN constraint over PUBLICATION DATE, and the relations INSTANCE OF and PERFORMER.)
In this paper, based on RGCN [SK+18], we propose a novel model, gRGCN, for global semantic parsing in KBQA. In gRGCN, we apply RGCN to extract the structure semantics of query graphs by utilizing its capability of learning better task-specific embeddings of multi-relational graphs. gRGCN focuses on extracting the global semantics of both query graphs and relations. The main contributions are summarized as:

• We propose a global semantic fusion method in which the structure semantics of query graphs extracted by RGCN is integrated with relational semantics learned via enhanced fine-grained relations (word-level and relation-level representations).

• We introduce a hierarchical attention mechanism to represent word-level relations, which can discover latent implicit meanings and remove ambiguity in words.

• We introduce a syntactic-sequence-combined representation to encode questions and a relation-attention-based RGCN layer to strengthen structure semantics.

The Overview of Our Model
In this section, we introduce our gRGCN model for global semantic parsing in KBQA; the overall framework is shown in Figure 2. Our gRGCN model consists of three modules, namely, a question representation encoder, a structure semantics extractor, and a relational semantics extractor.

• Question Representation Encoder transforms each input question into its syntactic dependency tree as input to an RGCN and then encodes the representation of the question in the RGCN, combined with its sequence of words. This module provides the basis for computing the similarity of a question with its candidate query graphs under the fused structure and relational semantics.

• Structure Semantics Extractor embeds query graphs as a whole into a semantic space. This module mainly extracts the semantics of the whole structure of query graphs. In this module, we present a relation-attention-based RGCN layer for extracting global semantics, to be fused with the relational semantics extracted in the next module.

• Relational Semantics Extractor embeds relations into a semantic space. This module mainly extracts the semantics of the relations of query graphs left by the structure semantics extractor. In this module, we present a hierarchical relational attention mechanism to reduce the noise of word-level relations via WordNet [M95].
In this section, we introduce our approach in detail.
We convert a question into query graphs, which are further transformed into SPARQL queries for answering over the KB. A query graph G is a directed graph with labeled nodes and typed directed edges. The vertexes of G are divided into three categories: the question variable (?q), intermediate variables (v), and KB entities. A ?q-node represents the answer to a question. A v-node is either an unknown entity or an unknown value. An edge represents a relation between two vertexes.

In this paper, we improve the candidate graph generation method based on a heuristic algorithm [SK+18] by further considering five kinds of semantic constraints: entity, type, temporal (explicit and inexplicit time), order, and compare.

The question representation module encodes the syntactic structure of a question via RGCN, combined with the sequence of all words occurring in the question, into a final vector as the representation of the question. Given a question $q = (w_1, \ldots, w_n)$, the syntactic dependency tree (a graph structure) of $q$, denoted $G_q$, is of the form $(V_q, E_q)$, where $V_q = \{w_1, \ldots, w_n\}$ (the vertex set) is the set of words and $E_q \subseteq V_q \times V_q$ (the edge set) is the set of dependency relations over $\{w_1, \ldots, w_n\}$. Edges are classified into two classes: self-loop and head-to-dependent.

Let $W_{glove} \in \mathbb{R}^{|V| \times d}$ be the word embedding matrix, where each row vector represents the embedding of a word and $d$ is the dimension of the embedding. For each vertex $w_i \in V_q$, the embedding $\vec{h}_{w_i}$ is assigned to the corresponding vertex $w_i$ in $V_q$ ($i = 1, \ldots, n$).

Firstly, we initialize the embedding of $V_q$ via the embeddings of all its vertexes and compute the average of the word embeddings as $\vec{E}_{avg}$. Secondly, given the initial embeddings $V_q = \{\vec{h}^{(0)}_{w_1}, \ldots, \vec{h}^{(0)}_{w_n}\}$, we apply the RGCN encoder to generate their hidden representations.
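The construction of the syntactic graph $G_q$ can be sketched as follows. This is an illustrative toy (the paper uses SpaCy to parse dependencies; here the parse of a short question is hard-coded so the sketch stays self-contained), showing the two edge classes, self-loop and head-to-dependent.

```python
# Sketch (illustrative): build the syntactic graph G_q of a question from a
# dependency parse. The toy parse below is hand-written, not SpaCy output.
# Edge types: "self" (self-loop) and "dep" (head -> dependent).

def build_question_graph(tokens, heads):
    """tokens: list of words; heads[i] is the index of token i's
    syntactic head, or -1 for the root."""
    edges = []
    for i, _ in enumerate(tokens):
        edges.append((i, i, "self"))        # self-loop edge for every word
    for i, h in enumerate(heads):
        if h >= 0:
            edges.append((h, i, "dep"))     # head-to-dependent edge
    return edges

tokens = ["What", "is", "Taylor", "Swift", "album"]
heads = [1, -1, 3, 4, 1]                    # toy parse: "is" is the root
edges = build_question_graph(tokens, heads)
print(len(edges))  # 5 self-loops + 4 dependency edges = 9
```

In the real pipeline each edge type would select its own weight matrix in the RGCN layer that follows.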
In the RGCN, the forward-pass update of each vertex $w_i$ in $V_q$ is formalized as:
$$\vec{h}^{(l+1)}_{w_i} = \mathrm{ReLU}\Big(\sum_{r \in E_q} \sum_{w \in N^r_{w_i}} \frac{1}{|N^r_{w_i}|} W_r \vec{h}^{(l)}_{w} + W_0 \vec{h}^{(l)}_{w_i}\Big).$$
Here $l$ is a layer and $N^r_{w_i}$ is the set of all $r$-neighbors of $w_i$ in $G_q$. Note that $W_0$ and $W_r$ are weight matrices to be learned.

To access both sequence and syntactic information, we concatenate the word embeddings of $q$ and the output of the RGCN layers to calculate the final result. The attention weight of the $i$-th token, $a_i$, is calculated with a weight matrix $M$ as:
$$a_i = \mathrm{softmax}\big(\vec{E}_{avg} \cdot M \cdot [\vec{h}^{(0)}_{w_i} \oplus \vec{h}^{(l+1)}_{w_i}]\big), \quad \forall i \in n.$$
Finally, we apply a fully-connected layer and the ReLU non-linearity to obtain the final representation $\vec{h}_q$ of question $q$:
$$\vec{h}_q = \mathrm{ReLU}\Big(W \cdot \big(\textstyle\sum_{i=1}^{n} a_i [\vec{h}^{(0)}_{w_i} \oplus \vec{h}^{(l+1)}_{w_i}]\big) + \vec{b}\Big).$$

The structure semantics extractor computes the representation of the structure of a query graph. We first formalize query graphs. Let $V_e$ be a set of entities and $V_r$ be a set of relations. A query graph $G$ over $V_e$ and $V_r$ is a quad $(N_G, R_G, \lambda, \delta)$, where $N_G$ is the set of vertexes, $R_G$ is the set of edges, $\lambda: N_G \to V_e$ assigns each vertex to an entity, and $\delta: R_G \to V_r$ assigns each edge to a relation.

We compute the hidden states $\vec{h}^{(l+1)}_{v_i}$ of each node $v_i$ in the relation-attention-based RGCN layer as follows:
$$\vec{h}^{(l+1)}_{v_i} = \mathrm{ReLU}\Big(\sum_{r \in R_G} \sum_{v \in N^r_{v_i}} \big(a_r W_r \vec{h}^{(l)}_{v} + W_0 \vec{h}^{(l)}_{v_i}\big)\Big).$$
Here $l$ is a layer, $N^r_{v_i}$ is the set of all $r$-neighbors of $v_i$ in $G$, and $W_0$ and $W_r$ are weight matrices to be learned. $a_r$ is the attention weight of edge $r$, giving high weight to relations in the query graph that are more relevant to the question:
$$a_r = \mathrm{softmax}\big(\vec{E}_{avg} \cdot \vec{r}_i^{\,T}\big), \quad \forall r_i \in R_G.$$
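The RGCN forward-pass update above can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation (which uses DGL): the toy graph, the per-relation matrices `W_rel`, and the self-loop matrix `W_self` are all made-up names standing in for $W_r$ and $W_0$.

```python
import numpy as np

# Minimal sketch of one RGCN layer:
# h_i^{l+1} = ReLU( sum_r sum_{j in N_i^r} (1/|N_i^r|) W_r h_j + W_0 h_i ).

def rgcn_layer(H, edges, W_rel, W_self):
    """H: (n, d) node states; edges: dict relation -> list of (src, dst);
    W_rel: dict relation -> (d, d) matrix; W_self: (d, d) matrix."""
    out = H @ W_self.T                        # self-loop term W_0 h_i
    for r, pairs in edges.items():
        nbrs = {}                             # r-neighbors of each dst node
        for src, dst in pairs:
            nbrs.setdefault(dst, []).append(src)
        for dst, srcs in nbrs.items():
            # normalized relational message: mean of W_r h_j over neighbors
            out[dst] += sum(H[s] @ W_rel[r].T for s in srcs) / len(srcs)
    return np.maximum(out, 0.0)               # ReLU

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 8))                   # 4 toy nodes, dimension 8
edges = {"performer": [(0, 1), (2, 1)], "instance_of": [(3, 1)]}
W_rel = {r: rng.normal(size=(8, 8)) for r in edges}
W_self = rng.normal(size=(8, 8))
H1 = rgcn_layer(H, edges, W_rel, W_self)
print(H1.shape)  # (4, 8)
```

The relation-attention variant of the structure extractor would replace the $1/|N^r_{v_i}|$ normalization with the learned weight $a_r$.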
Finally, we obtain $\vec{h}_{structure}$ as the structure semantic representation of $G$ by taking the representation of the answer node.

Figure 2: The Overview of gRGCN. (The question, e.g., "What is Princess Leia's home planet?", is encoded by the question representation encoder via word embeddings, an RGCN layer, concatenation, and a weighted sum; candidate query graphs are encoded by the structure semantics extractor and the relation semantics extractor, and cosine similarity scores the fused representations.)

Figure 3: Local Semantics Extractor
To utilize the relational semantics of both questions and query graphs, we compute the representation of all relations at the word level and the relation level. Firstly, we define a relational embedding matrix $W_{rel} \in \mathbb{R}^{|V_r| \times d}$, where $d$ is the dimension. Given a query graph $G = (N_G, R_G, \lambda, \delta)$ over $V_e$ and $V_r$, we define the relation-level relations (e.g., "position held") in $G$, denoted $R^{whole}$, as a sequence of vectors: let $\delta(R_G) = \{r_1, \ldots, r_n\}$, then $R^{whole}_{1:n} := \{\vec{r}_1, \ldots, \vec{r}_n\}$. We define the word-level relation $R^{word}$ as a sequence of vectors by applying $W_{glove} \in \mathbb{R}^{|V| \times d}$ to map each word $w_{ij}$ in $r_i$ to its word embedding $\vec{w}_{ij}$: $R^{word}_{r_i} := \{\vec{w}_{i1}, \ldots, \vec{w}_{im}\}$, where $\{w_{i1}, \ldots, w_{im}\}$ is the set of all words occurring in $r_i$ (e.g., "position", "held").

There is some unavoidable noise in fine-grained relations. For instance, for the relation type "be subject to", its word-level representation may lose the global relation information, as "subject" may have different meanings, as shown in Figure 4. To reduce this noise, we introduce external linguistic knowledge from WordNet [M95], where a word may have multiple senses, and a sense consists of several lemmas.
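The two relation granularities can be sketched as follows: one relation-level vector looked up in a relation embedding table, plus one word-level vector per token of the relation label. The tiny vocabularies and random embeddings below are illustrative, not the paper's trained tables.

```python
import numpy as np

# Sketch (illustrative): relation-level vs word-level relation vectors.
d = 4
rel_vocab = {"position held": 0, "publication date": 1}
word_vocab = {"position": 0, "held": 1, "publication": 2, "date": 3}
W_rel = np.random.default_rng(1).normal(size=(len(rel_vocab), d))    # W_rel
W_glove = np.random.default_rng(2).normal(size=(len(word_vocab), d)) # W_glove

def relation_repr(label):
    """Return (relation-level vector, list of word-level vectors)."""
    r_whole = W_rel[rel_vocab[label]]                       # R^whole entry
    r_words = [W_glove[word_vocab[w]] for w in label.split()]  # R^word_{r_i}
    return r_whole, r_words

r_whole, r_words = relation_repr("position held")
print(r_whole.shape, len(r_words))  # (4,) 2
```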
Figure 4: The structure of "subject" in WordNet. (The word "subject" has senses such as discipline, subjugate, submit, topic, issue, and matter, organized in word–sense–lemma layers.)
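The word–sense–lemma disambiguation can be sketched as two levels of attention pooling guided by the question's average embedding: lemma vectors are pooled into a sense vector, and sense vectors into a word vector. All tensors, sizes, and the `attend` helper below are toy assumptions, not the paper's trained model.

```python
import numpy as np

# Toy sketch of the hierarchical relational attention and the final fusion.

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, vectors):
    """Weighted sum of `vectors` with weights softmax(tanh(query . v))."""
    V = np.stack(vectors)
    a = softmax(np.tanh(V @ query))
    return a @ V

rng = np.random.default_rng(3)
d = 4
avg_q = rng.normal(size=d)                        # question average embedding
lemmas = [rng.normal(size=d) for _ in range(3)]   # lemma vectors of one sense
sense = attend(avg_q, lemmas)                     # lemma layer -> sense vector
word = attend(avg_q, [sense, rng.normal(size=d)]) # sense layer -> word vector

r_whole = rng.normal(size=d)                      # relation-level vector
r_fine = word + r_whole                           # fine-grained relation
h_relational = np.max(np.stack([r_fine]), axis=0) # max pooling over relations
W, b = rng.normal(size=(d, d)), rng.normal(size=d)
h_structure = rng.normal(size=d)                  # from the structure extractor
h_whole = np.maximum(W.T @ (h_relational + h_structure) + b, 0.0)  # fusion
print(h_whole.shape)  # (4,)
```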
For each word $w_{ij}$ in a relation $r_i$, we use $S_{w_{ij}} = \{\vec{w}_{ij1}, \ldots, \vec{w}_{ijk}\}$ to denote all sense vectors of $w_{ij}$ and $L_{w_{ijk}} = \{\vec{w}_{ijk1}, \ldots, \vec{w}_{ijkz}\}$ to denote all lemma vectors in sense $w_{ijk}$. The vector of sense $\vec{w}_{ijk}$ is defined as:
$$\vec{w}_{ijk} = \sum_{y=1}^{z} a_{ijky} \vec{w}_{ijky}, \quad a_{ijky} = \mathrm{softmax}\big(\tanh\big(\vec{avg}_q \cdot \vec{w}^{\,T}_{ijky}\big)\big), \ \forall y \in (1, \ldots, z).$$
The vector of word $w_{ij}$ is defined as:
$$\vec{w}_{ij} = \sum_{c=1}^{k} a_{ijc} \vec{w}_{ijc}, \quad a_{ijc} = \mathrm{softmax}\big(W \cdot \tanh\big(\vec{avg}_q \cdot \vec{w}^{\,T}_{ijc}\big) + \vec{b}\big), \ \forall c \in (1, \ldots, k).$$
Then we obtain the fine-grained relation embedding of $G$, denoted $R^{fine}$. Based on the fine-grained word embeddings of a relation $r_i$, we define:
$$R^{fine}_G[i] = \frac{1}{m} \sum_{j} R^{word}_{r_i}[j] + R^{whole}[i], \quad \forall i \in (1, \ldots, n).$$
We employ max pooling over $R^{fine}$ to get the final relational semantics $\vec{h}_{relational}$. Finally, we apply a linear combination to fuse the structure and relational semantics via ReLU as the global semantics:
$$\vec{h}_{whole} = \mathrm{ReLU}\big(W^T (\vec{h}_{relational} + \vec{h}_{structure}) + \vec{b}\big).$$

Experiments and Evaluation
In this section, we evaluate our approach with the following experiments: an overall comparison with baselines and ablation studies of our three modules.
Knowledge Bases
We select two representative KBs:

• Wikidata: A collaborative KB developed by [VK14], containing more than 40 million entities and 350 million relation instances. We use the full Wikidata dump and host it with a Virtuoso engine (http://virtuoso.openlinksw.com/).

• FB2M: The KB collected by [BU+15], a subset of Freebase, consisting of 2 million entities and 10 million triple facts. FB2M is a well-known KB used to create many QA datasets, particularly simple-question datasets.
QA Datasets
We select four popular datasets as follows:

• ComplexQuestion (CompQ): A dataset of 2,100 complex questions developed by [BD+16] to provide questions with structural and expression diversity. To support answering over Wikidata [VK14], we slightly revise it by mapping answers to Wikidata IDs instead of Freebase IDs, since Freebase was discontinued and is no longer up-to-date, including the unavailability of its APIs and new dumps [SG18].

• WebQSP-WD (WebQSPS): A dataset collected by [SG18] containing 712 real complex questions (3,913 questions in total). WebQSPS is a corrected version of the common benchmark WebQSP [BC+13] that supports Wikidata.

• QALD-7: A small dataset of 42 complex questions and 58 simple questions collected by [UN+17], mainly used to test KBQA models. QALD-7 is specially designed for the KBQA task over Wikidata.

• SimpleQuestion (SimpQ): A dataset developed by [BU+15] and a popular KBQA benchmark, containing over 100K questions w.r.t. FB2M. To evaluate the generalization ability of our model, we additionally use SimpQ in a complementary experiment.
Implementation Details
In training, we adopt the hinge loss applied in [LL+18; SG18]. Formally, let $q$ be a question and $C$ be the query graph set of $q$, where $C$ contains the positive graphs $g^+$ and negative graphs $g^-$ of $q$:
$$L = \sum_{g^- \in C} \max\big(0,\ \lambda - \cos(\vec{q}, \vec{g}^+) + \cos(\vec{q}, \vec{g}^-)\big).$$
In our experiments, we use S-MART [YC15] as our entity linking tool; GloVe [PSM14] word vectors of dimension 100 are employed to initialize word embeddings, and the Adam optimizer [KB15] is applied to train the model with a batch size of 64 at each epoch. Moreover, we design a multi-layer RGCN with a fully-connected layer and dropout of 0.2 to compute the representations of questions and of the structure of query graphs (three layers for questions and two layers for graphs). Finally, SpaCy (https://spacy.io/) is used to parse syntactic dependencies, and the Deep Graph Library (DGL) is used to transfer query graphs into DGL graph objects.

Baselines

We introduce four popular models and one variant of gRGCN as baselines:

• STAGG (2015) [YC+15]: Scores query graphs by manual features and the representation of the core relation; re-implemented by [SG18].

• Yu et al. (2017) [YY+17]: Encodes questions with a residual BiLSTM and computes similarity with the pooling of fine-grained relations. We re-implement it, adding a graph generation process to support complex questions.

• Luo et al.
(2018) [LL+18]: Represents the corresponding multi-relational semantics of complex questions.

• GGNN (2018) [SG18]: Encodes questions with a Deep Convolutional Neural Network (DCNN) and query graphs with a Gated Graph Neural Network (GGNN).

• gGCN (ours): Obtained from gRGCN by replacing the RGCN with a Graph Convolutional Network (GCN) [KW17], which ignores relation types.

The four baseline models are representative of various mechanisms: STAGG (2015) is based on manual features characterizing the structure of query graphs; Yu et al. (2017) is based on max pooling of fine-grained relational semantics; Luo et al. (2018) is based on summing fine-grained relational semantics; GGNN (2018) enriches the semantics of each entity with the average of relation semantics. Besides STAGG (2015), there are many works mainly on constructing query graphs over Freebase, such as [BD+16; HZZ18]. Since their code is not accessible, we discuss them in the related works.
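The hinge ranking loss used in training can be sketched as follows: for each negative graph $g^-$, the model is penalized when the positive graph $g^+$ does not beat it by the margin $\lambda$ in cosine similarity to the question. The vectors and margin value below are illustrative.

```python
import numpy as np

# Toy sketch of the hinge ranking loss L = sum_{g-} max(0, margin - cos(q,g+) + cos(q,g-)).

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def hinge_loss(q, g_pos, g_negs, margin=0.5):
    return sum(max(0.0, margin - cos(q, g_pos) + cos(q, g_neg))
               for g_neg in g_negs)

q = np.array([1.0, 0.0, 0.0])            # question embedding
g_pos = np.array([0.9, 0.1, 0.0])        # correct graph, close to q
g_negs = [np.array([0.0, 1.0, 0.0])]     # wrong graph, far from q
print(hinge_loss(q, g_pos, g_negs))  # 0.0
```

Here the loss is zero because the positive graph already beats the negative one by more than the margin; swapping the roles of the two graphs yields a positive loss.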
For evaluating our approach on complex questions, we select Wikidata as the KB; CompQ, WebQSPS, and QALD-7 as datasets; and precision, recall, and F1-score as metrics. The experimental results are shown in Table 1, where we take the average results over all questions in a dataset as the final result. Note that our results differ from the original results reported for the baselines over Freebase.

Table 1: Results on complex question datasets (Precision / Recall / F1)

Model             | WebQSPS                | QALD-7                 | CompQ
STAGG (2015)      | 0.1911  0.2267  0.1828 | 0.1934  0.2463  0.1861 | 0.1155  0.1481  0.1108
Yu et al. (2017)  | 0.2144  0.2548  0.2006 | 0.1972  0.2103  0.1923 | 0.1297  0.1675  0.1291
Luo et al. (2018) | 0.2374  0.2587  0.2252 | 0.2117  0.2438  0.2016 | 0.1331  0.2118  0.1317
GGNN (2018)       | 0.2686  0.3179  0.2588 | 0.2176  0.2751  0.2131 | 0.1297  0.1481  0.1285
gGCN (ours)       | 0.2713  0.3291  0.2631 | 0.2334  0.3109  0.2437 | 0.1267  0.2244  0.1441
gRGCN (ours)      |                        |                        |

Table 1 shows that gRGCN outperforms all baselines on all datasets and all metrics.

• gRGCN achieves a higher F1-score than STAGG and GGNN on WebQSPS, QALD-7, and CompQ. We conclude that modeling global semantics performs better than all baselines that do not consider it.

• The five neural models, Yu et al. (2017), Luo et al. (2018), GGNN (2018), gGCN (ours), and gRGCN (ours), achieve higher F1-scores than STAGG (2015) on the three datasets. In short, end-to-end neural network frameworks perform better than models with manual features.

• gRGCN achieves a higher F1-score than gGCN on the three datasets. Hence, relation types, as part of the relational structure, can also improve the performance of KBQA.

Therefore, the experiments show that our approach can extract the global semantics of complex questions, which helps improve the performance of KBQA.

Results on Simple Questions

Simple questions still contain a little structural information, such as the linear order of the subject entity and the object entity. We complementarily evaluate our approach on simple questions to analyze its robustness. This experiment selects FB2M as the KB, SimpQ as the dataset, and six popular baselines. The experimental results are shown in Table 2.
Table 2: Results on SimpleQuestion

Model                    | SimpQ Accuracy
Yin et al. (2016)        | 0.683
Bao et al. (2016)        | 0.728
Lukovnikov et al. (2017) | 0.712
Luo et al. (2018)        | 0.721
Mohammed et al. (2018)   | 0.732
Huang et al. (2019)      |
gRGCN (ours)             |

By Table 2, Huang et al. (2019) performs slightly better than our gRGCN. Indeed, Huang et al. (2019) recover the question's head entity, predicate, and tail entity representations in the KG embedding space. In this sense, Huang et al. (2019) consider structure-like semantics to improve performance on simple questions; in other words, they also verify the effectiveness of our idea.

Ablation Study

In this subsection, we analyze the effectiveness of the proposed question representation and the proposed query graph representation in our model. For the ablation study, we use the F1-score as our metric and perform experiments on CompQ, WebQSPS, and QALD-7 w.r.t. complex questions.
Question Representation
To analyze the effectiveness of our proposed question representation encoder, this experiment compares gRGCN with variants of gRGCN in which our question encoder is substituted with the question encoder of each baseline. The results are shown in Table 3, where the column "QR of Baselines" denotes gRGCN equipped with the corresponding baseline question encoder, and gRGCN− denotes a variant of gRGCN that does not concatenate question sequences. We use gRGCN− to further analyze the effectiveness of our proposed question encoder, fairly considering only the question information also considered in the baselines.

Table 3: Ablation Results on Question Representation
QR of Baselines    | WebQSP | QALD   | CompQ
Yu et al. (2017)   | 0.2784 | 0.2464 | 0.1400
DCNN (STAGG, GGNN) | 0.2810 | 0.2511 | 0.1432
Luo et al. (2018)  | 0.2844 | 0.2683 | 0.1483
gRGCN− (ours)      |        |        |
gRGCN (ours)       |        |        |

Table 3 shows that our proposed question representation outperforms the question representations of all baselines over all datasets.

• Firstly, our question encoder performs best compared to the other models; specifically, gRGCN improves over Luo et al. (2018) on all three datasets.

• Secondly, our question encoder also performs best without concatenating question sequences; specifically, gRGCN− improves over Luo et al. (2018) on all three datasets.

• Thirdly, the question sequence helps improve the question representation; specifically, gRGCN improves over gRGCN− on all three datasets.

Graph Representation
To analyze the effectiveness of our proposed graph representation encoder, we consider two types of relations, fine-grained relations (f-g) and WordNet relations (wordnet), and two types of structure modeling, attention and non-attention (non-att). Note that gRGCN considers all relations and structures. The experimental results on the three datasets are shown in Table 4, which covers 18 cases, including 6 cases without structure.
Table 4: Ablation Results on Graph Representation
Relation    | Structure | WebQSP | QALD   | CompQ
f-g         | -         | 0.2101 | 0.2081 | 0.1320
f-g         | non-att   | 0.2614 | 0.2472 | 0.1456
f-g         | attention | 0.2810 | 0.2690 | 0.1471
f-g/wordnet | -         | 0.2304 | 0.2501 | 0.1368
f-g/wordnet | non-att   | 0.2750 | 0.2711 | 0.1483
f-g/wordnet | attention | 0.2910 | 0.3024 | 0.1519

Table 4 shows that our proposed graph representation outperforms the graph representations of all baselines over all datasets.

• Firstly, modeling the structure of query graphs improves graph representation over not considering structure: non-attention achieves an improvement on all three datasets, and it still improves even when the wordnet relation is also considered.

• Secondly, the wordnet relation of query graphs improves graph representation over not considering it: the wordnet relation yields an improvement on all three datasets, and it still improves even when the attention structure is also considered.

• Thirdly, the attention structure of query graphs improves graph representation over not considering it: the attention structure yields an improvement on all three datasets, and it still improves even when the wordnet relation is also considered.

Therefore, we conclude that the structure captured by our graph representation, as a form of global semantics, is useful for improving performance, as is the wordnet relation.

Error Analysis

We randomly sample 100 questions that returned incorrect answers and analyze the major causes of errors.
Semantic ambiguity and complexity (42%): Due to the limitations of semantic graph generation, we cannot generate the graph space for some questions because of their complexity and semantic ambiguity.

Correct graphs scored low (16%): This type of error occurs when the model fails to extract the global semantics of questions and graphs; although the correct graph exists in the candidate set, we fail to select it.

Entity linking error (9%): This error occurs when the appropriate entities cannot be extracted from the given question, and we subsequently generate incorrect semantic graphs. Such questions could be answered correctly by replacing the wrong entities.

Dataset and KB error (33%): This error occurs because of the incompatibility between Wikidata and Freebase IDs in the datasets, or defects of the datasets; the test datasets contain many open questions whose answers may be partially or completely missing from Wikidata.
Related Works

The most popular approaches proposed for the KBQA task can be roughly classified into two categories: semantic parsing and information retrieval.

Semantic Parsing
Semantic parsing (SP) based approaches use a formal representation to encode the semantic information of questions for obtaining answers over a KB [BC+13; RT+16; ZC+19]. Traditional SP-based methods [KZ+15; KM12; CY13] mainly rely on schema matching or hand-crafted rules and features.

Recently, neural network (NN) based approaches have shown great performance on KBQA. Different from [CX+17] and [AY+17], which extract question intentions via a large number of templates automatically learned from the KB and QA corpora, SP+NN approaches employ neural networks to encode the semantics of questions and query graphs and then select the correct query graph for a given question from the candidate set for later querying [BD+16; LL+18; HZZ18]. [LL+18] encodes multiple relations into a uniform query graph to capture fine-grained relations. [SG18] uses a GGNN to enrich the semantics of all entities, tackling limited complex questions. Differently, our approach integrates both the structure and the enhanced relational semantics of query graphs.

[HZZ18] and [DH+19] pay more attention to query graph generation and use an SVM ranking classifier and a combinational function, respectively, to rank candidate graphs, which is not an encode-and-compare framework like our model. [CSH18] uses an end-to-end query graph generation process via an RNN model and aims to search for the best action sequence; in contrast, we focus on learning the global semantics of the query graph.
Information Retrieval
Information retrieval (IR) based approaches retrieve candidate entities (as answers) from the KB using semantic relation extraction [XR+16; DW+15; WH+19; YR+16; SD+18; LWJ19]. These approaches map answers and questions into the same embedding space, where answers can be retrieved independent of any grammar, lexicon, or other semantic structure. Our approach belongs to the SP-based methods and encodes the semantic information of a question into a query graph for querying over the KB.
Conclusion

In this paper, we propose an RGCN-based model for semantic parsing in KBQA that extracts global semantics to maximize the performance of question learning. Our approach pays particular attention to the structure semantics of query graphs extracted by RGCN and to enhanced relational semantics. In this sense, our approach provides a new way to better capture the intention of questions. We also generalize our model to support complex questions, such as comparative questions. In future work, we are interested in extending our model to more complex practical questions.

References

[AY+17] Abujabal A., Yahya M., Riedewald M., Weikum G. (2017). Automated template generation for question answering over knowledge graphs. WWW: 1191–1200.
[BU+15] Bordes A., Usunier N., Chopra S., Weston J. (2015). Large-scale simple question answering with memory networks. arXiv:1506.02075.
[BD+16] Bao J., Duan N., Yan Z., Zhou M., Zhao T. (2016). Constraint-based question answering with knowledge graph. COLING: 2503–2514.
[BC+13] Berant J., Chou A., Frostig R., Liang P. (2013). Semantic parsing on Freebase from question-answer pairs. EMNLP: 1533–1544.
[CY13] Cai Q., Yates A. (2013). Large-scale semantic parsing via schema matching and lexicon extension. ACL: 423–433.
[CSH18] Chen B., Sun L., Han X. (2018). Sequence-to-Action: end-to-end semantic graph generation for semantic parsing. ACL: 766–777.
[CX+17] Cui W., Xiao Y., Wang H., Song Y., Hwang S., Wang W. (2017). KBQA: Learning question answering over QA corpora and knowledge bases. PVLDB, 10(5): 565–576.
[DW+15] Dong L., Wei F., Zhou M., Xu K. (2015). Question answering over Freebase with multi-column convolutional neural networks. ACL: 260–269.
[HZ+19] Huang X., Zhang J., Li D., Li P. (2019). Knowledge graph embedding based question answering. WSDM: 105–113.
[HZZ18] Hu S., Zou L., Zhang X. (2018). A state-transition framework to answer complex questions over knowledge base. EMNLP: 2098–2108.
[DH+19] Ding J., Hu W., Xu Q., Qu Y. (2019). Leveraging frequent query substructures to generate formal queries for complex question answering. EMNLP: 2614–2622.
[KW17] Kipf T. N., Welling M. (2017). Semi-supervised classification with graph convolutional networks. ICLR.
[KB15] Kingma D. P., Ba J. (2015). Adam: A method for stochastic optimization. ICLR.
[KZ+15] Kwiatkowski T., Zettlemoyer L. S., Goldwater S., Steedman M. (2015). Lexical generalization in CCG grammar induction for semantic parsing. EMNLP: 1512–1523.
[KM12] Krishnamurthy J., Mitchell T. M. (2012). Weakly supervised training of semantic parsers. EMNLP: 754–765.
[LWJ19] Lan Y., Wang S., Jiang J. (2019). Knowledge base question answering with topic units. IJCAI: 5046–5052.
[LT+18] Li Y., Tarlow D., Brockschmidt M., Zemel R. S. (2018). Gated graph sequence neural networks. ACL: 273–283.
[LL+18] Luo K., Lin F., Luo X., Zhu K. Q. (2018). Knowledge base question answering via encoding of complex query graphs. EMNLP: 2185–2194.
[M95] Miller G. A. (1995). WordNet: a lexical database for English. Commun. ACM, 38(11): 39–41.
[PSM14] Pennington J., Socher R., Manning C. D. (2014). GloVe: Global vectors for word representation. EMNLP: 1532–1543.
[RLS14] Reddy S., Lapata M., Steedman M. (2014). Large-scale semantic parsing without question-answer pairs. TACL, 2: 377–392.
[RT+16] Reddy S., Täckström O., Collins M., Kwiatkowski T., Das D., Steedman M., Lapata M. (2016). Transforming dependency structures to logical forms for semantic parsing. TACL, 4: 127–140.
[SG18] Sorokin D., Gurevych I. (2018). Modeling semantics with gated graph neural networks for knowledge base question answering. COLING: 3306–3317.
[SD+18] Sun H., Dhingra B., Zaheer M., Mazaitis K., Salakhutdinov R., Cohen W. W. (2018). Open-domain question answering using early fusion of knowledge bases and text. EMNLP: 4231–4242.
[SK+18] Schlichtkrull M. S., Kipf T. N., Bloem P., Berg R., Titov I., Welling M. (2018). Modeling relational data with graph convolutional networks. ESWC: 593–607.
[UN+17] Usbeck R., Ngomo A. N., Haarmann B., Krithara A., Röder M., Napolitano G. (2017). 7th Open Challenge on Question Answering over Linked Data (QALD-7). Semantic Web Challenges: 59–69.
[VK14] Vrandecic D., Krötzsch M. (2014). Wikidata: A free collaborative knowledge base. CACM, 57(10): 78–85.
[WH+19] Wu P., Huang S., Weng R., Zheng Z., Zhang J., Yan X., Chen J. (2019). Learning representation mapping for relation detection in knowledge base question answering. ACL: 6130–6139.
[XR+16] Xu K., Reddy S., Feng Y., Huang S., Zhao D. (2016). Question answering on Freebase via relation extraction and textual evidence. ACL: 2326–2336.
[YC15] Yang Y., Chang M. (2015). S-MART: Novel tree-based structured learning algorithms applied to tweet entity linking. ACL: 504–513.
[YC+15] Yih W., Chang M., He X., Gao J. (2015). Semantic parsing via staged query graph generation: question answering with knowledge base. ACL: 1321–1331.
[YR+16] Yih W., Richardson M., Meek C., Chang M., Suh J. (2016). The value of semantic parse labeling for knowledge base question answering. ACL: 201–206.
[YY+17] Yu M., Yin W., Hasan K. S., Santos C. N., Xiang B., Zhou B. (2017). Improved neural relation detection for knowledge base question answering. ACL: 571–581.
[ZC+19] Zhang H., Cai J., Xu J., Wang J. (2019). Complex question decomposition for semantic parsing.