Jori Mur | Researchain

Publication

Featured research published by Jori Mur.

International Conference on Computational Linguistics | 2004

Information extraction for question answering: improving recall through syntactic patterns

Valentin Jijkoun; Maarten de Rijke; Jori Mur

We investigate the impact of the precision/recall trade-off of information extraction on the performance of an offline corpus-based question answering (QA) system. One of our findings is that, because of the robust final answer selection mechanism of the QA system, recall is more important. We show that the recall of the extraction component can be improved using syntactic parsing instead of more common surface text patterns, substantially increasing the number of factoid questions answered by the QA system.

Cross Language Evaluation Forum | 2005

Question answering for Dutch using dependency relations

Gosse Bouma; Jori Mur; Gertjan van Noord; Lonneke van der Plas; Jörg Tiedemann

Joost is a question answering system for Dutch which makes extensive use of dependency relations. It answers questions either by table look-up, or by searching for answers in paragraphs returned by IR. Syntactic similarity is used to identify and rank potential answers. Tables were constructed by mining the CLEF corpus, which has been syntactically analyzed in full.

International Conference on Computational Linguistics | 2008

Simple is Best: Experiments with Different Document Segmentation Strategies for Passage Retrieval

Jörg Tiedemann; Jori Mur

Passage retrieval is used in QA to filter large document collections in order to find text units relevant for answering given questions. In our QA system we apply standard IR techniques and index-time passaging in the retrieval component. In this paper we investigate several ways of dividing documents into passages. In particular we look at semantically motivated approaches (using coreference chains and discourse clues) compared with simple window-based techniques. We evaluate retrieval performance and the overall QA performance in order to study the impact of the different segmentation approaches. From our experiments we can conclude that the simple techniques using fixed-sized windows clearly outperform the semantically motivated approaches, which indicates that uniformity in size seems to be more important than semantic coherence in our setup.
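
The window-based segmentation mentioned in this abstract can be illustrated with a minimal sketch. It assumes the document is already split into sentences; the function name `window_passages` and the `size` and `step` parameters are hypothetical and are not taken from the paper, which does not give implementation details here.

```python
from typing import List


def window_passages(sentences: List[str], size: int = 3, step: int = 3) -> List[List[str]]:
    """Group sentences into fixed-size windows.

    Assumption: `size` is the number of sentences per passage and `step`
    is how far the window moves each time. With step == size the passages
    do not overlap; with step < size they form a sliding window.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    passages: List[List[str]] = []
    for start in range(0, len(sentences), step):
        passages.append(sentences[start:start + size])
        # Stop once a window has reached the end of the document,
        # so that no trailing fragments of a sliding window are emitted.
        if start + size >= len(sentences):
            break
    return passages


if __name__ == "__main__":
    doc = ["s1", "s2", "s3", "s4", "s5"]
    print(window_passages(doc, size=2, step=2))  # non-overlapping windows
    print(window_passages(doc, size=3, step=1))  # sliding windows
```

Each resulting passage would then be indexed as its own retrieval unit. The semantically motivated alternatives compared in the paper (coreference chains, discourse clues) are not sketched here.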

Cross Language Evaluation Forum | 2008

Question Answering with Joost at CLEF 2007

Gosse Bouma; G. Kloosterman; Jori Mur; Gertjan van Noord; Lonneke van der Plas; Jörg Tiedemann

We describe our system for the monolingual Dutch and multilingual English to Dutch QA tasks. We describe the preprocessing of Wikipedia, inclusion of query expansion in IR, anaphora resolution in follow-up questions, and a question classification module for the multilingual task. Our best runs achieved 25.5% accuracy for the Dutch monolingual task, and 13.5% accuracy for the multilingual task.

Archive | 2010

Automatic acquisition of lexico-semantic knowledge for question answering

Lonneke van der Plas; Gosse Bouma; Jori Mur; Chu-Ren Huang; Nicoletta Calzolari; Aldo Gangemi; Alessandro Lenci; Alessandro Oltramari; Laurent Prévot

Lexico-semantic knowledge is becoming increasingly important within the area of natural language processing, especially for applications such as Word Sense Disambiguation, Information Extraction and Question Answering (QA). Although the coverage of handmade resources such as WordNet (Fellbaum, 1998) is impressive in general, coverage problems still exist for applications involving specific domains or languages other than English. We are interested in using lexico-semantic knowledge in an open-domain question answering system for Dutch. Obtaining such knowledge from existing resources is possible, but only to a certain extent. The most important resource for our research is the Dutch portion of EuroWordNet (Vossen, 1998); however, its size is only half that of the English WordNet. Therefore, many of the lexical items used in the QA task of the Cross Language Evaluation Forum (CLEF) for Dutch cannot be found in EuroWordNet. In addition, information regarding the classes to which named entities belong, e.g. Narvik IS-A harbour, has been shown to be useful for QA, but such information is typically absent from hand-built resources. For these reasons, we are interested in investigating methods which acquire lexico-semantic knowledge automatically from text corpora.

Theory and Applications of Natural Language Processing | 2011

Relation Extraction for Open and Closed Domain Question Answering

Gosse Bouma; I. Fahmi; Jori Mur

One of the most accurate methods in Question Answering (QA) uses off-line information extraction to find answers for frequently asked questions. It requires automatic extraction from text of all relation instances for relations that users frequently ask about. In this chapter, two methods are presented for learning relation instances for relations relevant in an open-domain and a closed-domain (medical) QA system. Both methods try to automatically learn dependency paths that typically connect two arguments of a given relation. The first (lightly supervised) method starts from a seed list of argument instances, and extracts dependency paths from all sentences in which a seed pair occurs. This method works well for large text collections and for seeds which are easily identified, such as named entities, and is well-suited for open-domain QA. A second experiment concentrates on medical relation extraction for the question answering module of the IMIX system. The IMIX corpus is relatively small and relation instances may contain complex noun phrases that do not occur frequently in the exact same form in the corpus. In this case, learning from annotated data is necessary. Dependency patterns enriched with semantic concept labels are shown to give accurate results for relations that are relevant for a medical QA system. Both methods improve the performance of the Dutch QA system Joost.
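
The first, lightly supervised method described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: each sentence is assumed to be pre-parsed into hypothetical `(head, relation, dependent)` triples, words are assumed to be unique within a sentence, and the names `dependency_path` and `learn_patterns` are invented for this example.

```python
from collections import Counter, deque
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str, str]  # (head, relation, dependent); hypothetical format


def dependency_path(edges: List[Edge], source: str, target: str) -> Optional[Tuple[str, ...]]:
    """Return the path of relation labels and intermediate words linking two words.

    Breadth-first search over the dependency tree, treated as undirected.
    "<rel" marks a step up to the head, "rel>" a step down to a dependent.
    """
    graph: Dict[str, List[Tuple[str, str]]] = {}
    for head, rel, dep in edges:
        graph.setdefault(head, []).append((dep, f"{rel}>"))
        graph.setdefault(dep, []).append((head, f"<{rel}"))
    queue = deque([(source, ())])
    seen = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path[:-1]  # drop the target word itself
        for neighbour, label in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + (label, neighbour)))
    return None


def learn_patterns(parsed_sentences: List[List[Edge]],
                   seeds: List[Tuple[str, str]]) -> Counter:
    """Count the dependency paths that connect seed pairs across all sentences."""
    counts: Counter = Counter()
    for edges in parsed_sentences:
        words = {w for head, _, dep in edges for w in (head, dep)}
        for first, second in seeds:
            if first in words and second in words:
                path = dependency_path(edges, first, second)
                if path:
                    counts[path] += 1
    return counts


if __name__ == "__main__":
    # Toy parses for "Mozart was born in Salzburg" and "Bach was born in Eisenach".
    corpus = [
        [("born", "subj", "Mozart"), ("born", "mod", "in"), ("in", "obj", "Salzburg")],
        [("born", "subj", "Bach"), ("born", "mod", "in"), ("in", "obj", "Eisenach")],
    ]
    seeds = [("Mozart", "Salzburg"), ("Bach", "Eisenach")]
    print(learn_patterns(corpus, seeds).most_common())
```

Frequent paths could then serve as extraction patterns for new argument pairs. The second method in the abstract, which learns from annotated data and adds semantic concept labels, is not covered by this sketch.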

Cross Language Evaluation Forum | 2006

Using syntactic knowledge for QA

Gosse Bouma; I. Fahmi; Jori Mur; Gertjan van Noord; Lonneke van der Plas; Jörg Tiedemann

We describe the system of the University of Groningen for the monolingual Dutch and multilingual English to Dutch QA tasks. First, we give a brief outline of the architecture of our QA system, which makes heavy use of syntactic information. Next, we describe the modules that were improved or developed especially for the CLEF tasks, including the incorporation of syntactic knowledge in IR, the incorporation of lexical equivalences and coreference resolution, and a baseline multilingual (English to Dutch) QA system, which uses a combination of Systran and Wikipedia (for term recognition and translation) for question translation. For non-list questions, 31% (20%) of the highest ranked answers returned by the monolingual (multilingual) system were correct.

Traitement Automatique des Langues (TAL) | 2005