Publication


Featured research published by Sergio Ferrández.


Journal of Web Semantics | 2011

The QALL-ME Framework: A specifiable-domain multilingual Question Answering architecture

Óscar Ferrández; Christian Spurk; Milen Kouylekov; Iustin Dornescu; Sergio Ferrández; Matteo Negri; Rubén Izquierdo; David Tomás; Constantin Orasan; Guenter Neumann; Bernardo Magnini; José L. Vicedo

This paper presents the QALL-ME Framework, a reusable architecture for building multi- and cross-lingual Question Answering (QA) systems working on structured data modelled by an ontology. It is released as free open source software with a set of demo components and extensive documentation, which makes it easy to use and adapt. The main characteristics of the QALL-ME Framework are: (i) its domain portability, achieved by an ontology modelling the target domain; (ii) the context awareness regarding space and time of the question; (iii) the use of textual entailment engines as the core of the question interpretation; and (iv) an architecture based on Service Oriented Architecture (SOA), which is realized using interchangeable web services for the framework components. Furthermore, we present a running example to clarify how the framework processes questions as well as a case study that shows a QA application built as an instantiation of the QALL-ME Framework for cinema/movie events in the tourism domain.


Information Sciences | 2009

Exploiting Wikipedia and EuroWordNet to solve Cross-Lingual Question Answering

Sergio Ferrández; Antonio Toral; Óscar Ferrández; Antonio Ferrández; Rafael Muñoz

This paper describes a new advance in solving Cross-Lingual Question Answering (CL-QA) tasks. It is built on three main pillars: (i) the use of several multilingual knowledge resources to reference words between languages (the Inter Lingual Index (ILI) module of EuroWordNet and the multilingual knowledge encoded in Wikipedia); (ii) the consideration of more than one translation per word in order to search for candidate answers; and (iii) the analysis of the question in the original language without any translation process. This novel approach overcomes the errors caused by the common use of Machine Translation (MT) services by CL-QA systems. We also present studies and experiments that justify the importance of analyzing whether a Named Entity should be translated or not. Experimental results in bilingual scenarios show that our approach performs better than an MT-based CL-QA approach, achieving an average improvement of 36.7%.


Cross-Language Evaluation Forum | 2005

AliQAn, Spanish QA system at CLEF-2005

Sandra Roger; Sergio Ferrández; Antonio Ferrández; Jesús Peral; Fernando Llopis; Antonia Aguilar; David Tomás

Question Answering is a major research topic at the University of Alicante. For this reason, this year two groups participated in the QA@CLEF track using different approaches. In this paper we describe the work of the Alicante 2 group. This paper describes AliQAn, a monolingual open-domain Question Answering (QA) system developed in the Department of Language Processing and Information Systems at the University of Alicante for the CLEF-2005 Spanish monolingual QA evaluation task. Our approach is based fundamentally on the use of syntactic pattern recognition in order to identify possible answers. In addition, Word Sense Disambiguation (WSD) is applied to improve the system. The results achieved (overall accuracy of 33%) are shown and discussed in the paper.


Applications of Natural Language to Data Bases | 2007

Applying Wikipedia's multilingual knowledge to cross-lingual question answering

Sergio Ferrández; Antonio Toral; Óscar Ferrández; Antonio Ferrández; Rafael Muñoz

The application of the multilingual knowledge encoded in Wikipedia to an open-domain Cross-Lingual Question Answering system based on the Inter Lingual Index (ILI) module of EuroWordNet is proposed and evaluated. This strategy overcomes the problems due to ILI's low coverage of proper nouns (Named Entities). Moreover, as these are open class words (highly changing), using a community-based up-to-date resource avoids the tedious maintenance of hand-coded bilingual dictionaries. A study reveals the importance of translating Named Entities in CL-QA and the advantages of relying on Wikipedia over ILI for doing this. Tests on questions from the Cross-Language Evaluation Forum (CLEF) justify our approach (20% of these are correctly answered thanks to Wikipedia's Multilingual Knowledge).


Information Sciences | 2008

Combining automatic acquisition of knowledge with machine learning approaches for multilingual temporal recognition and normalization

Estela Saquete; Óscar Ferrández; Sergio Ferrández; Patricio Martínez-Barco; Rafael Muñoz

This paper presents an improvement in the temporal expression (TE) recognition phase of a knowledge based system at a multilingual level. For this purpose, the combination of different approaches applied to the recognition of temporal expressions is studied. In this work, for the recognition task, a knowledge based system that recognizes temporal expressions and has been automatically extended to other languages (TERSEO system) was combined with a system that recognizes temporal expressions using machine learning techniques. In particular, two different techniques were applied: maximum entropy model (ME) and hidden Markov model (HMM), using two different types of tagging of the training corpus: (1) BIO model tagging of literal temporal expressions and (2) BIO model tagging of simple patterns of temporal expressions. Each system was first evaluated independently and then combined in order to: (a) analyze whether the combination gives better results without increasing the number of erroneous expressions by the same percentage and (b) decide which machine learning approach performs this task better. When the TERSEO system is combined with the maximum entropy approach the best results for F-measure (89%) are obtained, improving TERSEO recognition by 4.5 points and ME recognition by 7.


Cross-Language Evaluation Forum | 2006

Monolingual and cross-lingual QA using AliQAn and BRILI systems for CLEF 2006

Sergio Ferrández; Pilar López-Moreno; Sandra Roger; Antonio Ferrández; Jesús Peral; X. Alvarado; Elisa Noguera; Fernando Llopis

A previous version of AliQAn participated in the CLEF 2005 Spanish Monolingual Question Answering task. For this year's run, new and representative patterns have been added to the system for question analysis and answer extraction. A new ontology of question types has been included. The treatment of inexact questions has been improved. The information retrieval engine has been modified to consider only the keywords detected by the question analysis module. In addition, many PoS Tagger and SUPAR errors have been fixed and, finally, dictionaries of cities and countries have been incorporated. To deal with Cross-Lingual tasks, we employ the BRILI system. The achieved results are an overall accuracy of 37.89% for monolingual and 21.58% for bilingual tasks.


Language Resources and Evaluation | 2012

Web 2.0, Language Resources and standards to automatically build a multilingual Named Entity Lexicon

Antonio Toral; Sergio Ferrández; Monica Monachini; Rafael Muñoz

This paper proposes to advance the current state of the art of automatic Language Resource (LR) building by taking into consideration three elements: (1) the knowledge available in existing LRs, (2) the vast amount of information available from the collaborative paradigm that has emerged from the Web 2.0 and (3) the use of standards to improve interoperability. We present a case study in which a set of LRs for different languages (WordNet for English and Spanish and Parole-Simple-Clips for Italian) are extended with Named Entities (NE) by exploiting Wikipedia and the aforementioned LRs. The practical result is a multilingual NE lexicon connected to these LRs and to two ontologies: SUMO and SIMPLE. Furthermore, the paper addresses interoperability, an important problem currently affecting the Computational Linguistics area, by making use of the ISO LMF standard to encode this lexicon. The different steps of the procedure (mapping, disambiguation, extraction, NE identification and postprocessing) are comprehensively explained and evaluated. The resulting resource contains 974,567, 137,583 and 125,806 NEs for English, Spanish and Italian respectively. Finally, in order to check the usefulness of the constructed resource, we apply it to a state-of-the-art Question Answering system and evaluate its impact; the NE lexicon improves the system’s accuracy by 28.1%. Compared to previous approaches to building NE repositories, the current proposal represents a step forward in terms of automation, language independence, amount of NEs acquired and richness of the information represented.


International Conference on Computational Linguistics | 2009

A Parallel Corpus Labeled Using Open and Restricted Domain Ontologies

Ester Boldrini; Sergio Ferrández; Rubén Izquierdo; David Tomás; José L. Vicedo

The analysis and creation of annotated corpora is fundamental for implementing natural language processing solutions based on machine learning. In this paper we present a parallel corpus of 4500 questions in Spanish and English in the tourism domain, obtained from real users. With the aim of training a question answering system, the questions were labeled with the expected answer type, according to two different ontologies. The first one is an open domain ontology based on Sekine's Extended Named Entity Hierarchy, while the second one is a restricted domain ontology, specific to the tourism field. Due to the use of two ontologies with different characteristics, we had to solve many problematic cases and adjust our annotation to the characteristics of each one. We present the analysis of the domain coverage of these ontologies and the results of the inter-annotator agreement. Finally, we use a question classification system to evaluate the labeling of the corpus.


Conference on Human System Interactions | 2009

A proposal of Expected Answer Type and Named Entity annotation in a Question Answering context

Ester Boldrini; Sergio Ferrández; Rubén Izquierdo; David Tomás; Óscar Ferrández; José L. Vicedo

This paper presents our research related to automatic Expected Answer Type and Named Entity annotation tasks in a Question Answering context. We present the initial step of our research, in which we created the annotation guidelines. We therefore show and justify the tag set employed in the annotation of a collection of questions, and finally, different evaluations in order to test the consistency of the labelled corpus are also presented.


International Conference on Computational Linguistics | 2009

The Negative Effect of Machine Translation on Cross-Lingual Question Answering

Sergio Ferrández; Antonio Ferrández

This paper presents a study of the negative effect of Machine Translation (MT) on the precision of Cross-Lingual Question Answering (CL-QA). For this research, an English-Spanish Question Answering (QA) system is used, together with the sets of 200 official questions from CLEF 2004 and 2006. The CL experimental evaluation using MT reveals that the precision of the system drops by around 30% with regard to the monolingual Spanish task. Our main contribution consists of a taxonomy of the identified errors caused by using MT and of how the errors can be overcome by using our proposals. An experimental evaluation shows that our approach performs better than MT tools, while also contributing to this CL-QA system being ranked first at English-Spanish QA CLEF 2006.

Collaboration


Dive into Sergio Ferrández's collaborations.

Top Co-Authors

Antonio Ferrández

University of Wolverhampton
