David Tomás
University of Alicante
Publication
Featured research published by David Tomás.
Journal of Web Semantics | 2011
Óscar Ferrández; Christian Spurk; Milen Kouylekov; Iustin Dornescu; Sergio Ferrández; Matteo Negri; Rubén Izquierdo; David Tomás; Constantin Orasan; Guenter Neumann; Bernardo Magnini; José L. Vicedo
This paper presents the QALL-ME Framework, a reusable architecture for building multi- and cross-lingual Question Answering (QA) systems working on structured data modelled by an ontology. It is released as free open source software with a set of demo components and extensive documentation, which makes it easy to use and adapt. The main characteristics of the QALL-ME Framework are: (i) its domain portability, achieved by an ontology modelling the target domain; (ii) the context awareness regarding space and time of the question; (iii) the use of textual entailment engines as the core of the question interpretation; and (iv) an architecture based on Service Oriented Architecture (SOA), which is realized using interchangeable web services for the framework components. Furthermore, we present a running example to clarify how the framework processes questions as well as a case study that shows a QA application built as an instantiation of the QALL-ME Framework for cinema/movie events in the tourism domain.
cross language evaluation forum | 2005
Sandra Roger; Sergio Ferrández; Antonio Ferrández; Jesús Peral; Fernando Llopis; Antonia Aguilar; David Tomás
Question Answering is a major research topic at the University of Alicante. For this reason, this year two groups participated in the QA@CLEF track using different approaches. In this paper we describe the work of the Alicante 2 group: AliQAn, a monolingual open-domain Question Answering (QA) system developed in the Department of Language Processing and Information Systems at the University of Alicante for the CLEF-2005 Spanish monolingual QA evaluation task. Our approach is based fundamentally on the use of syntactic pattern recognition in order to identify possible answers. In addition, Word Sense Disambiguation (WSD) is applied to improve the system. The results achieved (overall accuracy of 33%) are shown and discussed in the paper.
Knowledge and Information Systems | 2013
David Tomás; José L. Vicedo
This article presents a minimally supervised approach to question classification on fine-grained taxonomies. We have defined an algorithm that automatically obtains lists of weighted terms for each class in the taxonomy, thus identifying which terms are highly related to the classes and are highly discriminative between them. These lists have then been applied to the task of question classification. Our approach is based on the divergence of probability distributions of terms in plain text retrieved from the Web. A corpus of questions with which to train the classifier is therefore not necessary. As the system is based purely on statistical information, it does not require additional linguistic resources or tools. The experiments were performed on English questions and their Spanish translations. The results reveal that our system surpasses current supervised approaches in this task, obtaining a significant improvement in the experiments carried out.
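The idea of weighting terms by how much their distribution in one class diverges from the overall distribution can be illustrated with a minimal sketch. The abstract does not state which divergence measure the authors use, so the per-term Kullback-Leibler contribution below is an assumption, as are the function names `term_weights` and `classify`, the add-one smoothing, and the toy data.

```python
import math
from collections import Counter


def term_weights(class_docs, smoothing=1.0):
    """Weight each term per class by its contribution to KL(class || background).

    class_docs maps a class label to a list of tokenised texts.
    The divergence measure and the smoothing are illustrative assumptions,
    not the method described in the paper.
    """
    counts = {
        label: Counter(tok for doc in docs for tok in doc)
        for label, docs in class_docs.items()
    }
    vocab = set().union(*counts.values())
    background = Counter()
    for class_counts in counts.values():
        background.update(class_counts)
    bg_total = sum(background.values()) + smoothing * len(vocab)

    weights = {}
    for label, class_counts in counts.items():
        total = sum(class_counts.values()) + smoothing * len(vocab)
        weights[label] = {}
        for term in vocab:
            p = (class_counts[term] + smoothing) / total   # P(term | class)
            q = (background[term] + smoothing) / bg_total  # P(term) overall
            weights[label][term] = p * math.log(p / q)
    return weights


def classify(question_tokens, weights):
    """Pick the class whose weighted term list best matches the question."""
    scores = {
        label: sum(w.get(tok, 0.0) for tok in question_tokens)
        for label, w in weights.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data standing in for text retrieved from the Web.
toy = {
    "city": [["capital", "city", "town"], ["city", "mayor"]],
    "person": [["actor", "born", "person"], ["person", "president"]],
}
toy_weights = term_weights(toy)
predicted = classify(["which", "city"], toy_weights)
```

Terms that are frequent in one class and rare elsewhere receive a positive weight for that class and a negative weight for the others, so no labelled questions are needed to build the lists.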
international conference natural language processing | 2006
David Tomás; José L. Vicedo; Empar Bisbal; Lidia Moreno
Question classification is one of the first tasks carried out in a Question Answering system. In this paper we present a multilingual question classification system based on machine learning techniques. We use Support Vector Machines to classify the questions. All the features needed to train and test this method are automatically extracted through statistical information in an unsupervised way, comparing Poisson distributions of single words in two plain corpora of questions and documents. Thus, we need nothing but plain text to train the system, obtaining a flexible approach easy to adapt to new languages and domains. We have tested it on a bilingual corpus of questions in English and Spanish.
mexican international conference on artificial intelligence | 2005
Empar Bisbal; David Tomás; Lidia Moreno; José L. Vicedo; Armando Suárez
Question Classification (QC) is usually the first stage in a Question Answering system. This paper presents a multilingual SVM-based question classification system aiming to be language and domain independent. For this purpose, we use only surface text features. The system has been tested on the TREC QA track question set, obtaining encouraging results.
International Journal of Virtual Communities and Social Networking | 2015
Andreas Menychtas; David Tomás; Marco Tiemann; Christina Santzaridou; Alexandros Psychas; Dimosthenis Kyriazis; Juan Vicente Vidagany Espert; Stuart Campbell
Today's generation of Internet devices has changed how users interact with media, from passive and unidirectional users to proactive and interactive ones. Users can use these devices to comment on or rate a TV show and search for related information regarding characters, facts or personalities. This phenomenon is known as second screen. This paper describes SAM, an EU-funded research project that focuses on developing an advanced digital media delivery platform based on second screen interaction and content syndication within a social media context, providing open and standardised ways of characterising, discovering and syndicating digital assets. This work provides an overview of the project and its main objectives, focusing on the NLP challenges to be faced and the technologies developed so far.
international conference on computational linguistics | 2009
Ester Boldrini; Sergio Ferrández; Rubén Izquierdo; David Tomás; José L. Vicedo
The analysis and creation of annotated corpora is fundamental for implementing natural language processing solutions based on machine learning. In this paper we present a parallel corpus of 4500 questions in Spanish and English on the tourism domain, obtained from real users. With the aim of training a question answering system, the questions were labeled with the expected answer type, according to two different ontologies. The first one is an open domain ontology based on Sekine's Extended Named Entity Hierarchy, while the second one is a restricted domain ontology, specific to the tourism field. Due to the use of two ontologies with different characteristics, we had to solve many problematic cases and adjust our annotation to the characteristics of each one. We present the analysis of the domain coverage of these ontologies and the results of the inter-annotator agreement. Finally, we use a question classification system to evaluate the labeling of the corpus.
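Inter-annotator agreement on categorical labels such as expected answer types is commonly summarised with a chance-corrected statistic. The abstract does not say which measure was reported, so the sketch below uses Cohen's kappa for two annotators as an assumption; the function name `cohen_kappa` and the example labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical answer-type labels from two annotators for four questions.
annotator_1 = ["LOC", "LOC", "PER", "PER"]
annotator_2 = ["LOC", "LOC", "PER", "LOC"]
kappa = cohen_kappa(annotator_1, annotator_2)  # 0.5 for this toy example
```

Here the annotators agree on three of four items (0.75 observed) while 0.5 agreement is expected by chance, which gives a kappa of 0.5.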
conference on human system interactions | 2009
Ester Boldrini; Sergio Ferrández; Rubén Izquierdo; David Tomás; Óscar Ferrández; José L. Vicedo
This paper presents our research related to automatic Expected Answer Type and Named Entity annotation tasks in a Question Answering context. We present the initial step of our research, in which we created the annotation guidelines. We therefore show and justify the tag set employed in the annotation of a collection of questions, and finally, different evaluations in order to test the consistency of the labelled corpus are also presented.
text speech and dialogue | 2007
David Tomás; Jose-Luis Vicedo
In this paper we present a novel multiple-taxonomy question classification system, facing the challenge of assigning categories in multiple taxonomies to natural language questions. We applied our system to category search on faceted information. The system provides a natural language interface to faceted information, detecting the categories requested by the user and narrowing down the document search space to those documents pertaining to the facet values identified. The system was developed in the framework of language modeling, and the models to detect categories are inferred directly from the corpus of documents.
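One way to picture category detection over several taxonomies with language models is sketched below: each facet value gets a unigram model estimated from its documents, and the question is assigned, per facet, the value whose model gives it the highest likelihood. The unigram model, the add-one smoothing, the names `log_likelihood` and `detect_facets`, and the toy corpus are all assumptions for illustration, not the system described in the paper.

```python
import math
from collections import Counter


def log_likelihood(tokens, counts, vocab_size):
    """Log-probability of tokens under an add-one smoothed unigram model."""
    total = sum(counts.values())
    # One extra slot in the denominator reserves mass for unseen tokens.
    return sum(
        math.log((counts[tok] + 1) / (total + vocab_size + 1)) for tok in tokens
    )


def detect_facets(tokens, facets):
    """Return the most likely value for every facet (taxonomy).

    facets maps a facet name to {facet value: list of tokenised documents}.
    """
    result = {}
    for facet, docs_by_value in facets.items():
        counts = {
            value: Counter(tok for doc in docs for tok in doc)
            for value, docs in docs_by_value.items()
        }
        vocab = set().union(*counts.values())
        result[facet] = max(
            counts,
            key=lambda value: log_likelihood(tokens, counts[value], len(vocab)),
        )
    return result


# Hypothetical faceted corpus with two taxonomies.
corpus = {
    "genre": {
        "comedy": [["funny", "comedy", "film"]],
        "horror": [["scary", "horror", "film"]],
    },
    "city": {
        "Alicante": [["alicante", "cinema"]],
        "Trento": [["trento", "cinema"]],
    },
}
detected = detect_facets(["funny", "film", "alicante"], corpus)
```

The detected facet values could then be used to restrict a document search to items tagged with those values.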
acm international conference on interactive experiences for tv and online video | 2015
Atta Badii; Marco Tiemann; Andreas Menychtas; Christina Santzaridou; Alexandros Psychas; David Tomás; Stuart Campbell; Juan Vicente Vidagany Espert
Social media services offer a wide range of opportunities for businesses and developers to exploit the vast amount of information and user-generated content produced via social media. In addition, the notion of TV second screen usage -- the interleaved usage of TV and smart devices such as smartphones -- appears ever more prominent, with viewers continuously seeking further information and deeper engagement while watching movies, TV shows or event coverage. In this work-in-progress contribution, we present SAM, an innovative platform that combines social media and content syndication and targets second screen usage to enhance media content provisioning and advance the user experience. SAM incorporates modern technologies and novel features in the areas of content management, dynamic social media, social mining, semantic annotation and multi-device representation to facilitate an advanced business environment for broadcasters, content and metadata providers and editors to better exploit their assets and increase revenues.