Publication


Featured research published by Alicia Ageno.


ACM Computing Surveys | 2006

Adaptive information extraction

Jordi Turmo; Alicia Ageno; Neus Català

The growing availability of online textual sources and the potential number of applications of knowledge acquisition from textual data have led to an increase in Information Extraction (IE) research. Some examples of these applications are the generation of databases from documents, as well as the acquisition of knowledge useful for emerging technologies like question answering, information integration, and others related to text mining. However, one of the main drawbacks of applying IE is its intrinsic domain dependence. To reduce the high cost of manually adapting IE applications to new domains, the research community has carried out experiments with different Machine Learning (ML) techniques. This survey describes and compares the main approaches to IE and the different ML techniques used to achieve Adaptive IE technology.


Machine Translation | 1995

Acquisition of lexical translation relations from MRDS

Ann Copestake; Ted Briscoe; Piek Vossen; Alicia Ageno; Irene Castellón; Francesc Ribas; German Rigau; Horacio Rodríguez; Anna Samiotou

In this paper we present a methodology for extracting information about lexical translation equivalences from the machine readable versions of conventional dictionaries (MRDs), and describe a series of experiments on semi-automatic construction of a linked multilingual lexical knowledge base for English, Dutch, and Spanish. We discuss the advantages and limitations of using MRDs that this has revealed, and some strategies we have developed to cover gaps where no direct translation can be found.


Knowledge Discovery and Data Mining | 2005

A hybrid unsupervised approach for document clustering

Mihai Surdeanu; Jordi Turmo; Alicia Ageno

We propose a hybrid, unsupervised document clustering approach that combines a hierarchical clustering algorithm with Expectation Maximization. We developed several heuristics to automatically select a subset of the clusters generated by the first algorithm as the initial points of the second one. Furthermore, our initialization algorithm generates not only an initial model for the iterative refinement algorithm but also an estimate of the model dimension, thus eliminating another important element of human supervision. We have evaluated the proposed system on five real-world document collections. The results show that our approach generates clustering solutions of higher quality than both its individual components.
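The pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the centroid-linkage merge rule, the `stop_dist2` threshold, the `min_size` seed filter, and the fixed-variance spherical Gaussian mixture are all assumptions chosen to keep the example short. The abstract does not specify the actual selection heuristics. The number of selected seeds plays the role of the estimated model dimension.

```python
import math

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def agglomerate(points, stop_dist2):
    """Centroid-linkage agglomerative clustering (assumed variant).
    Merges the closest pair until that pair is farther apart than stop_dist2."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist2(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > stop_dist2:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def select_seeds(clusters, min_size):
    """Hypothetical heuristic: keep only clusters with at least min_size members.
    The number of seeds returned is the estimated number of mixture components."""
    return [centroid(c) for c in clusters if len(c) >= min_size]

def em(points, seeds, var=1.0, iters=20):
    """EM for a spherical Gaussian mixture with fixed shared variance (assumption),
    initialized from the seed centroids. Returns (means, hard labels)."""
    means = [list(s) for s in seeds]
    k, dim = len(means), len(points[0])
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        for p in points:
            logs = [math.log(weights[c]) - dist2(p, means[c]) / (2 * var)
                    for c in range(k)]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            z = sum(exps)
            resp.append([e / z for e in exps])
        # M-step: re-estimate mixing weights and means.
        for c in range(k):
            total = max(sum(r[c] for r in resp), 1e-12)
            weights[c] = max(total / len(points), 1e-12)
            means[c] = [sum(r[c] * p[i] for r, p in zip(resp, points)) / total
                        for i in range(dim)]
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return means, labels

# Toy data: two tight groups plus one outlier that forms a singleton cluster.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10], [5, 30]]
clusters = agglomerate(points, stop_dist2=9.0)
seeds = select_seeds(clusters, min_size=2)
means, labels = em(points, seeds)
```

In this toy run the singleton outlier cluster is discarded by the seed filter, so EM starts with two components without the number of clusters being supplied by hand.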


Cross-Language Evaluation Forum | 2005

The TALP-QA system for Spanish at CLEF 2005

Daniel Ferrés; Samir Kanaan; Alicia Ageno; Edgar González; Horacio Rodríguez; Jordi Turmo

This paper describes the TALP-QA system in the context of the CLEF 2005 Spanish Monolingual Question Answering (QA) evaluation task. TALP-QA is a multilingual open-domain QA system that processes both factoid (normal and temporally restricted) and definition questions. The approach to factoid questions is based on in-depth NLP tools and resources to create a semantic information representation. Answers to definition questions are selected from the phrases that match a pattern from a manually constructed set of definitional patterns.


Cross-Language Evaluation Forum | 2004

The TALP-QA system for Spanish at CLEF 2004: structural and hierarchical relaxing of semantic constraints

Daniel Ferrés; Samir Kanaan; Alicia Ageno; Edgar González; Horacio Rodríguez; Mihai Surdeanu; Jordi Turmo

This paper describes TALP-QA, a multilingual open-domain Question Answering (QA) system that processes both factoid and definition questions. The system is described and evaluated in the context of our participation in the CLEF 2004 Spanish Monolingual QA task. Our approach to factoid questions is to build a semantic representation of the questions and the sentences in the passages retrieved for each question. A set of Semantic Constraints (SCs) is extracted for each question. An answer extraction algorithm extracts and ranks sentences that satisfy the SCs of the question. If no match is possible, the algorithm relaxes the SCs structurally (removing constraints) and/or hierarchically (abstracting the constraints using a taxonomy). Answers to definition questions are generated by selecting the text fragment with the highest density of the terms most frequently related to the question's target (the Named Entity (NE) that appears in the question) throughout the corpus.
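The two relaxation modes mentioned in this abstract can be illustrated with a small sketch. Everything here is hypothetical: the toy `TAXONOMY`, the representation of constraints as an importance-ordered list of semantic types, and the order in which relaxations are tried are assumptions for illustration, not the TALP-QA implementation. The sketch only reports which relaxation was needed and does not rank the matching sentences.

```python
# Hypothetical child -> parent taxonomy of semantic types.
TAXONOMY = {
    "city": "location",
    "country": "location",
    "location": "entity",
    "person": "entity",
}

def ancestors(sem_type):
    """Return the type followed by all of its ancestors in the taxonomy."""
    chain = [sem_type]
    while chain[-1] in TAXONOMY:
        chain.append(TAXONOMY[chain[-1]])
    return chain

def satisfies(constraint, sentence_types, level):
    """Check one constraint against the types found in a sentence.
    level = 0 is an exact match (or a subtype of the constraint);
    level = 1 abstracts the constraint one step up the taxonomy."""
    chain = ancestors(constraint)
    target = chain[min(level, len(chain) - 1)]
    return any(target in ancestors(t) for t in sentence_types)

def extract(constraints, sentences):
    """Try the full constraint set first, then relax.
    constraints: list of types, most important first (assumption).
    sentences: dict mapping sentence id -> set of semantic types.
    Returns (matching ids, constraints dropped, abstraction level)."""
    for dropped in range(len(constraints)):
        # Structural relaxation: drop the least important constraints.
        active = constraints[:len(constraints) - dropped]
        for level in (0, 1):
            # Hierarchical relaxation: abstract constraints via the taxonomy.
            hits = [sid for sid, types in sentences.items()
                    if all(satisfies(c, types, level) for c in active)]
            if hits:
                return hits, dropped, level
    return [], None, None

sentences = {"s1": {"person", "country"}, "s2": {"date"}}

# "city" is not present, but abstracting it to "location" matches "country".
hierarchical = extract(["person", "city"], sentences)

# "organization" cannot be abstracted here, so it is dropped instead.
structural = extract(["person", "organization"], sentences)
```

Trying hierarchical abstraction before dropping a constraint is a design choice of this sketch; the abstract only says the two relaxations may be applied separately or together.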


Cross-Language Evaluation Forum | 2005

The GeoTALP-IR system at GeoCLEF 2005: experiments using a QA-Based IR system, linguistic analysis, and a geographical thesaurus

Daniel Ferrés; Alicia Ageno; Horacio Rodríguez

This paper describes the GeoTALP-IR system, a Geographical Information Retrieval (GIR) system. The system is described and evaluated in the context of our participation in the CLEF 2005 GeoCLEF Monolingual English task. The GIR system is based on Lucene and uses a modified version of the Passage Retrieval module of the TALP Question Answering (QA) system presented at the CLEF 2004 and TREC 2004 QA evaluation tasks. We designed a Keyword Selection algorithm based on a Linguistic and Geographical Analysis of the topics. A Geographical Thesaurus (GT) has been built using a set of publicly available Geographical Gazetteers and a Geographical Ontology. Our experiments show that the use of a Geographical Thesaurus for Geographical Indexing and Retrieval has improved the performance of our GIR system.


Text, Speech and Dialogue | 2000

Extending Bidirectional Chart Parsing with a Stochastic Model

Alicia Ageno; Horacio Rodríguez

A method for stochastically modelling bidirectionality in chart parsing is presented. A bidirectional parser, which starts analysis from certain dynamically determined positions of the sentence (the islands), has been built. This island-driven parser uses the stochastic model to guide the recognition process. The system has been trained and tested over two wide-coverage corpora: Spanish Lexesp and English Penn Treebank. Results comparing our approach with the basic bottom-up approach are encouraging.


Text Retrieval Conference | 2004

TALP-QA System at TREC 2004: Structural and Hierarchical Relaxation Over Semantic Constraints

Daniel Ferrés; Samir Kanaan; Edgar González; Alicia Ageno; Horacio Rodríguez; Mihai Surdeanu; Jordi Turmo


International Conference on Computational Linguistics | 1994

TGE: Tlinks Generation Environment

Alicia Ageno; Francesc Ribas; German Rigau; Horacio Rodríguez; Anna Samiotou


CLEF (Working Notes) | 2004

TALP-QA System for Spanish at CLEF-2004

Alicia Ageno; Daniel Ferrés; Edgar González; Samir Kanaan; Horacio Rodríguez; Mihai Surdeanu; Jordi Turmo

Collaboration


Dive into Alicia Ageno's collaborations.

Top Co-Authors

Horacio Rodríguez

Polytechnic University of Catalonia

Jordi Turmo

Polytechnic University of Catalonia

Edgar González

Polytechnic University of Catalonia

Samir Kanaan

Polytechnic University of Catalonia

Anna Samiotou

Polytechnic University of Catalonia

Antonia Soler

Polytechnic University of Catalonia
