Publication


Featured research published by Horacio Saggion.


Language Resources and Evaluation | 2004

MEAD - A Platform for Multidocument Multilingual Text Summarization

Dragomir R. Radev; Timothy Allison; Sasha Blair-Goldensohn; John Blitzer; Arda Çelebi; Stanko Dimitrov; Elliott Franco Drábek; Ali Hakim; Wai Lam; Danyu Liu; Jahna Otterbacher; Hong Qi; Horacio Saggion; Simone Teufel; Michael Topper; Adam Winkel; Zhu Zhang

This paper describes the functionality of MEAD, a comprehensive, public domain, open source, multidocument multilingual summarization environment that has thus far been downloaded by more than 500 organizations. MEAD has been used in a variety of summarization applications ranging from summarization for mobile devices to Web page summarization within a search engine and to novelty detection.


International Semantic Web Conference | 2007

Ontology-based information extraction for business intelligence

Horacio Saggion; Adam Funk; Diana Maynard; Kalina Bontcheva

Business Intelligence (BI) requires the acquisition and aggregation of key pieces of knowledge from multiple sources in order to provide valuable information to customers or feed statistical BI models and tools. The massive amount of information available to business analysts makes information extraction and other natural language processing tools key enablers for the acquisition and use of that semantic information. We describe the application of ontology-based extraction and merging in the context of a practical e-business application for the EU MUSING Project where the goal is to gather international company intelligence and country/region information. The results of our experiments so far are very promising and we are now in the process of building a complete end-to-end solution.


Computational Linguistics | 2002

Generating indicative-informative summaries with SumUM

Horacio Saggion; Guy Lapalme

We present and evaluate SumUM, a text summarization system that takes a raw technical text as input and produces an indicative-informative summary. The indicative part of the summary identifies the topics of the document, and the informative part elaborates on some of these topics according to the reader's interest. SumUM motivates the topics, describes entities, and defines concepts. It is a first step for exploring the issue of dynamic summarization. This is accomplished through a process of shallow syntactic and semantic analysis, concept identification, and text regeneration. Our method was developed through the study of a corpus of abstracts written by professional abstractors. Relying on human judgment, we have evaluated indicativeness, informativeness, and text acceptability of the automatic summaries. The results thus far indicate good performance when compared with other summarization technologies.


Natural Language Engineering | 2002

Architectural elements of language engineering robustness

Diana Maynard; Valentin Tablan; Hamish Cunningham; Cristian Ursu; Horacio Saggion; Kalina Bontcheva; Yorick Wilks

We discuss robustness in LE systems from the perspective of engineering, and the predictability of both outputs and construction process that this entails. We present an architectural system that contributes to engineering robustness and low-overhead systems development (GATE, a General Architecture for Text Engineering). To verify our ideas we present results from the development of a multi-purpose cross-genre Named Entity recognition system. This system aims to be robust across diverse input types, and to reduce the need for costly and time-consuming adaptation of systems to new applications, with its capability to process texts from widely differing domains and genres.


Meeting of the Association for Computational Linguistics | 2003

Evaluation Challenges in Large-Scale Document Summarization

Dragomir R. Radev; Simone Teufel; Horacio Saggion; Wai Lam; John Blitzer; Hong Qi; Arda Çelebi; Danyu Liu; Elliott Franco Drábek

We present a large-scale meta-evaluation of eight evaluation measures for both single-document and multi-document summarizers. To this end we built a corpus consisting of (a) 100 million automatic summaries using six summarizers and baselines at ten summary lengths in both English and Chinese, (b) more than 10,000 manual abstracts and extracts, and (c) 200 million automatic document and summary retrievals using 20 queries. We present both qualitative and quantitative results showing the strengths and drawbacks of all evaluation methods and how they rank the different summarizers.


North American Chapter of the Association for Computational Linguistics | 2000

Concept identification and presentation in the context of technical text summarization

Horacio Saggion; Guy Lapalme

We describe a method of text summarization that produces indicative-informative abstracts for technical papers. The abstracts are generated by a process of conceptual identification, topic extraction and re-generation. We have carried out an evaluation to assess indicativeness and text acceptability relying on human judgment. The results so far indicate good performance in both tasks when compared with other summarization technologies.


Multi-source, Multilingual Information Extraction and Summarization | 2013

Automatic Text Summarization: Past, Present and Future

Horacio Saggion; Thierry Poibeau

Automatic text summarization, the computer-based production of condensed versions of documents, is an important technology for the information society. Without summaries it would be practically impossible for human beings to get access to the ever-growing mass of information available online. Although research in text summarization is over 50 years old, some efforts are still needed given the insufficient quality of automatic summaries and the number of interesting summarization topics being proposed in different contexts by end users (“domain-specific summaries”, “opinion-oriented summaries”, “update summaries”, etc.). This paper gives a short overview of summarization methods and evaluation.


Data and Knowledge Engineering | 2004

Multimedia indexing through multi-source and multi-language information extraction: the MUMIS project

Horacio Saggion; Hamish Cunningham; Kalina Bontcheva; Diana Maynard; Oana Hamza; Yorick Wilks

We describe our work on information extraction from multiple sources for the Multimedia Indexing and Searching Environment, a project aiming at developing technology to produce formal annotations about essential events in multimedia programme material. The creation of a composite index from multiple and multi-lingual sources is a unique aspect of this project. The domain chosen for tuning the software components and testing is football. Our information extraction system is based on the use of finite state machinery pipelined with full semantic analysis and discourse interpretation.


International Conference on Human-Computer Interaction | 2013

Frequent Words Improve Readability and Short Words Improve Understandability for People with Dyslexia

Luz Rello; Ricardo A. Baeza-Yates; Laura Dempere-Marco; Horacio Saggion

Around 10% of the population has dyslexia, a reading disability that negatively affects a person’s ability to read and comprehend texts. Previous work has studied how to optimize the text layout, but adapting the text content has not received that much attention. In this paper, we present an eye-tracking study that investigates if people with dyslexia would benefit from content simplification. In an experiment with 46 people, 23 with dyslexia and 23 as a control group, we compare texts where words were substituted by shorter/longer and more/less frequent synonyms. Using more frequent words caused the participants with dyslexia to read significantly faster, while the use of shorter words caused them to understand the text better. Amongst the control group, no significant effects were found. These results provide evidence that people with dyslexia may benefit from interactive tools that perform lexical simplification.


International Conference on Computational Linguistics | 2002

Meta-evaluation of summaries in a cross-lingual environment using content-based metrics

Horacio Saggion; Simone Teufel; Dragomir R. Radev; Wai Lam

We describe a framework for the evaluation of summaries in English and Chinese using similarity measures. The framework can be used to evaluate extractive, non-extractive, single and multi-document summarization. We focus on the resources developed that are made available for the research community.

Collaboration


Dive into Horacio Saggion's collaborations.

Top Co-Authors

Sanja Štajner

University of Wolverhampton

Luz Rello

Carnegie Mellon University

Stefan Bott

Pompeu Fabra University

Yorick Wilks

University of Sheffield
