Publication


Featured research published by Simone Teufel.


Language Resources and Evaluation | 2004

MEAD - A Platform for Multidocument Multilingual Text Summarization

Dragomir R. Radev; Timothy Allison; Sasha Blair-Goldensohn; John Blitzer; Arda Çelebi; Stanko Dimitrov; Elliott Franco Drábek; Ali Hakim; Wai Lam; Danyu Liu; Jahna Otterbacher; Hong Qi; Horacio Saggion; Simone Teufel; Michael Topper; Adam Winkel; Zhu Zhang

This paper describes the functionality of MEAD, a comprehensive, public domain, open source, multidocument multilingual summarization environment that has been thus far downloaded by more than 500 organizations. MEAD has been used in a variety of summarization applications ranging from summarization for mobile devices to Web page summarization within a search engine and to novelty detection.


Empirical Methods in Natural Language Processing | 2006

Automatic classification of citation function

Simone Teufel; Advaith Siddharthan; Dan Tidhar

Citation function is defined as the author's reason for citing a given paper (e.g. acknowledgement of the use of the cited method). The automatic recognition of the rhetorical function of citations in scientific text has many applications, from improvement of impact factor calculations to text summarisation and more informative citation indexers. We show that our annotation scheme for citation function is reliable, and present a supervised machine learning framework to automatically classify citation function, using both shallow and linguistically-inspired features. We find, amongst other things, a strong relationship between citation function and sentiment classification.


Conference of the European Chapter of the Association for Computational Linguistics | 1999

An annotation scheme for discourse-level argumentation in research articles

Simone Teufel; Jean Carletta; Marc Moens

In order to build robust automatic abstracting systems, there is a need for better training resources than are currently available. In this paper, we introduce an annotation scheme for scientific articles which can be used to build such a resource in a consistent way. The seven categories of the scheme are based on rhetorical moves of argumentation. Our experimental results show that the scheme is stable, reproducible and intuitive to use.


Empirical Methods in Natural Language Processing | 2009

Towards Domain-Independent Argumentative Zoning: Evidence from Chemistry and Computational Linguistics

Simone Teufel; Advaith Siddharthan; Colin R. Batchelor

Argumentative Zoning (AZ) is an analysis of the argumentative and rhetorical structure of a scientific paper. It has been shown to be reliably used by independent human coders, and has proven useful for various information access tasks. Annotation experiments have, however, so far been restricted to one discipline, computational linguistics (CL). Here, we present a more informative AZ scheme with 15 categories in place of the original 7, and show that it can be applied to the life sciences as well as to CL. We use a domain expert to encode basic knowledge about the subject (such as terminology and domain-specific rules for individual categories) as part of the annotation guidelines. Our results show that non-expert human coders can then use these guidelines to reliably annotate this scheme in two domains, chemistry and computational linguistics.


North American Chapter of the Association for Computational Linguistics | 2003

Examining the consensus between human summaries: initial experiments with factoid analysis

Hans van Halteren; Simone Teufel

We present a new approach to summary evaluation which combines two novel aspects, namely (a) content comparison between gold standard summary and system summary via factoids, a pseudo-semantic representation based on atomic information units which can be robustly marked in text, and (b) use of a gold standard consensus summary, in our case based on 50 individual summaries of one text. Even though future work on more than one source text is imperative, our experiments indicate that (1) ranking with regard to a single gold standard summary is insufficient as rankings based on any two randomly chosen summaries are very dissimilar (correlations average ρ = 0.20), (2) a stable consensus summary can only be expected if a larger number of summaries are collected (in the range of at least 30 to 40 summaries), and (3) similarity measurement using unigrams shows a similarly low ranking correlation when compared with factoid-based ranking.


Meeting of the Association for Computational Linguistics | 2003

Evaluation Challenges in Large-Scale Document Summarization

Dragomir R. Radev; Simone Teufel; Horacio Saggion; Wai Lam; John Blitzer; Hong Qi; Arda Çelebi; Danyu Liu; Elliott Franco Drábek

We present a large-scale meta-evaluation of eight evaluation measures for both single-document and multi-document summarizers. To this end we built a corpus consisting of (a) 100 million automatic summaries using six summarizers and baselines at ten summary lengths in both English and Chinese, (b) more than 10,000 manual abstracts and extracts, and (c) 200 million automatic document and summary retrievals using 20 queries. We present both qualitative and quantitative results showing the strengths and drawbacks of all evaluation methods and how they rank the different summarizers.


ANARESOLUTION '97 Proceedings of a Workshop on Operational Factors in Practical, Robust Anaphora Resolution for Unrestricted Texts | 1997

Resolving bridging references in unrestricted text

Massimo Poesio; Renata Vieira; Simone Teufel

Our goal is to develop a system capable of treating the largest possible subset of definite descriptions in unrestricted written texts. A previous prototype resolved anaphoric uses of definite descriptions and identified some types of first-mention uses, achieving a recall of 56%. In this paper we present the latest version of our system, which handles some types of bridging references, uses WordNet as a source of lexical knowledge, and achieves a recall of 65%.


Computational Linguistics | 2013

Statistical metaphor processing

Ekaterina Shutova; Simone Teufel; Anna Korhonen

Metaphor is highly frequent in language, which makes its computational processing indispensable for real-world NLP applications addressing semantic tasks. Previous approaches to metaphor modeling rely on task-specific hand-coded knowledge and operate on a limited domain or a subset of phenomena. We present the first integrated open-domain statistical model of metaphor processing in unrestricted text. Our method first identifies metaphorical expressions in running text and then paraphrases them with their literal paraphrases. Such a text-to-text model of metaphor interpretation is compatible with other NLP applications that can benefit from metaphor resolution. Our approach is minimally supervised, relies on state-of-the-art parsing and lexical acquisition technologies (distributional clustering and selectional preference induction), and operates with high accuracy.


Conference on Information and Knowledge Management | 2008

Comparing citation contexts for information retrieval

Anna Ritchie; Stephen E. Robertson; Simone Teufel

In previous work, we have shown that using terms from around citations in citing papers to index the cited paper, in addition to the cited paper's own terms, can improve retrieval effectiveness. Now, we investigate how to select text from around the citations in order to extract good index terms. We compare the retrieval effectiveness that results from a range of contexts around the citations, including no context, the entire citing paper, some fixed windows and several variations with linguistic motivations. We conclude with an analysis of the benefits of more complex, linguistically motivated methods for extracting citation index terms, over using a fixed window of terms. We speculate that there might be some advantage to using computational linguistic techniques for this task.


ACM/IEEE Joint Conference on Digital Libraries | 2001

PERSIVAL, a system for personalized search and summarization over multimedia healthcare information

Kathleen R. McKeown; Shih-Fu Chang; James J. Cimino; Steven Feiner; Carol Friedman; Luis Gravano; Vasileios Hatzivassiloglou; Steven Johnson; Desmond A. Jordan; Judith L. Klavans; Andre W. Kushniruk; Vimla L. Patel; Simone Teufel

In healthcare settings, patients need access to online information that can help them understand their medical situation. Physicians need information that is clinically relevant to an individual patient. In this paper, we present our progress on developing a system, PERSIVAL, that is designed to provide personalized access to a distributed patient care digital library. Using the secure, online patient records at New York Presbyterian Hospital as a user model, PERSIVAL's components tailor search, presentation and summarization of online multimedia information to both patients and healthcare providers.

Collaboration



Top Co-Authors

Marc Moens

University of Edinburgh

Wai Lam

The Chinese University of Hong Kong

Anna Ritchie

University of Cambridge

Takenobu Tokunaga

Tokyo Institute of Technology
