Simone Paolo Ponzetto

Publication

Featured research published by Simone Paolo Ponzetto.

Artificial Intelligence | 2012

BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network

Roberto Navigli; Simone Paolo Ponzetto

We present an automatic approach to the construction of BabelNet, a very large, wide-coverage multilingual semantic network. Key to our approach is the integration of lexicographic and encyclopedic knowledge from WordNet and Wikipedia. In addition, Machine Translation is applied to enrich the resource with lexical information for all languages. We first conduct in vitro experiments on new and existing gold-standard datasets to show the high quality and coverage of BabelNet. We then show that our lexical resource can be used successfully to perform both monolingual and cross-lingual Word Sense Disambiguation: thanks to its wide lexical coverage and novel semantic relations, we are able to achieve state-of-the-art results on three different SemEval evaluation tasks.

language and technology conference | 2006

Exploiting Semantic Role Labeling, WordNet and Wikipedia for Coreference Resolution

Simone Paolo Ponzetto; Michael Strube

In this paper we present an extension of a machine learning based coreference resolution system which uses features induced from different semantic knowledge sources. These features represent knowledge mined from WordNet and Wikipedia, as well as information about semantic role labels. We show that semantic features indeed improve the performance on different referring expression types such as pronouns and common nouns.

Journal of Artificial Intelligence Research | 2007

Knowledge derived from Wikipedia for computing semantic relatedness

Simone Paolo Ponzetto; Michael Strube

Wikipedia provides a semantic network for computing semantic relatedness in a more structured fashion than a search engine and with more coverage than WordNet. We present experiments on using Wikipedia for computing semantic relatedness and compare it to WordNet on various benchmarking datasets. Existing relatedness measures perform better using Wikipedia than a baseline given by Google counts, and we show that Wikipedia outperforms WordNet on some datasets. We also address the question whether and how Wikipedia can be integrated into NLP applications as a knowledge base. Including Wikipedia improves the performance of a machine learning based coreference resolution system, indicating that it represents a valuable resource for NLP applications. Finally, we show that our method can be easily used for languages other than English by computing semantic relatedness for a German dataset.

Artificial Intelligence | 2013

Collaboratively built semi-structured content and Artificial Intelligence: The story so far

Eduard H. Hovy; Roberto Navigli; Simone Paolo Ponzetto

Recent years have seen a great deal of work that exploits collaborative, semi-structured content for Artificial Intelligence (AI) and Natural Language Processing (NLP). This special issue of the Artificial Intelligence Journal presents a variety of state-of-the-art contributions, each of which illustrates the substantial impact that work on leveraging semi-structured content is having on AI and NLP as it continuously fosters new directions of cutting-edge research. We contextualize the papers collected in this special issue by providing a detailed overview of previous work on collaborative, semi-structured resources. The survey is made up of two main logical parts: in the first part, we present the main characteristics of collaborative resources that make them attractive for AI and NLP research; in the second part, we present an overview of how these features have been exploited to tackle a variety of long-standing issues in the two fields, in particular the acquisition of large amounts of machine-readable knowledge, and its application to a wide range of tasks. The overall picture shows that not only are semi-structured resources enabling a renaissance of knowledge-rich AI techniques, but also that significant advances in high-end applications that require deep understanding capabilities can be achieved by synergistically exploiting large amounts of machine-readable structured knowledge in combination with sound statistical AI and NLP techniques.

Artificial Intelligence | 2011

Taxonomy induction based on a collaboratively built knowledge repository

Simone Paolo Ponzetto; Michael Strube

The category system in Wikipedia can be taken as a conceptual network. We label the semantic relations between categories using methods based on connectivity in the network and lexico-syntactic matching. The result is a large scale taxonomy. For evaluation we propose a method which (1) manually determines the quality of our taxonomy, and (2) automatically compares its coverage with ResearchCyc, one of the largest manually created ontologies, and the lexical database WordNet. Additionally, we perform an extrinsic evaluation by computing semantic similarity between words in benchmarking datasets. The results show that the taxonomy compares favorably in quality and coverage with broad-coverage manually created resources.

meeting of the association for computational linguistics | 2008

BART: A Modular Toolkit for Coreference Resolution

Yannick Versley; Simone Paolo Ponzetto; Massimo Poesio; Vladimir Eidelman; Alan Jern; Jason Smith; Xiaofeng Yang; Alessandro Moschitti

Developing a full coreference system able to run all the way from raw text to semantic interpretation is a considerable engineering effort, yet there is very limited availability of off-the-shelf tools for researchers whose interests are not in coreference, or for researchers who want to concentrate on a specific aspect of the problem. We present BART, a highly modular toolkit for developing coreference applications. In the Johns Hopkins workshop on using lexical and encyclopedic knowledge for entity disambiguation, the toolkit was used to extend a reimplementation of the Soon et al. (2001) proposal with a variety of additional syntactic and knowledge-based features, and experiment with alternative resolution processes, preprocessing tools, and classifiers.

conference of the european chapter of the association for computational linguistics | 2006

Semantic role labeling for coreference resolution

Simone Paolo Ponzetto; Michael Strube

Extending a machine learning based coreference resolution system with a feature capturing automatically generated information about semantic roles improves its performance.

european semantic web conference | 2014

A Probabilistic Approach for Integrating Heterogeneous Knowledge Sources

Arnab Dutta; Christian Meilicke; Simone Paolo Ponzetto

Open Information Extraction (OIE) systems like Nell and ReVerb have achieved impressive results by harvesting massive amounts of machine-readable knowledge with minimal supervision. However, the knowledge bases they produce still lack a clean, explicit semantic data model. This, on the other hand, could be provided by full-fledged semantic networks like DBpedia or Yago, which, in turn, could benefit from the additional coverage provided by Web-scale IE. In this paper, we bring these two strains of research together, and present a method to align terms from Nell with instances in DBpedia. Our approach is unsupervised in nature and relies on two key components. First, we automatically acquire probabilistic type information for Nell terms given a set of matching hypotheses. Second, we view the mapping task as the statistical inference problem of finding the most likely coherent mapping – i.e., the maximum a posteriori (MAP) mapping – based on the outcome of the first component used as soft constraint. These two steps are highly intertwined: accordingly, we propose an approach that iteratively refines type acquisition based on the output of the mapping generator, and vice versa. Experimental results on gold-standard data indicate that our approach outperforms a strong baseline, and is able to produce ever-improving mappings consistently across iterations.

conference on information and knowledge management | 2015

Ranking Entities for Web Queries Through Text and Knowledge

Michael Schuhmacher; Laura Dietz; Simone Paolo Ponzetto

When humans explain complex topics, they naturally talk about involved entities, such as people, locations, or events. In this paper, we aim at automating this process by retrieving and ranking entities that are relevant to understand free-text web-style queries like Argentine British relations, which typically demand a set of heterogeneous entities with no specific target type like, for instance, Falklands_War or Margaret_Thatcher, as answer. Standard approaches to entity retrieval rely purely on features from the knowledge base. We approach the problem from the opposite direction, namely by analyzing web documents that are found to be query-relevant. Our approach hinges on entity linking technology that identifies entity mentions and links them to a knowledge base like Wikipedia. We use a learning-to-rank approach and study different features that use documents, entity mentions, and knowledge base entities -- thus bridging document and entity retrieval. Since established benchmarks for this problem do not exist, we use TREC test collections for document ranking and collect custom relevance judgments for entities. Experiments on TREC Robust04 and TREC Web13/14 data show that: i) single entity features, like the frequency of occurrence within the top-ranked documents, or the query retrieval score against a knowledge base, perform generally well; ii) the best overall performance is achieved when combining different features that relate an entity to the query, its document mentions, and its knowledge base representation.

meeting of the association for computational linguistics | 2007