
Diego Marcheggiani

Istituto di Scienza e Tecnologie dell'Informazione



Publication

Featured research published by Diego Marcheggiani.

European Conference on Information Retrieval | 2014

Hierarchical Multi-label Conditional Random Fields for Aspect-Oriented Opinion Mining

Diego Marcheggiani; Oscar Täckström; Andrea Esuli; Fabrizio Sebastiani

A common feature of many online review sites is the use of an overall rating that summarizes the opinions expressed in a review. Unfortunately, these document-level ratings do not provide any information about the opinions contained in the review that concern a specific aspect (e.g., cleanliness) of the product being reviewed (e.g., a hotel). In this paper we study the finer-grained problem of aspect-oriented opinion mining at the sentence level, which consists of predicting, for all sentences in the review, whether the sentence expresses a positive, neutral, or negative opinion (or no opinion at all) about a specific aspect of the product. For this task we propose a set of increasingly powerful models based on conditional random fields (CRFs), including a hierarchical multi-label CRF scheme that jointly models the overall opinion expressed in the review and the set of aspect-specific opinions expressed in each of its sentences. We evaluate the proposed models against a dataset of hotel reviews (which we here make publicly available) in which the set of aspects and the opinions expressed concerning them are manually annotated at the sentence level. We find that both hierarchical and multi-label factors lead to improved predictions of aspect-oriented opinions.

Theory and Practice of Digital Libraries | 2012

Metadata enrichment services for the Europeana digital library

Giacomo Berardi; Andrea Esuli; Sergiu Gordea; Diego Marcheggiani; Fabrizio Sebastiani

We demonstrate a metadata enrichment system for the Europeana digital library. The system allows the different institutions that provide Europeana with pointers to their content (in the form of metadata records, MRs) to enrich their MRs by classifying them under a classification scheme of their choice, and to extract or highlight entities of significant interest within the MRs themselves. The use of a supervised learning metaphor allows each content provider (CP) to generate classifiers and extractors tailored to the CP's specific needs, thus making the tool effectively available to the multitude (2000+) of Europeana CPs.

Empirical Methods in Natural Language Processing | 2015

A Multi-lingual Annotated Dataset for Aspect-Oriented Opinion Mining

Salud María Jiménez-Zafra; Giacomo Berardi; Andrea Esuli; Diego Marcheggiani; María Teresa Martín-Valdivia; Alejandro Moreo Fernández

We present the Trip-MAML dataset, a Multi-Lingual dataset of hotel reviews that have been manually annotated at the sentence level with Multi-Aspect sentiment labels. This dataset has been built as an extension of an existing English-only dataset, adding documents written in Italian and Spanish. We detail the dataset construction process, covering the data gathering, selection, and annotation. We present inter-annotator agreement figures and baseline experimental results, comparing the three languages. Trip-MAML is a multi-lingual dataset for aspect-oriented opinion mining that enables researchers (i) to tackle the problem in languages other than English and (ii) to experiment with the application of cross-lingual learning methods to the task.

ACM Symposium on Applied Computing | 2015

On the impact of entity linking in microblog real-time filtering

Giacomo Berardi; Diego Ceccarelli; Andrea Esuli; Diego Marcheggiani

Microblogging is a model of content sharing in which the temporal locality of posts with respect to important events, either of foreseeable or unforeseeable nature, makes applications of real-time filtering of great practical interest. We propose the use of Entity Linking (EL) to improve retrieval effectiveness by enriching the representation of microblog posts and filtering queries. EL is the process of recognizing, in an unstructured text, mentions of relevant entities described in a knowledge base. EL of short pieces of text is a difficult task, but it is also a scenario in which the information EL adds to the text can have a substantial impact on the retrieval process. We implement a state-of-the-art filtering method, based on the best systems from the TREC Microblog track real-time ad hoc retrieval and filtering tasks, and extend it with a Wikipedia-based EL method. Results show that the use of EL significantly improves over non-EL-based versions of the filtering methods.

Journal of Biomedical Informatics | 2013