Publication


Featured research published by Anna Lisa Gentile.


Natural Language Engineering | 2013

Recent advances in methods of lexical semantic relatedness – a survey

Ziqi Zhang; Anna Lisa Gentile; Fabio Ciravegna

Measuring lexical semantic relatedness is an important task in Natural Language Processing (NLP). It is often a prerequisite to many complex NLP tasks. Despite an extensive amount of work dedicated to this area of research, there is a lack of an up-to-date survey in the field. This paper aims to address this issue with a study that is focused on four perspectives: (i) a comparative analysis of background information resources that are essential for measuring lexical semantic relatedness; (ii) a review of the literature with a focus on recent methods that are not covered in previous surveys; (iii) discussion of the studies in the biomedical domain where novel methods have been introduced but inadequately communicated across the domain boundaries; and (iv) an evaluation of lexical semantic relatedness methods and a discussion of useful lessons for the development and application of such methods. In addition, we discuss a number of issues in this field and suggest future research directions. It is believed that this work will be a valuable reference to researchers of lexical semantic relatedness and substantially support the research activities in this field.


Meeting of the Association for Computational Linguistics | 2007

UNIBA: JIGSAW algorithm for Word Sense Disambiguation

Pierpaolo Basile; Marco de Gemmis; Anna Lisa Gentile; Pasquale Lops; Giovanni Semeraro

Word Sense Disambiguation (WSD) is traditionally considered an AI-hard problem. A breakthrough in this field would have a significant impact on many relevant web-based applications, such as information retrieval and information extraction. This paper describes JIGSAW, a knowledge-based WSD system that attempts to disambiguate all words in a text by exploiting WordNet senses. The main assumption is that a specific strategy for each Part-Of-Speech (POS) is better than a single strategy. We evaluated the accuracy of JIGSAW on the SemEval-2007 Task 1 competition. This task is an application-driven one, where the application is a fixed cross-lingual information retrieval system. Participants disambiguate text by assigning WordNet synsets; the system then expands the documents to other languages, indexes the expanded documents, and runs the retrieval for all the languages in batch. The retrieval results are taken as a measure of the effectiveness of the disambiguation.


International Conference on Knowledge Capture | 2013

Unsupervised wrapper induction using linked data

Anna Lisa Gentile; Ziqi Zhang; Isabelle Augenstein; Fabio Ciravegna

This work explores the usage of Linked Data for Web-scale Information Extraction and shows encouraging results on the task of Wrapper Induction. We propose a simple knowledge-based method which (i) is highly flexible with respect to different domains and (ii) does not require any training material, but exploits Linked Data as a background knowledge source to build essential learning resources. The major contribution of this work is a study of how Linked Data - an imprecise, redundant and large-scale knowledge resource - can be used to support Web-scale Information Extraction in an effective and efficient way, and an identification of the challenges involved. We show that, for domains that are covered, Linked Data serve as a powerful knowledge resource for Information Extraction. Experiments on a publicly available dataset demonstrate that, under certain conditions, this simple unsupervised approach can achieve competitive results against complex state-of-the-art methods that always depend on training data.


Text, Speech and Dialogue | 2009

Improving Patient Opinion Mining through Multi-step Classification

Lei Xia; Anna Lisa Gentile; James Munro; José Iria

Automatically tracking attitudes, feelings and reactions in on-line forums, blogs and news is a desirable instrument to support statistical analyses by companies, the government, and even individuals. In this paper, we present a novel approach to polarity classification of short text snippets, which takes into account the way data are naturally distributed into several topics in order to obtain better classification models for polarity. Our approach is multi-step, where in the initial step a standard topic classifier is learned from the data and the topic labels, and in the ensuing step several polarity classifiers, one per topic, are learned from the data and the polarity labels. We empirically show that our approach improves classification accuracy over a real-world dataset by over 10%, when compared against a standard single-step approach using the same feature sets. The approach is applicable whenever training material is available for building both topic and polarity learning models.
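
The two-step routing described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `WordVoteClassifier` class, the helper names, and the four training snippets are hypothetical stand-ins chosen only to show how a topic classifier can dispatch each snippet to a polarity classifier trained on that topic alone. The abstract does not say which learning algorithm or features the paper uses, so a toy word-count vote is assumed here.

```python
from collections import Counter, defaultdict


class WordVoteClassifier:
    """Toy classifier (an assumption, not the paper's model): a label's score
    is the number of times it saw the snippet's words during training."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        return self

    def predict(self, text):
        words = text.lower().split()
        scores = {
            label: sum(counts[w] for w in words)
            for label, counts in self.word_counts.items()
        }
        # Sorting first makes ties resolve alphabetically, so results are deterministic.
        return max(sorted(scores), key=scores.get)


def train_two_step(texts, topics, polarities):
    """Step 1: one topic classifier. Step 2: one polarity classifier per topic."""
    topic_clf = WordVoteClassifier().fit(texts, topics)
    by_topic = defaultdict(lambda: ([], []))
    for text, topic, polarity in zip(texts, topics, polarities):
        by_topic[topic][0].append(text)
        by_topic[topic][1].append(polarity)
    polarity_clfs = {
        topic: WordVoteClassifier().fit(topic_texts, topic_polarities)
        for topic, (topic_texts, topic_polarities) in by_topic.items()
    }
    return topic_clf, polarity_clfs


def predict_two_step(text, topic_clf, polarity_clfs):
    """Route the snippet to the polarity classifier of its predicted topic."""
    topic = topic_clf.predict(text)
    return topic, polarity_clfs[topic].predict(text)


# Hypothetical training data, invented for illustration only.
texts = ["the nurse was kind", "the nurse was rude",
         "parking was easy", "parking was expensive"]
topics = ["staff", "staff", "parking", "parking"]
polarities = ["positive", "negative", "positive", "negative"]

topic_clf, polarity_clfs = train_two_step(texts, topics, polarities)
print(predict_two_step("nurse rude", topic_clf, polarity_clfs))
print(predict_two_step("parking easy", topic_clf, polarity_clfs))
```

Training each polarity model on a single topic lets it pick up topic-specific cues. The trade-off is that a wrong topic prediction in the first step sends the snippet to the wrong polarity model.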


International Semantic Web Conference | 2013

Statistical Knowledge Patterns: Identifying Synonymous Relations in Large Linked Datasets

Ziqi Zhang; Anna Lisa Gentile; Eva Blomqvist; Isabelle Augenstein; Fabio Ciravegna

The Web of Data is a rich common resource with billions of triples available in thousands of datasets and individual Web documents created by both expert and non-expert ontologists. A common problem is the imprecision in the use of vocabularies: annotators can misunderstand the semantics of a class or property or may not be able to find the right objects to annotate with. This decreases the quality of data and may eventually hamper its usability at large scale. This paper describes Statistical Knowledge Patterns (SKP) as a means to address this issue. SKPs encapsulate key information about ontology classes, including synonymous properties in (and across) datasets, and are automatically generated based on statistical data analysis. SKPs can be effectively used to automatically normalise data, and hence increase recall in querying. Both pattern extraction and pattern usage are completely automated. The main benefits of SKPs are that: (1) their structure allows for both accurate query expansion and restriction; (2) they are context dependent, hence they describe the usage and meaning of properties in the context of a particular class; and (3) they can be generated offline, hence the equivalence among relations can be used efficiently at run time.


International Conference on Semantic Web Evaluation Challenge, ESWC 2015 | 2015

Open Knowledge Extraction Challenge

Andrea Giovanni Nuzzolese; Anna Lisa Gentile; Valentina Presutti; Aldo Gangemi; Darío Garigliotti; Roberto Navigli

The Open Knowledge Extraction (OKE) challenge is aimed at promoting research in the automatic extraction of structured content from textual data and its representation and publication as Linked Data. We designed two extraction tasks: (1) Entity Recognition, Linking and Typing and (2) Class Induction and Entity Typing. Four systems participated in the challenge: CETUS-FOX and FRED took part in both tasks, Adel in Task 1, and OAK@Sheffield in Task 2. In this paper we describe the OKE challenge, the tasks, the datasets used for training and evaluating the systems, the evaluation method, and the results obtained.


Italian Research Conference on Digital Library Management Systems | 2010

Semantic Relatedness Approach for Named Entity Disambiguation

Anna Lisa Gentile; Ziqi Zhang; Lei Xia; José Iria

Natural language is a means to express and discuss concepts, objects, and events; that is, it carries semantic content. One of the ultimate aims of Natural Language Processing techniques is to identify the meaning of the text, providing effective ways to link textual references to their referents, that is, real-world objects. This work addresses the problem of giving a sense to proper names in a text, that is, automatically associating words representing Named Entities with their referents. The proposed methodology for Named Entity Disambiguation is based on Semantic Relatedness Scores obtained with a graph-based model over Wikipedia. We show that, without building a Bag of Words representation of the text, but only considering named entities within the text, the proposed paradigm achieves results competitive with the state of the art on two different datasets.


Extended Semantic Web Conference | 2013

Mapping Keywords to Linked Data Resources for Automatic Query Expansion

Isabelle Augenstein; Anna Lisa Gentile; Barry Norton; Ziqi Zhang; Fabio Ciravegna

Linked Data is a gigantic, constantly growing and extremely valuable resource, but its usage is still heavily dependent on (i) the familiarity of end users with RDF’s graph data model and its query language, SPARQL, and (ii) knowledge about available datasets and their contents. Intelligent keyword search over Linked Data is currently being investigated as a means to overcome these barriers to entry in a number of different approaches, including semantic search engines and the automatic conversion of natural language questions into structured queries. Our work addresses the specific challenge of mapping keywords to Linked Data resources, and proposes a novel method for this task. By exploiting the graph structure within Linked Data we determine which properties between resources are useful to discover, or directly express, semantic similarity. We also propose a novel scoring function to rank results. Experiments on a publicly available dataset show a 17% improvement in Mean Reciprocal Rank over the state of the art.
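
Mean Reciprocal Rank, the metric reported above, averages the reciprocal of the position at which the first correct result appears in each ranked list. The sketch below shows the standard definition only. The function names and the sample rankings are hypothetical, and it assumes that a query with no correct result contributes 0, which is a common convention and not something the abstract states.

```python
from typing import Hashable, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/position of the first relevant item, or 0.0 if none is found."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    runs: Sequence[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Average the reciprocal rank over all (ranking, relevant-set) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, relevant) for ranked, relevant in runs) / len(runs)


# Hypothetical keyword-to-resource rankings, invented for illustration only.
runs = [
    (["a", "b", "c"], {"a"}),  # correct resource ranked first: 1/1
    (["a", "b", "c"], {"c"}),  # correct resource ranked third: 1/3
    (["a", "b"], {"z"}),       # correct resource never returned: 0
]
print(mean_reciprocal_rank(runs))
```

Only the first correct hit counts towards the score, so the metric rewards systems that put one good answer near the top of the list.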


Text, Speech and Dialogue | 2014

Self Training Wrapper Induction with Linked Data

Anna Lisa Gentile; Ziqi Zhang; Fabio Ciravegna

This work explores the usage of Linked Data for Web-scale Information Extraction, with a focus on the task of Wrapper Induction. We show how to effectively use Linked Data to automatically generate training material and build a self-trained Wrapper Induction method. Experiments on a publicly available dataset demonstrate that, for covered domains, our method can achieve an F-measure of 0.85, which is competitive with a supervised solution.


International Semantic Web Conference | 2016

Conference Linked Data: The ScholarlyData Project

Andrea Giovanni Nuzzolese; Anna Lisa Gentile; Valentina Presutti; Aldo Gangemi

The Semantic Web Dog Food (SWDF) is the reference linked dataset of the Semantic Web community about papers, people, organisations, and events related to its academic conferences. In this paper we analyse the existing problems of generating, representing and maintaining Linked Data for the SWDF. With this work (i) we provide a refactored and cleaned SWDF dataset; (ii) we use a novel data model which improves the Semantic Web Conference Ontology, adopting best ontology design practices and (iii) we provide an open source workflow to support a healthy growth of the dataset beyond the Semantic Web conferences.

Collaboration

Top Co-Authors

Ziqi Zhang

University of Sheffield