Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Claudio Giuliano is active.

Publication


Featured research published by Claudio Giuliano.


ACM Transactions on Speech and Language Processing | 2007

Relation extraction and the influence of automatic named-entity recognition

Claudio Giuliano; Alberto Lavelli; Lorenza Romano

We present an approach for extracting relations between named entities from natural language documents. The approach is based solely on shallow linguistic processing, such as tokenization, sentence splitting, part-of-speech tagging, and lemmatization. It uses a combination of kernel functions to integrate two different information sources: (i) the whole sentence where the relation appears, and (ii) the local contexts around the interacting entities. We present the results of experiments on extracting five different types of relations from a dataset of newswire documents and show that each information source provides a useful contribution to the recognition task. Usually the combined kernel significantly increases the precision with respect to the basic kernels, sometimes at the cost of a slightly lower recall. Moreover, we performed a set of experiments to assess the influence of the accuracy of named-entity recognition on the performance of the relation-extraction algorithm. Such experiments were performed using both the correct named entities (i.e., those manually annotated in the corpus) and the noisy named entities (i.e., those produced by a machine learning-based named-entity recognizer). The results show that our approach significantly improves the previous results obtained on the same dataset.
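As a rough illustration of the kernel-combination idea described above, the sketch below sums a whole-sentence bag-of-words kernel with a kernel over the local contexts around the two entity mentions. The tokenization, fixed context window, and example sentences are illustrative assumptions, not the authors' actual implementation.

```python
# Sketch: combine a "global" (whole-sentence) kernel with a "local context"
# kernel around the two entity mentions. Window size and data are assumptions.
from collections import Counter
from math import sqrt


def bow_kernel(tokens_a, tokens_b):
    """Cosine-normalized bag-of-words dot product."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def local_context(tokens, entity_index, window=2):
    """Tokens within a fixed window around an entity mention (token position)."""
    lo, hi = max(0, entity_index - window), entity_index + window + 1
    return tokens[lo:hi]


def combined_kernel(ex_a, ex_b, window=2):
    """Sum of global-context and local-context kernels for two relation
    instances; each instance is (tokens, head_idx, tail_idx)."""
    (ta, ha, tla), (tb, hb, tlb) = ex_a, ex_b
    k_global = bow_kernel(ta, tb)
    k_local = bow_kernel(local_context(ta, ha, window) + local_context(ta, tla, window),
                         local_context(tb, hb, window) + local_context(tb, tlb, window))
    return k_global + k_local


if __name__ == "__main__":
    a = ("Marie Curie worked at the University of Paris".split(), 0, 5)
    b = ("Alan Turing worked at the University of Manchester".split(), 0, 5)
    print(combined_kernel(a, b))
```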


Meeting of the Association for Computational Linguistics | 2007

FBK-irst: Lexical Substitution Task Exploiting Domain and Syntagmatic Coherence

Claudio Giuliano; Alfio Massimiliano Gliozzo; Carlo Strapparava

This paper summarizes FBK-irst participation at the lexical substitution task of the Semeval competition. We submitted two different systems, both exploiting synonym lists extracted from dictionaries. For each word to be substituted, the systems rank the associated synonym list according to a similarity metric based on Latent Semantic Analysis and to the occurrences in the Web 1T 5-gram corpus, respectively. In particular, the latter system achieves state-of-the-art performance, largely surpassing the baseline proposed by the organizers.
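A minimal sketch of the n-gram ranking idea, assuming a lookup table of n-gram frequencies that stands in for the Web 1T 5-gram corpus; the words and counts below are invented for illustration and the scoring is simplified with respect to the submitted systems.

```python
# Sketch: substitute each candidate synonym into the target's context and
# score it by the frequency of the resulting n-gram. Counts are hypothetical.
def rank_substitutes(left_context, target, right_context, synonyms, ngram_counts):
    scored = []
    for s in synonyms:
        ngram = " ".join(left_context + [s] + right_context)
        scored.append((s, ngram_counts.get(ngram, 0)))
    return sorted(scored, key=lambda x: x[1], reverse=True)


if __name__ == "__main__":
    counts = {  # hypothetical n-gram frequencies
        "a very intelligent student indeed": 310,
        "a very shiny student indeed": 2,
    }
    print(rank_substitutes(["a", "very"], "bright", ["student", "indeed"],
                           ["intelligent", "shiny"], counts))
```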


International Conference on Computational Linguistics | 2008

Instance-Based Ontology Population Exploiting Named-Entity Substitution

Claudio Giuliano; Alfio Massimiliano Gliozzo

We present an approach to ontology population based on a lexical substitution technique. It consists of estimating the plausibility of sentences where the named entity to be classified is substituted with the ones contained in the training data, in our case, a partially populated ontology. Plausibility is estimated by using Web data, while the classification algorithm is instance-based. We evaluated our method on two different ontology population tasks. Experiments show that our solution is effective, outperforming existing methods, and it can be applied to practical ontology population problems.
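The substitution-based classification can be sketched roughly as follows: the entity to classify is replaced by entities from the partially populated ontology, each substituted sentence is scored for plausibility, and the class of the best-scoring training entity is returned (a 1-nearest-neighbour simplification). The plausibility table stands in for Web counts and is purely illustrative.

```python
# Sketch: instance-based classification by named-entity substitution.
# The count table is a toy stand-in for Web frequency estimates.
def plausibility(sentence, counts):
    return counts.get(sentence, 0)


def classify_by_substitution(sentence, entity, training, counts):
    """training: list of (entity_name, ontology_class) pairs."""
    best_class, best_score = None, -1
    for name, cls in training:
        substituted = sentence.replace(entity, name)
        score = plausibility(substituted, counts)
        if score > best_score:
            best_class, best_score = cls, score
    return best_class


if __name__ == "__main__":
    counts = {  # hypothetical web counts for the substituted sentences
        "Einstein studied at Cambridge": 300,
        "Toyota studied at Cambridge": 1,
    }
    training = [("Einstein", "Scientist"), ("Toyota", "Company")]
    print(classify_by_substitution("Heisenberg studied at Cambridge",
                                   "Heisenberg", training, counts))
```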


Extended Semantic Web Conference | 2013

Automatic Expansion of DBpedia Exploiting Wikipedia Cross-Language Information

Alessio Palmero Aprosio; Claudio Giuliano; Alberto Lavelli

DBpedia is a project aiming to represent Wikipedia content in RDF triples. It plays a central role in the Semantic Web, due to the large and growing number of resources linked to it. Nowadays, only 1.7M Wikipedia pages are deeply classified in the DBpedia ontology, although the English Wikipedia contains almost 4M pages, showing a clear problem of coverage. In other languages (like French and Spanish) this coverage is even lower. The objective of this paper is to define a methodology to increase the coverage of DBpedia in different languages. The major problems that we have to solve concern the high number of classes involved in the DBpedia ontology and the lack of coverage for some classes in certain languages. In order to deal with these problems, we first extend the population of the classes for the different languages by connecting the corresponding Wikipedia pages through cross-language links. Then, we train a supervised classifier using this extended set as training data. We evaluated our system using a manually annotated test set, demonstrating that our approach can add more than 1M new entities to DBpedia with high precision (90%) and recall (50%). The resulting resource is available through a SPARQL endpoint and a downloadable package.
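The cross-language projection step can be pictured with a small sketch: DBpedia classes known for English pages are propagated to the corresponding pages of another language edition via interlanguage links, and the projected set would then serve as training data for the supervised classifier. The link table and class assignments below are invented, not real DBpedia data.

```python
# Sketch: propagate DBpedia classes from English pages to another language
# edition through interlanguage links. Data below is illustrative only.
def project_classes(en_classes, cross_lang_links):
    """en_classes: {en_page: dbpedia_class};
    cross_lang_links: {en_page: page_in_target_language}."""
    projected = {}
    for en_page, cls in en_classes.items():
        target_page = cross_lang_links.get(en_page)
        if target_page is not None:
            projected[target_page] = cls
    return projected


if __name__ == "__main__":
    en_classes = {"Rome": "dbo:City", "Dante_Alighieri": "dbo:Writer"}
    links_en_it = {"Rome": "Roma", "Dante_Alighieri": "Dante_Alighieri"}
    # The projected pages would then be used as training data for a
    # supervised classifier covering pages without cross-language links.
    print(project_classes(en_classes, links_en_it))
```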


Computational Linguistics | 2009

Kernel methods for minimally supervised WSD

Claudio Giuliano; Alfio Massimiliano Gliozzo; Carlo Strapparava

We present a semi-supervised technique for word sense disambiguation that exploits external knowledge acquired in an unsupervised manner. In particular, we use a combination of basic kernel functions to independently estimate syntagmatic and domain similarity, building a set of word-expert classifiers that share a common domain model acquired from a large corpus of unlabeled data. The results show that the proposed approach achieves state-of-the-art performance on a wide range of lexical sample tasks and on the English all-words task of Senseval-3, although it uses a considerably smaller number of training examples than other methods.
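A toy sketch of the idea of combining a syntagmatic kernel (overlap of local context words) with a domain kernel (similarity of domain vectors derived from an unsupervised domain model). The two-domain word map below is invented; in the paper the domain model is acquired from a large corpus of unlabeled data.

```python
# Sketch: composite kernel = syntagmatic kernel + domain kernel.
# The word-to-domain map is a toy stand-in for an LSA-style domain model.
from math import sqrt

DOMAIN_OF = {"river": "nature", "water": "nature",
             "loan": "finance", "interest": "finance", "account": "finance"}


def domain_vector(context):
    vec = {}
    for w in context:
        d = DOMAIN_OF.get(w)
        if d:
            vec[d] = vec.get(d, 0) + 1
    return vec


def cosine(a, b):
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def syntagmatic_kernel(ctx_a, ctx_b):
    return len(set(ctx_a) & set(ctx_b))


def composite_kernel(ctx_a, ctx_b):
    return syntagmatic_kernel(ctx_a, ctx_b) + cosine(domain_vector(ctx_a),
                                                     domain_vector(ctx_b))


if __name__ == "__main__":
    print(composite_kernel(["loan", "interest", "bank"], ["account", "bank"]))
```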


Meeting of the Association for Computational Linguistics | 2007

FBK-IRST: Kernel Methods for Semantic Relation Extraction

Claudio Giuliano; Alberto Lavelli; Daniele Pighin; Lorenza Romano

We present an approach for semantic relation extraction between nominals that combines shallow and deep syntactic processing and semantic information using kernel methods. Two information sources are considered: (i) the whole sentence where the relation appears, and (ii) WordNet synsets and hypernymy relations of the candidate nominals. Each source of information is represented by kernel functions. In particular, five basic kernel functions are linearly combined and weighted under different conditions. The experiments were carried out using support vector machines as the classifier. The system achieves an overall F1 of 71.8% on the Classification of Semantic Relations between Nominals task at SemEval-2007.
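The weighted linear combination of kernels can be sketched with just two of the five kernels: a whole-sentence overlap kernel and a hypernym-overlap kernel over the two candidate nominals. The tiny hypernym table stands in for WordNet and the weights are arbitrary; the actual system plugs the combined kernel into support vector machines.

```python
# Sketch: weighted sum of a sentence kernel and a hypernym-overlap kernel.
# HYPERNYMS is a toy stand-in for WordNet; weights are arbitrary.
HYPERNYMS = {
    "knife": {"tool", "artifact"},
    "hammer": {"tool", "artifact"},
    "flour": {"substance", "food"},
}


def sentence_kernel(tokens_a, tokens_b):
    return len(set(tokens_a) & set(tokens_b))


def hypernym_kernel(pair_a, pair_b):
    score = 0
    for x, y in zip(pair_a, pair_b):
        score += len(HYPERNYMS.get(x, set()) & HYPERNYMS.get(y, set()))
    return score


def combined_kernel(ex_a, ex_b, w_sent=1.0, w_hyp=2.0):
    """Each example is (sentence_tokens, (nominal1, nominal2))."""
    (ta, pa), (tb, pb) = ex_a, ex_b
    return w_sent * sentence_kernel(ta, tb) + w_hyp * hypernym_kernel(pa, pb)


if __name__ == "__main__":
    a = ("the knife cut the bread".split(), ("knife", "bread"))
    b = ("the hammer hit the nail".split(), ("hammer", "nail"))
    print(combined_kernel(a, b))
```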


Language Resources and Evaluation | 2008

Evaluation of machine learning-based information extraction algorithms: criticisms and recommendations

Alberto Lavelli; Mary Elaine Califf; Fabio Ciravegna; Dayne Freitag; Claudio Giuliano; Nicholas Kushmerick; Lorenza Romano; Neil Ireson

We survey the evaluation methodology adopted in information extraction (IE), as defined in a few different efforts applying machine learning (ML) to IE. We identify a number of critical issues that hamper comparison of the results obtained by different researchers. Some of these issues are common to other NLP-related tasks: e.g., the difficulty of exactly identifying the effects on performance of the data (sample selection and sample size), of the domain theory (features selected), and of algorithm parameter settings. Some issues are specific to IE: how leniently to assess inexact identification of filler boundaries, the possibility of multiple fillers for a slot, and how the counting is performed. We argue that, when specifying an IE task, these issues should be explicitly addressed, and a number of methodological characteristics should be clearly defined. To empirically verify the practical impact of the issues mentioned above, we perform a survey of the results of different algorithms when applied to a few standard datasets. The survey shows a serious lack of consensus on these issues, which makes it difficult to draw firm conclusions on a comparative evaluation of the algorithms. Our aim is to elaborate a clear and detailed experimental methodology and propose it to the IE community. Widespread agreement on this proposal should lead to future IE comparative evaluations that are fair and reliable. To demonstrate the way the methodology is to be applied we have organized and run a comparative evaluation of ML-based IE systems (the Pascal Challenge on ML-based IE) where the principles described in this article are put into practice. In this article we describe the proposed methodology and its motivations. The Pascal evaluation is then described and its results presented.
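One of the issues discussed, how leniently inexact filler boundaries are scored, can be illustrated with a small sketch: the same predictions yield different precision and recall under exact-match versus overlap-based matching. The spans and offsets below are made up for illustration.

```python
# Sketch: exact vs. lenient (overlap-based) matching of filler spans,
# expressed as (start, end) character offsets. Data is invented.
def overlaps(a, b):
    return a[0] < b[1] and b[0] < a[1]


def score(gold, predicted, lenient=False):
    match = (lambda g, p: overlaps(g, p)) if lenient else (lambda g, p: g == p)
    tp = sum(1 for p in predicted if any(match(g, p) for g in gold))
    precision = tp / len(predicted) if predicted else 0.0
    covered = sum(1 for g in gold if any(match(g, p) for p in predicted))
    recall = covered / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    gold = [(0, 12), (30, 45)]
    predicted = [(0, 12), (31, 45)]   # second span off by one character
    print("exact  :", score(gold, predicted))
    print("lenient:", score(gold, predicted, lenient=True))
```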


International Semantic Web Conference | 2010

Supporting natural language processing with background knowledge: coreference resolution case

Volha Bryl; Claudio Giuliano; Luciano Serafini; Kateryna Tymoshenko

Systems based on statistical and machine learning methods have been shown to be extremely effective and scalable for the analysis of large amounts of textual data. However, in recent years it has become evident that one of the most important directions of improvement in natural language processing (NLP) tasks, like word sense disambiguation, coreference resolution, relation extraction, and other tasks related to knowledge extraction, is exploiting semantics. While in the past the unavailability of rich and complete semantic descriptions constituted a serious limitation of their applicability, nowadays the Semantic Web has made available a large amount of logically encoded information (e.g. ontologies, RDF(S) data, linked data, etc.), which constitutes a valuable source of semantics. However, web semantics cannot be easily plugged into machine learning systems. Therefore the objective of this paper is to define a reference methodology for combining semantic information available on the web in the form of logical theories with statistical methods for NLP. The major problems that we have to solve to implement our methodology concern (i) the selection of the correct and minimal knowledge among the large amount available on the web, (ii) the representation of uncertain knowledge, and (iii) the resolution and encoding of the rules that combine knowledge retrieved from Semantic Web sources with the semantics in the text. In order to evaluate the appropriateness of our approach, we present an application of the methodology to the problem of intra-document coreference resolution, and we show by means of experiments on a standard dataset how the injection of knowledge improves the performance of this task.
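A simplified illustration of the knowledge-injection idea: a knowledge-base lookup supplies a type-compatibility feature for each mention pair, which a statistical coreference model could use alongside the usual surface features. The mini knowledge base below is invented; in the paper such facts are drawn from Semantic Web sources such as ontologies and linked data.

```python
# Sketch: add a background-knowledge feature to coreference mention pairs.
# KB_TYPES is a toy stand-in for types retrieved from linked data.
KB_TYPES = {
    "Barack Obama": "Person",
    "the president": "Person",
    "the law": "Document",
}


def type_compatible(mention_a, mention_b):
    ta, tb = KB_TYPES.get(mention_a), KB_TYPES.get(mention_b)
    return ta is not None and ta == tb


def pair_features(mention_a, mention_b):
    """Feature vector for one candidate coreference pair."""
    return {
        "same_head": mention_a.split()[-1].lower() == mention_b.split()[-1].lower(),
        "kb_type_compatible": type_compatible(mention_a, mention_b),
    }


if __name__ == "__main__":
    print(pair_features("Barack Obama", "the president"))
    print(pair_features("Barack Obama", "the law"))
```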


International Semantic Web Conference | 2013

Towards an Automatic Creation of Localized Versions of DBpedia

Alessio Palmero Aprosio; Claudio Giuliano; Alberto Lavelli

DBpedia is a large-scale knowledge base that exploits Wikipedia as its primary data source. The extraction procedure requires manually mapping Wikipedia infoboxes into the DBpedia ontology. Thanks to crowdsourcing, a large number of infoboxes have been mapped in the English DBpedia. Consequently, the same procedure has been applied to other languages to create the localized versions of DBpedia. However, the number of accomplished mappings is still small and limited to the most frequent infoboxes. Furthermore, mappings need maintenance due to the constant and rapid changes of Wikipedia articles. In this paper, we focus on the problem of automatically mapping infobox attributes to properties in the DBpedia ontology, in order to extend the coverage of the existing localized versions or to build from scratch versions for languages not covered in the current version. The evaluation has been performed on the Italian mappings. We compared our results with the current mappings on a random sample re-annotated by the authors. We report results comparable to the ones obtained by a human annotator in terms of precision, but our approach leads to a significant improvement in recall and speed. Specifically, we mapped 45,978 Wikipedia infobox attributes to DBpedia properties in 14 different languages for which mappings were not yet available. The resource is made available in an open format.
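One simple way to score candidate attribute-to-property mappings, sketched below under the assumption that some entities are already present in DBpedia, is to compare the values an infobox attribute takes with the values of each candidate ontology property and rank pairs by overlap. The sample data is invented and the paper's actual algorithm may differ in detail.

```python
# Sketch: score (infobox attribute, DBpedia property) pairs by how often
# their values agree on shared entities. Data below is illustrative.
from collections import defaultdict


def score_mappings(infobox_values, dbpedia_values):
    """infobox_values: {(entity, attribute): value};
    dbpedia_values:   {(entity, property): value}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for (entity, attribute), value in infobox_values.items():
        for (e, prop), v in dbpedia_values.items():
            if e == entity:
                totals[(attribute, prop)] += 1
                if v == value:
                    hits[(attribute, prop)] += 1
    return {pair: hits[pair] / totals[pair] for pair in totals}


if __name__ == "__main__":
    infobox = {("Roma", "sindaco"): "Roberto Gualtieri",
               ("Milano", "sindaco"): "Giuseppe Sala"}
    dbpedia = {("Roma", "dbo:mayor"): "Roberto Gualtieri",
               ("Roma", "dbo:country"): "Italy",
               ("Milano", "dbo:mayor"): "Giuseppe Sala"}
    print(score_mappings(infobox, dbpedia))
```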


ACM Symposium on Applied Computing | 2012

A novel Framenet-based resource for the semantic web

Volha Bryl; Sara Tonelli; Claudio Giuliano; Luciano Serafini

FrameNet is a large-scale lexical resource encoding information about semantic frames (situations) and semantic roles. The aim of the paper is to enrich FrameNet by mapping the lexical fillers of semantic roles to WordNet using a Wikipedia-based detour. The applied methodology relies on a word sense disambiguation step, in which a Wikipedia page is assigned to a role filler, and then BabelNet and YAGO are used to acquire WordNet synsets for a filler. We show how to represent the acquired resource in OWL, linking it to the existing RDF/OWL representations of FrameNet and WordNet. Part of the resource is evaluated by matching it with the WordNet synsets manually assigned by FrameNet lexicographers to a subset of semantic roles.
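The Wikipedia detour can be sketched schematically: a role filler is first linked to a Wikipedia page (the word sense disambiguation step, stubbed out here as a lookup), and the page is then mapped to WordNet synsets through tables standing in for BabelNet/YAGO. All mappings shown are invented examples.

```python
# Sketch: role filler -> Wikipedia page -> WordNet synsets.
# Both lookup tables are toy stand-ins for the WSD step and BabelNet/YAGO.
PAGE_OF = {
    "jaguar": "Jaguar_(animal)",
}
SYNSETS_OF = {
    "Jaguar_(animal)": ["jaguar.n.01"],
}


def filler_to_synsets(filler):
    page = PAGE_OF.get(filler)
    if page is None:
        return []
    return SYNSETS_OF.get(page, [])


if __name__ == "__main__":
    print(filler_to_synsets("jaguar"))   # -> ['jaguar.n.01']
```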

Collaboration


Dive into Claudio Giuliano's collaborations.

Top Co-Authors

Lorenza Romano

Fondazione Bruno Kessler

Marco Fossati

Fondazione Bruno Kessler
