Network


Latest external collaborations at the country level. Click on a dot to see the details.

Hotspot


Dive into the research topics where Danilo Croce is active.

Publication


Featured research published by Danilo Croce.


Empirical Methods in Natural Language Processing | 2008

Automatic induction of FrameNet lexical units

Marco Pennacchiotti; Diego De Cao; Roberto Basili; Danilo Croce; Michael Roth

Most attempts to integrate FrameNet in NLP systems have so far failed because of its limited coverage. In this paper, we investigate the applicability of distributional and WordNet-based models to the task of lexical unit induction, i.e. the expansion of FrameNet with new lexical units. Experimental results show that our distributional and WordNet-based models achieve a good level of accuracy and coverage, especially when combined.
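As a hedged sketch of the distributional side of this induction task, the toy model below ranks candidate words by cosine similarity to a frame's known lexical units. All words and vector values are illustrative stand-ins, not data or methods from the paper.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two distributional vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_candidates(frame_units, candidates, vectors):
    """Score each candidate by its best cosine similarity to the
    lexical units already known for the frame, then sort."""
    scored = [(w, max(cosine(vectors[w], vectors[u]) for u in frame_units))
              for w in candidates]
    return sorted(scored, key=lambda p: p[1], reverse=True)

# Toy vectors standing in for corpus-derived distributional profiles.
vectors = {
    "buy":      np.array([0.90, 0.10, 0.00]),
    "purchase": np.array([0.85, 0.15, 0.05]),
    "sleep":    np.array([0.00, 0.10, 0.90]),
}
ranking = rank_candidates(["buy"], ["purchase", "sleep"], vectors)
print(ranking[0][0])  # purchase
```

Top-ranked candidates would then be proposed as new lexical units for the frame.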


Semantics in Text Processing. STEP 2008 Conference Proceedings | 2008

Combining Word Sense and Usage for Modeling Frame Semantics

Diego De Cao; Danilo Croce; Marco Pennacchiotti; Roberto Basili

Models of lexical semantics are core paradigms in most NLP applications, such as dialogue, information extraction and document understanding. Unfortunately, the coverage of currently available resources (e.g. FrameNet) is still unsatisfactory. This paper presents a largely applicable approach for extending frame semantic resources, combining word sense information derived from WordNet and corpus-based distributional information. We report a large scale evaluation over the English FrameNet, and results on extending FrameNet to the Italian language, as the basis of the development of a full FrameNet for Italian.


Meeting of the Association for Computational Linguistics | 2015

KeLP: a Kernel-based Learning Platform for Natural Language Processing

Simone Filice; Giuseppe Castellucci; Danilo Croce; Roberto Basili

Kernel-based learning algorithms have been shown to achieve state-of-the-art results in many Natural Language Processing (NLP) tasks. We present KELP, a Java framework that supports the implementation of both kernel-based learning algorithms and kernel functions over generic data representations, e.g. vectorial data or discrete structures. The framework has been designed to decouple kernel functions and learning algorithms: once a new kernel function has been implemented, it can be adopted in all the available kernel-machine algorithms. The platform includes different Online and Batch Learning algorithms for Classification, Regression and Clustering, as well as several Kernel functions, ranging from vector-based to structural kernels. This paper will show the main aspects of the framework by applying it to different NLP tasks.
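The decoupling the abstract describes, kernel functions pluggable into any kernel-machine algorithm, can be illustrated outside KeLP with a minimal dual (kernel) perceptron: the learner touches the data only through kernel evaluations, so swapping kernels requires no change to the training algorithm. This is an illustrative Python sketch, not KeLP's Java API.

```python
import numpy as np

def poly_kernel(x, z, degree=2):
    # Any positive semi-definite kernel can be plugged in here unchanged.
    return (np.dot(x, z) + 1.0) ** degree

def train_kernel_perceptron(X, y, kernel, epochs=20):
    """Dual perceptron: the learning algorithm sees the data only
    through kernel evaluations, so kernels and learner are decoupled."""
    alpha = np.zeros(len(X))
    for _ in range(epochs):
        for i in range(len(X)):
            s = sum(alpha[j] * y[j] * kernel(X[j], X[i]) for j in range(len(X)))
            if y[i] * s <= 0:  # mistake-driven update
                alpha[i] += 1.0
    return alpha

def predict(X, y, alpha, kernel, x):
    s = sum(alpha[j] * y[j] * kernel(X[j], x) for j in range(len(X)))
    return 1 if s > 0 else -1

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([-1, 1, 1, -1])  # XOR labels: not linearly separable
alpha = train_kernel_perceptron(X, y, poly_kernel)
preds = [predict(X, y, alpha, poly_kernel, x) for x in X]
print(preds)  # [-1, 1, 1, -1]
```

A linear kernel would fail on this data; only the kernel function changes, not the learner.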


International Conference on Computational Linguistics | 2009

Cross-Language Frame Semantics Transfer in Bilingual Corpora

Roberto Basili; Diego De Cao; Danilo Croce; Bonaventura Coppola; Alessandro Moschitti

Recent work on the transfer of semantic information across languages has been applied to the development of resources annotated with Frame information for different non-English European languages. These works are based on the assumption that parallel corpora annotated for English can be used to transfer the semantic information to the other target languages. In this paper, a robust method based on a statistical machine translation step augmented with simple rule-based post-processing is presented. It alleviates problems related to preprocessing errors and the complex optimization required by syntax-dependent models of the cross-lingual mapping. Different alignment strategies are investigated on the Europarl corpus. Results suggest that the quality of the derived annotations is surprisingly good and well suited for training semantic role labeling systems.


North American Chapter of the Association for Computational Linguistics | 2016

KeLP at SemEval-2016 Task 3: Learning Semantic Relations between Questions and Answers.

Simone Filice; Danilo Croce; Alessandro Moschitti; Roberto Basili

This paper describes the KeLP system participating in the SemEval-2016 Community Question Answering (cQA) task. The challenge tasks are modeled as binary classification problems: kernel-based classifiers are trained on the SemEval datasets and their scores are used to sort the instances and produce the final ranking. All classifiers and kernels have been implemented within the Kernel-based Learning Platform called KeLP. Our primary submission ranked first in Subtask A, third in Subtask B and second in Subtask C. These ranks are based on MAP, the official challenge score. Our approach outperforms all the other systems with respect to all the remaining challenge metrics.


Intelligenza Artificiale | 2012

Structured learning for semantic role labeling

Danilo Croce; Giuseppe Castellucci; Emanuele Bastianelli

The use of complex grammatical features in statistical language learning assumes the availability of large-scale training data and good-quality parsers, especially for languages other than English. In this paper, we show how good-quality FrameNet SRL systems can be obtained, without relying on full syntactic parsing, by backing off to surface grammatical representations and structured learning. This model is shown to achieve state-of-the-art results on standard benchmarks, while its robustness is confirmed under poor training conditions for a language other than English, namely Italian.

1 Linguistic Features for Inductive Tasks

Language learning systems usually generalize linguistic observations into statistical models of higher-level semantic tasks, such as Semantic Role Labeling (SRL). Statistical learning methods assume that lexical or grammatical aspects of training data are the basic features for modeling the different inferences; these are then generalized into predictive patterns composing the final induced model. Lexical information captures semantic information and fine-grained, context-dependent aspects of the input data. However, it is largely affected by data sparseness, as lexical evidence is often poorly represented in training. It is also difficult to generalize and does not scale, as the development of large-scale lexical knowledge bases is very expensive. Moreover, other crucial properties, such as word ordering, are neglected by lexical representations, so syntax must also be properly addressed. In semantic role labeling, the role of grammatical features has been outlined since the seminal work of [6]. Symbolic expressions derived from the parse trees denote the position of, and the relationship between, an argument and its predicate, and they are used as features; parse tree paths are one such feature, employed in [11] for semantic role labeling.

Tree kernels, introduced by [4], model the similarity between two training examples as a function of the shared parts of their parse trees. Applied to different tasks, from parsing [4] to semantic role labeling [16], tree kernels provide expressive representations for effective grammatical feature engineering. However, there is no free lunch in the adoption of lexical and grammatical features in complex NLP tasks. First, lexical information is hard to generalize properly whenever the amount of training data is small. Large-scale general-purpose lexicons are available, but their use in specific tasks is not satisfactory: coverage in domain- (or corpus-)specific tasks is often poor, and domain adaptation is difficult. (In R. Pirrone and F. Sorbello (Eds.): AI*IA 2011, LNAI 6934, pp. 238–249, 2011.)
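The tree-kernel idea described here, similarity as a function of shared parse-tree parts, can be sketched with a deliberately simplified kernel that counts complete shared subtrees; real convolution tree kernels count all shared fragments efficiently via dynamic programming. The trees and labels below are toy examples, not the paper's data.

```python
def subtrees(t):
    """Enumerate all complete subtrees of a parse tree encoded as
    nested tuples, e.g. ("NP", "she"); leaves are plain strings."""
    if isinstance(t, str):
        return [t]
    out = [t]
    for child in t[1:]:  # t[0] is the node label
        out.extend(subtrees(child))
    return out

def tree_kernel(t1, t2):
    # Similarity = number of complete subtrees the two parses share,
    # a simplified stand-in for convolution tree kernels.
    s2 = subtrees(t2)
    return sum(1 for a in subtrees(t1) for b in s2 if a == b)

t1 = ("S", ("NP", "she"), ("VP", ("V", "runs")))
t2 = ("S", ("NP", "she"), ("VP", ("V", "sleeps")))
k = tree_kernel(t1, t2)
print(k)  # 2: the shared ("NP", "she") subtree and the shared leaf "she"
```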


European Conference on Artificial Intelligence | 2014

Effective and robust natural language understanding for human-robot interaction

Emanuele Bastianelli; Giuseppe Castellucci; Danilo Croce; Roberto Basili; Daniele Nardi

Robots are slowly becoming part of everyday life, as they are being marketed for commercial applications (viz. telepresence, cleaning or entertainment). Thus, the ability to interact with non-expert users is becoming a key requirement. Even if user utterances can be efficiently recognized and transcribed by Automatic Speech Recognition systems, several issues arise in translating them into suitable robotic actions. In this paper, we discuss two existing Natural Language Understanding workflows for Human-Robot Interaction. The first is a grammar-based approach, which recognizes a restricted set of commands. The second is a data-driven approach, based on a free-form speech recognizer and a statistical semantic parser. The main advantages of both approaches are discussed, also from an engineering perspective, i.e. considering the effort of realizing HRI systems, as well as their reusability and robustness. An empirical evaluation of the proposed approaches is carried out on several datasets, in order to understand their performance and identify possible improvements towards the design of NLP components in HRI.


Conference on Information and Knowledge Management | 2014

Semantic Compositionality in Tree Kernels

Paolo Annesi; Danilo Croce; Roberto Basili

Kernel-based learning has been largely applied to semantic textual inference tasks. In particular, Tree Kernels (TKs) are crucial in the modeling of syntactic similarity between linguistic instances in Question Answering or Information Extraction tasks. At the same time, lexical semantic information has been studied through the adoption of the so-called Distributional Semantics (DS) paradigm, where lexical vectors are acquired automatically from large corpora. Methods to account for compositional linguistic structures (e.g. grammatically typed bi-grams or complex verb or noun phrases) have recently been proposed by defining algebras on lexical vectors; the result is an extended paradigm called Distributional Compositional Semantics (DCS). Although lexical extensions have already been proposed to generalize TKs towards semantic phenomena (e.g. predicate argument structures, as in role labeling), currently studied TKs do not, in general, account for compositionality. In this paper, a novel kernel called the Compositionally Smoothed Partial Tree Kernel is proposed to integrate DCS operators into the tree kernel evaluation, acting both over lexical leaves and over non-terminal, i.e. complex compositional, nodes. The empirical results obtained on Question Classification and Paraphrase Identification tasks show that state-of-the-art performance can be achieved without resorting to manual feature engineering, thus suggesting that a large set of Web and text mining tasks can be handled successfully by the kernel proposed here.
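As a hedged illustration of the compositional side, the sketch below composes toy lexical vectors with the simplest distributional-compositional operator (vector addition) and compares composed phrase nodes by cosine similarity. The words and vector values are invented for the example, and this is not the paper's kernel, only the vector-composition building block it smooths with.

```python
import numpy as np

def compose(u, v):
    # Additive composition: the simplest DCS operator; richer
    # operators weight head and modifier differently.
    return u + v

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Invented toy vectors, not corpus-derived.
vec = {
    "buy":  np.array([1.0, 0.2, 0.0]),
    "sell": np.array([0.9, 0.3, 0.1]),
    "car":  np.array([0.1, 1.0, 0.3]),
    "sky":  np.array([0.0, 0.1, 1.0]),
}
buy_car = compose(vec["buy"], vec["car"])
sell_car = compose(vec["sell"], vec["car"])
buy_sky = compose(vec["buy"], vec["sky"])
# Near-paraphrase phrase nodes should score higher than unrelated ones.
print(cosine(buy_car, sell_car) > cosine(buy_car, buy_sky))  # True
```

In the kernel, such similarities between composed non-terminal nodes smooth the otherwise exact matching of tree fragments.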


Applications of Natural Language to Data Bases | 2015

Acquiring a Large Scale Polarity Lexicon Through Unsupervised Distributional Methods

Giuseppe Castellucci; Danilo Croce; Roberto Basili

The recent interest in Sentiment Analysis systems has drawn attention to the definition of effective methods to detect opinions and sentiments in texts with good accuracy. Many approaches found in the literature are based on hand-coded resources that model the prior polarity of words or multi-word expressions. The construction of such resources is in general expensive, and coverage issues arise with respect to the multiplicity of linguistic phenomena of sentiment expressions. This paper presents an automatic method for deriving a large-scale polarity lexicon based on Distributional Models of lexical semantics. Given a set of sentences annotated with polarity, we transfer the sentiment information from sentences to words. The set of annotated examples is derived from Twitter, and the polarity assignment to sentences is derived by simple heuristics. The approach is mostly unsupervised, and the experimental evaluation carried out on two Sentiment Analysis tasks shows the benefits of the generated resource.
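A toy version of the sentence-to-word transfer step described above can be sketched as follows: each word's polarity is the average polarity of the labeled sentences containing it. The sentences and labels are invented for the example, and this averaging stands in for the paper's distributional machinery.

```python
from collections import defaultdict

def build_polarity_lexicon(labeled_sentences):
    """Transfer sentence-level polarity to words: each word scores
    the mean polarity of the labeled sentences containing it."""
    totals, counts = defaultdict(float), defaultdict(int)
    for text, polarity in labeled_sentences:
        for word in set(text.lower().split()):
            totals[word] += polarity
            counts[word] += 1
    return {w: totals[w] / counts[w] for w in totals}

# Tweets labeled +1/-1 by a simple heuristic (e.g. emoticons).
data = [
    ("what a great phone", 1.0),
    ("great battery life", 1.0),
    ("terrible battery life", -1.0),
]
lex = build_polarity_lexicon(data)
print(lex["great"], lex["terrible"], lex["battery"])  # 1.0 -1.0 0.0
```

Words that occur in both positive and negative contexts, like "battery", end up near neutral, which is the desired behavior for a polarity lexicon.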


International Conference on Computational Linguistics | 2014

UNITOR: Aspect Based Sentiment Analysis with Structured Learning

Giuseppe Castellucci; Simone Filice; Danilo Croce; Roberto Basili

In this paper, the UNITOR system participating in the SemEval-2014 Aspect Based Sentiment Analysis competition is presented. The task is tackled by exploiting Kernel Methods within the Support Vector Machine framework. Aspect Term Extraction is modeled as a sequential tagging task, tackled through SVM-HMM. Aspect Term Polarity, Aspect Category and Aspect Category Polarity detection are tackled as classification problems where multiple kernels are linearly combined to generalize over several types of linguistic information. In the challenge, the UNITOR system achieves good results, scoring between the 2nd and the 8th position in almost all rankings, among about 30 competitors.

Collaboration


Dive into Danilo Croce's collaborations.

Top Co-Authors

Roberto Basili, University of Rome Tor Vergata
Giuseppe Castellucci, University of Rome Tor Vergata
Simone Filice, Qatar Computing Research Institute
Daniele Nardi, Sapienza University of Rome
Emanuele Bastianelli, University of Rome Tor Vergata
Diego De Cao, University of Rome Tor Vergata
Andrea Vanzo, Sapienza University of Rome
Paolo Annesi, University of Rome Tor Vergata
Valerio Storch, University of Rome Tor Vergata
Alessandro Moschitti, Qatar Computing Research Institute