Diego De Cao
University of Rome Tor Vergata
Publications
Featured research published by Diego De Cao.
Empirical Methods in Natural Language Processing | 2008
Marco Pennacchiotti; Diego De Cao; Roberto Basili; Danilo Croce; Michael Roth
Most attempts to integrate FrameNet in NLP systems have so far failed because of its limited coverage. In this paper, we investigate the applicability of distributional and WordNet-based models to the task of lexical unit induction, i.e. the expansion of FrameNet with new lexical units. Experimental results show that our distributional and WordNet-based models achieve good levels of accuracy and coverage, especially when combined.
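As a rough illustration of the distributional side of such a model, a candidate word can be scored against a frame's known lexical units by cosine similarity over co-occurrence vectors. This is a minimal sketch with toy, hypothetical vectors; the paper's actual feature space and similarity measure may differ.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two co-occurrence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_candidates(frame_lus, candidates, vectors):
    """Score each candidate word by its maximum similarity to the
    frame's known lexical units, then sort best-first."""
    scored = []
    for word in candidates:
        score = max(cosine(vectors[word], vectors[lu]) for lu in frame_lus)
        scored.append((word, score))
    return sorted(scored, key=lambda x: x[1], reverse=True)

# Toy distributional vectors (hypothetical co-occurrence counts).
vectors = {
    "buy":      [4.0, 1.0, 0.0],
    "purchase": [3.5, 1.2, 0.1],
    "acquire":  [2.0, 2.0, 0.5],
    "sleep":    [0.1, 0.2, 5.0],
}
# Expand a Commerce-like frame, seeded with "buy", with new lexical units.
ranking = rank_candidates(["buy"], ["purchase", "acquire", "sleep"], vectors)
```

Distributionally close verbs such as "purchase" rank highest, while an unrelated verb such as "sleep" falls to the bottom of the candidate list.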
Semantics in Text Processing. STEP 2008 Conference Proceedings | 2008
Diego De Cao; Danilo Croce; Marco Pennacchiotti; Roberto Basili
Models of lexical semantics are core paradigms in most NLP applications, such as dialogue, information extraction and document understanding. Unfortunately, the coverage of currently available resources (e.g. FrameNet) is still unsatisfactory. This paper presents a largely applicable approach for extending frame semantic resources, combining word sense information derived from WordNet and corpus-based distributional information. We report a large scale evaluation over the English FrameNet, and results on extending FrameNet to the Italian language, as the basis of the development of a full FrameNet for Italian.
International Conference on Computational Linguistics | 2009
Roberto Basili; Diego De Cao; Danilo Croce; Bonaventura Coppola; Alessandro Moschitti
Recent work on the transfer of semantic information across languages has been applied to the development of resources annotated with frame information for several non-English European languages. These efforts are based on the assumption that parallel corpora annotated for English can be used to transfer the semantic information to the other target languages. In this paper, a robust method based on a statistical machine translation step augmented with simple rule-based post-processing is presented. It alleviates problems related to preprocessing errors and the complex optimization required by syntax-dependent models of the cross-lingual mapping. Different alignment strategies are here investigated against the Europarl corpus. Results suggest that the quality of the derived annotations is surprisingly good and well suited for training semantic role labeling systems.
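The projection step underlying such transfer methods can be sketched as mapping annotated source-token spans onto the target sentence through word alignments. This is a toy example with a hypothetical English-Italian alignment; the paper's actual pipeline relies on statistical MT plus rule-based post-processing.

```python
def project_span(span, alignment):
    """Project an annotated source-token span (start, end), inclusive,
    onto the target sentence via word alignments, given as a set of
    (src_index, tgt_index) pairs.  Returns the smallest contiguous
    target span covering every aligned token, or None if nothing aligns."""
    targets = [t for s, t in alignment if span[0] <= s <= span[1]]
    return (min(targets), max(targets)) if targets else None

# Toy example: EN "John bought a car" -> IT "John ha comprato un'auto",
# with a hypothetical word alignment.
alignment = {(0, 0), (1, 1), (1, 2), (2, 3), (3, 3)}
agent = project_span((0, 0), alignment)   # role span over "John"
theme = project_span((2, 3), alignment)   # role span over "a car"
```

Rule-based post-processing would then repair spans broken by alignment noise, e.g. trimming punctuation or filling one-token gaps.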
Congress of the Italian Association for Artificial Intelligence | 2007
Roberto Basili; Diego De Cao; Cristina Giannone; Paolo Marocco
In this paper, a light framework for dialogue-based interactive question answering is presented. The resulting architecture is called REQUIRE (Robust Empirical QUestion answering for Intelligent Retrieval) and represents a flexible and adaptive platform for domain-specific dialogue. REQUIRE is characterized as a domain-driven dialogue system whose aim is to support the specific tasks evoked by interactive question answering scenarios. Among its benefits are its modularity and portability across different domains, its robustness through adaptive models of speech act recognition and planning, and its adherence to knowledge representation standards. The framework will be exemplified through its application within a sexual health information service tailored to young people.
International Workshop on Evaluation of Natural Language and Speech Tools for Italian, EVALITA 2011 | 2013
Roberto Basili; Diego De Cao; Alessandro Lenci; Alessandro Moschitti; Giulia Venturi
The Frame Labeling over Italian Texts (FLaIT) task, held within the EvalIta 2011 challenge, is here described. It focuses on the automatic annotation of free texts according to frame semantics. Systems were asked to label all semantic frames and their arguments, as evoked by predicate words occurring in plain text sentences. The proposed systems are based on a variety of learning techniques and achieve very good results, over 80% accuracy, in most subtasks.
International Conference on Move to Meaningful Internet Systems | 2007
Roberto Basili; Diego De Cao; Cristina Giannone
This paper proposes a model for ontological representation supporting task-oriented dialogue. The adoption of our ontology representation makes it possible to map an interactive Question Answering (iQA) task into a knowledge-based process. It supports dialogue control, speech act recognition, planning and natural language generation through a unified knowledge model. A platform for developing iQA systems in specific domains, called REQUIRE (Robust Empirical QUestion answering for Intelligent Retrieval), has been entirely developed over this model. The first prototype, developed for medical consulting in the sexual health domain, has recently been deployed and is currently under testing. It will serve as a basis for exemplifying the model and discussing its benefits.
Web Intelligence | 2009
Roberto Basili; Danilo Croce; Diego De Cao; Cristina Giannone
An ontology learning method, based on large-scale linguistic resources such as FrameNet and WordNet, is here discussed. A robust learning method is defined to assign semantic roles to domain-specific grammatical patterns according to distributional models of lexical semantics. Large-scale experimental results over an IE task show an accuracy of around 85-90%.
Lecture Notes in Computer Science | 2016
Giuseppe Castellucci; Danilo Croce; Diego De Cao; Roberto Basili
The huge variability of trends, community interests and jargon is a crucial challenge for the application of language technologies to Social Media analysis. Models, such as grammars and lexicons, are exposed to rapid obsolescence, due to the speed at which topics as well as slogans change over time. In Sentiment Analysis, several works dynamically acquire so-called opinionated lexicons: dictionaries in which information regarding the subjectivity aspects of individual words is described. This paper proposes an architecture for dynamic sentiment analysis over Twitter, combining structured learning and lexicon acquisition. Evidence of the beneficial effects of a dynamic architecture is reported through large-scale tests over Twitter streams in Italian.
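The lexicon-acquisition idea can be illustrated by propagating polarity from a small seed lexicon to unseen words through co-occurrence in the same tweet. This is a minimal co-occurrence sketch with hypothetical seeds and tweets; the paper's architecture combines this kind of acquisition with structured learning.

```python
def acquire_polarity(tweets, seeds, targets):
    """Estimate the polarity of unseen words from how often they
    co-occur with positive vs. negative seed words in the same tweet.
    Returns a score in [-1, 1] for each target word."""
    pos = {w: 0 for w in targets}
    neg = {w: 0 for w in targets}
    for tweet in tweets:
        tokens = set(tweet.lower().split())
        n_pos = len(tokens & seeds["pos"])
        n_neg = len(tokens & seeds["neg"])
        for w in targets:
            if w in tokens:
                pos[w] += n_pos
                neg[w] += n_neg
    return {w: (pos[w] - neg[w]) / max(pos[w] + neg[w], 1) for w in targets}

# Hypothetical seed lexicon and tweet stream; "epico" and "flop" are
# jargon terms absent from the static lexicon.
seeds = {"pos": {"good", "great"}, "neg": {"bad", "awful"}}
tweets = [
    "great show tonight epico",
    "epico match so good",
    "awful service what a flop",
    "total flop really bad",
]
scores = acquire_polarity(tweets, seeds, ["epico", "flop"])
```

Rerunning acquisition over a fresh stream lets the lexicon track new slang as it appears, which is the dynamic behavior the abstract argues for.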
International Conference on Computational Linguistics | 2010
Roberto Basili; Danilo Croce; Cristina Giannone; Diego De Cao
Techniques for the automatic acquisition of Information Extraction patterns are still a crucial issue in knowledge engineering. A semi-supervised learning method, based on large-scale linguistic resources such as FrameNet and WordNet, is discussed. In particular, a robust method for assigning conceptual relations (i.e. roles) to relevant grammatical structures is defined according to distributional models of lexical semantics over a large-scale corpus. Experimental results show that the use of the resulting knowledge base provides significant results, i.e. correct interpretations for about 90% of the covered sentences. This confirms the impact of the proposed approach on the quality and development time of large-scale IE systems.
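One way to picture the role-assignment step is to build a distributional prototype per role from seed fillers and assign each grammatical argument to the nearest prototype. A minimal sketch with toy, hypothetical 3-dimensional vectors; the paper's distributional space is built from a large-scale corpus.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two distributional vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(vecs):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def assign_role(arg_vec, role_seeds):
    """Assign a grammatical argument to the role whose seed-filler
    centroid is distributionally closest to it."""
    protos = {role: centroid(vecs) for role, vecs in role_seeds.items()}
    return max(protos, key=lambda role: cosine(arg_vec, protos[role]))

# Toy distributional vectors for known role fillers (hypothetical).
role_seeds = {
    "Agent":      [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "Instrument": [[0.1, 0.9, 0.1], [0.0, 0.8, 0.2]],
}
role = assign_role([0.85, 0.15, 0.05], role_seeds)  # close to the Agent seeds
```

The same prototype idea scales to any inventory of frame roles, since only the seed fillers per role change.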
Workshop on Image Analysis for Multimedia Interactive Services | 2009
Diego De Cao; Roberto Basili; Riccardo Petitti
An image classification model is here presented, based on the integration of visual and textual properties supported by complex kernel functions. Linguistic descriptions derived through Information Extraction from Web pages are integrated with the visual features of the corresponding images, according to independent kernel combinations. The impact of dimensionality reduction methods (i.e. LSA) and of proper combinations of redundant feature descriptions is also presented. The resulting workflow is largely applicable, as the comparative evaluation discussed here confirms.
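The kernel-combination idea can be sketched as a convex combination of independent textual and visual kernels. The toy feature vectors below are hypothetical, and the actual paper evaluates richer combinations (including LSA-reduced spaces); the sketch only shows why combining the two modalities is well defined.

```python
def linear_kernel(x, y):
    """Plain dot product between two feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def combined_kernel(text_x, text_y, vis_x, vis_y, alpha=0.5):
    """Convex combination of a textual and a visual kernel.  A convex
    combination of valid kernels is itself a valid kernel, so the
    result can be fed directly to any kernel machine (e.g. an SVM)."""
    return (alpha * linear_kernel(text_x, text_y)
            + (1 - alpha) * linear_kernel(vis_x, vis_y))

# Toy examples (hypothetical): two beach images and one city image,
# each with a textual and a visual feature vector.
beach1 = {"text": [1.0, 0.0], "vis": [1.0, 0.0]}
beach2 = {"text": [0.9, 0.1], "vis": [0.8, 0.2]}
city   = {"text": [0.0, 1.0], "vis": [0.1, 0.9]}

same  = combined_kernel(beach1["text"], beach2["text"],
                        beach1["vis"], beach2["vis"])
cross = combined_kernel(beach1["text"], city["text"],
                        beach1["vis"], city["vis"])
```

Tuning `alpha` trades off the two modalities; the same-class pair scores higher than the cross-class pair under the combined kernel.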