Diego De Cao
University of Rome Tor Vergata
Publications
Featured research published by Diego De Cao.
Empirical Methods in Natural Language Processing | 2008
Marco Pennacchiotti; Diego De Cao; Roberto Basili; Danilo Croce; Michael Roth
Most attempts to integrate FrameNet in NLP systems have so far failed because of its limited coverage. In this paper, we investigate the applicability of distributional and WordNet-based models to the task of lexical unit induction, i.e. the expansion of FrameNet with new lexical units. Experimental results show that our distributional and WordNet-based models achieve good levels of accuracy and coverage, especially when combined.
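As a rough illustration of the distributional side of such a model, a candidate word can be scored against a frame's known lexical units by cosine similarity over co-occurrence vectors. This is a minimal sketch with toy, hypothetical vectors; the paper's actual feature space and similarity measure may differ.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two co-occurrence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_candidates(frame_lus, candidates, vectors):
    """Score each candidate word by its maximum similarity to the
    frame's known lexical units, then sort best-first."""
    scored = []
    for word in candidates:
        score = max(cosine(vectors[word], vectors[lu]) for lu in frame_lus)
        scored.append((word, score))
    return sorted(scored, key=lambda x: x[1], reverse=True)

# Toy distributional vectors (hypothetical co-occurrence counts).
vectors = {
    "buy":      [4.0, 1.0, 0.0],
    "purchase": [3.5, 1.2, 0.1],
    "acquire":  [2.0, 2.0, 0.5],
    "sleep":    [0.1, 0.2, 5.0],
}
# Expand a Commerce-like frame, seeded with "buy", with new lexical units.
ranking = rank_candidates(["buy"], ["purchase", "acquire", "sleep"], vectors)
```

Distributionally close verbs such as "purchase" rank highest, while an unrelated verb such as "sleep" falls to the bottom of the candidate list.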
Semantics in Text Processing. STEP 2008 Conference Proceedings | 2008
Diego De Cao; Danilo Croce; Marco Pennacchiotti; Roberto Basili
Models of lexical semantics are core paradigms in most NLP applications, such as dialogue, information extraction and document understanding. Unfortunately, the coverage of currently available resources (e.g. FrameNet) is still unsatisfactory. This paper presents a largely applicable approach for extending frame semantic resources, combining word sense information derived from WordNet and corpus-based distributional information. We report a large scale evaluation over the English FrameNet, and results on extending FrameNet to the Italian language, as the basis of the development of a full FrameNet for Italian.
International Conference on Computational Linguistics | 2009
Roberto Basili; Diego De Cao; Danilo Croce; Bonaventura Coppola; Alessandro Moschitti
Recent work on the transfer of semantic information across languages has been applied to the development of resources annotated with frame information for several non-English European languages. These efforts are based on the assumption that parallel corpora annotated for English can be used to transfer the semantic information to the other target languages. In this paper, a robust method based on a statistical machine translation step augmented with simple rule-based post-processing is presented. It alleviates problems related to preprocessing errors and the complex optimization required by syntax-dependent models of the cross-lingual mapping. Different alignment strategies are here investigated against the Europarl corpus. Results suggest that the quality of the derived annotations is surprisingly good and well suited for training semantic role labeling systems.
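The projection step underlying such transfer methods can be sketched as mapping annotated source-token spans onto the target sentence through word alignments. This is a toy example with a hypothetical English-Italian alignment; the paper's actual pipeline relies on statistical MT plus rule-based post-processing.

```python
def project_span(span, alignment):
    """Project an annotated source-token span (start, end), inclusive,
    onto the target sentence via word alignments, given as a set of
    (src_index, tgt_index) pairs.  Returns the smallest contiguous
    target span covering every aligned token, or None if nothing aligns."""
    targets = [t for s, t in alignment if span[0] <= s <= span[1]]
    return (min(targets), max(targets)) if targets else None

# Toy example: EN "John bought a car" -> IT "John ha comprato un'auto",
# with a hypothetical word alignment.
alignment = {(0, 0), (1, 1), (1, 2), (2, 3), (3, 3)}
agent = project_span((0, 0), alignment)   # role span over "John"
theme = project_span((2, 3), alignment)   # role span over "a car"
```

Rule-based post-processing would then repair spans broken by alignment noise, e.g. trimming punctuation or filling one-token gaps.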
Congress of the Italian Association for Artificial Intelligence | 2007
Roberto Basili; Diego De Cao; Cristina Giannone; Paolo Marocco
In this paper, a light framework for dialogue-based interactive question answering is presented. The resulting architecture is called REQUIRE (Robust Empirical QUestion answering for Intelligent Retrieval) and represents a flexible and adaptive platform for domain-specific dialogue. REQUIRE is characterized as a domain-driven dialogue system whose aim is to support the specific tasks evoked by interactive question answering scenarios. Among its benefits are its modularity and portability across different domains, its robustness through adaptive models of speech act recognition and planning, and its adherence to knowledge representation standards. The framework will be exemplified through its application within a sexual health information service tailored to young people.
International Workshop on Evaluation of Natural Language and Speech Tools for Italian, EVALITA 2011 | 2013
Roberto Basili; Diego De Cao; Alessandro Lenci; Alessandro Moschitti; Giulia Venturi
The Frame Labeling over Italian Texts (FLaIT) task, held within the EvalIta 2011 challenge, is here described. It focuses on the automatic annotation of free texts according to frame semantics. Systems were asked to label all semantic frames and their arguments, as evoked by predicate words occurring in plain text sentences. The proposed systems are based on a variety of learning techniques and achieve very good results, over 80% accuracy, in most subtasks.
International Conference on Move to Meaningful Internet Systems | 2007
Roberto Basili; Diego De Cao; Cristina Giannone
This paper proposes a model for ontological representation supporting task-oriented dialogue. The adoption of our ontology representation makes it possible to map an interactive Question Answering (iQA) task into a knowledge-based process. It supports dialogue control, speech act recognition, planning and natural language generation through a unified knowledge model. A platform for developing iQA systems in specific domains, called REQUIRE (Robust Empirical QUestion answering for Intelligent Retrieval), has been entirely developed over this model. The first prototype, developed for medical consulting in the sexual health domain, has recently been deployed and is currently under testing. It will serve as a basis for exemplifying the model and discussing its benefits.
Web Intelligence | 2009
Roberto Basili; Danilo Croce; Diego De Cao; Cristina Giannone
An ontology learning method, based on large-scale linguistic resources such as FrameNet and WordNet, is here discussed. A robust learning method is defined to assign semantic roles to domain-specific grammatical patterns according to distributional models of lexical semantics. Large-scale experimental results over an IE task show an accuracy of around 85-90%.
Lecture Notes in Computer Science | 2016
Giuseppe Castellucci; Danilo Croce; Diego De Cao; Roberto Basili
The huge variability of trends, community interests and jargon is a crucial challenge for the application of language technologies to Social Media analysis. Models, such as grammars and lexicons, are exposed to rapid obsolescence, due to the speed at which topics as well as slogans change over time. In Sentiment Analysis, several works dynamically acquire so-called opinionated lexicons: dictionaries in which information regarding the subjectivity aspects of individual words is described. This paper proposes an architecture for dynamic sentiment analysis over Twitter, combining structured learning and lexicon acquisition. Evidence of the beneficial effects of a dynamic architecture is reported through large-scale tests over Twitter streams in Italian.
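The lexicon-acquisition idea can be illustrated by propagating polarity from a small seed lexicon to unseen words through co-occurrence in the same tweet. This is a minimal co-occurrence sketch with hypothetical seeds and tweets; the paper's architecture combines this kind of acquisition with structured learning.

```python
def acquire_polarity(tweets, seeds, targets):
    """Estimate the polarity of unseen words from how often they
    co-occur with positive vs. negative seed words in the same tweet.
    Returns a score in [-1, 1] for each target word."""
    pos = {w: 0 for w in targets}
    neg = {w: 0 for w in targets}
    for tweet in tweets:
        tokens = set(tweet.lower().split())
        n_pos = len(tokens & seeds["pos"])
        n_neg = len(tokens & seeds["neg"])
        for w in targets:
            if w in tokens:
                pos[w] += n_pos
                neg[w] += n_neg
    return {w: (pos[w] - neg[w]) / max(pos[w] + neg[w], 1) for w in targets}

# Hypothetical seed lexicon and tweet stream; "epico" and "flop" are
# jargon terms absent from the static lexicon.
seeds = {"pos": {"good", "great"}, "neg": {"bad", "awful"}}
tweets = [
    "great show tonight epico",
    "epico match so good",
    "awful service what a flop",
    "total flop really bad",
]
scores = acquire_polarity(tweets, seeds, ["epico", "flop"])
```

Rerunning acquisition over a fresh stream lets the lexicon track new slang as it appears, which is the dynamic behavior the abstract argues for.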
International Conference on Computational Linguistics | 2010
Roberto Basili; Danilo Croce; Cristina Giannone; Diego De Cao
Techniques for the automatic acquisition of Information Extraction patterns are still a crucial issue in knowledge engineering. A semi-supervised learning method, based on large-scale linguistic resources such as FrameNet and WordNet, is discussed. In particular, a robust method for assigning conceptual relations (i.e. roles) to relevant grammatical structures is defined according to distributional models of lexical semantics over a large-scale corpus. Experimental results show that the use of the resulting knowledge base provides significant results, i.e. correct interpretations for about 90% of the covered sentences. This confirms the impact of the proposed approach on the quality and development time of large-scale IE systems.
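One way to picture the role-assignment step is to build a distributional prototype per role from seed fillers and assign each grammatical argument to the nearest prototype. A minimal sketch with toy, hypothetical 3-dimensional vectors; the paper's distributional space is built from a large-scale corpus.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two distributional vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(vecs):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def assign_role(arg_vec, role_seeds):
    """Assign a grammatical argument to the role whose seed-filler
    centroid is distributionally closest to it."""
    protos = {role: centroid(vecs) for role, vecs in role_seeds.items()}
    return max(protos, key=lambda role: cosine(arg_vec, protos[role]))

# Toy distributional vectors for known role fillers (hypothetical).
role_seeds = {
    "Agent":      [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "Instrument": [[0.1, 0.9, 0.1], [0.0, 0.8, 0.2]],
}
role = assign_role([0.85, 0.15, 0.05], role_seeds)  # close to the Agent seeds
```

The same prototype idea scales to any inventory of frame roles, since only the seed fillers per role change.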
Workshop on Image Analysis for Multimedia Interactive Services | 2009
Diego De Cao; Roberto Basili; Riccardo Petitti
An image classification model is here presented, based on the integration of visual and textual properties supported by complex kernel functions. Linguistic descriptions derived through Information Extraction from Web pages are integrated with the visual features of the corresponding images, according to independent kernel combinations. The impact of dimensionality reduction methods (i.e. LSA) and of proper combinations of redundant feature descriptions is also presented. The resulting workflow is largely applicable, as the comparative evaluation discussed here confirms.
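The kernel-combination idea can be sketched as a convex combination of independent textual and visual kernels. The toy feature vectors below are hypothetical, and the actual paper evaluates richer combinations (including LSA-reduced spaces); the sketch only shows why combining the two modalities is well defined.

```python
def linear_kernel(x, y):
    """Plain dot product between two feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def combined_kernel(text_x, text_y, vis_x, vis_y, alpha=0.5):
    """Convex combination of a textual and a visual kernel.  A convex
    combination of valid kernels is itself a valid kernel, so the
    result can be fed directly to any kernel machine (e.g. an SVM)."""
    return (alpha * linear_kernel(text_x, text_y)
            + (1 - alpha) * linear_kernel(vis_x, vis_y))

# Toy examples (hypothetical): two beach images and one city image,
# each with a textual and a visual feature vector.
beach1 = {"text": [1.0, 0.0], "vis": [1.0, 0.0]}
beach2 = {"text": [0.9, 0.1], "vis": [0.8, 0.2]}
city   = {"text": [0.0, 1.0], "vis": [0.1, 0.9]}

same  = combined_kernel(beach1["text"], beach2["text"],
                        beach1["vis"], beach2["vis"])
cross = combined_kernel(beach1["text"], city["text"],
                        beach1["vis"], city["vis"])
```

Tuning `alpha` trades off the two modalities; the same-class pair scores higher than the cross-class pair under the combined kernel.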