Felice Dell'Orletta
University of Pisa
Publication
Featured research published by Felice Dell'Orletta.
North American Chapter of the Association for Computational Linguistics | 2009
Giuseppe Attardi; Felice Dell'Orletta
Deterministic transition-based Shift/Reduce dependency parsers often make mistakes in the analysis of long-span dependencies (McDonald & Nivre, 2007).
International World Wide Web Conferences | 2015
Stefano Cresci; Maurizio Tesconi; Andrea Cimino; Felice Dell'Orletta
This work focuses on the analysis of Italian social media messages for disaster management and aims to detect messages carrying critical information for the damage assessment task. A main novelty of this study is its focus on out-of-domain and cross-event damage detection, and on the investigation of the most relevant tweet-derived features for these tasks. We designed experiments using a wide set of linguistic features qualifying the lexical and grammatical structure of a text, as well as ad hoc features specifically implemented for this task, and investigated which features yield the best results. A further result of this study is the construction of the first manually annotated Italian corpus of social media messages for damage assessment.
Proceedings of the 9th Web as Corpus Workshop (WaC-9) | 2014
Verena Lyding; Egon W. Stemle; Claudia Borghetti; Marco Brunello; Sara Castagnoli; Felice Dell'Orletta; Henrik Dittmann; Alessandro Lenci; Vito Pirrelli
PAISÀ is a Creative Commons-licensed, large web corpus of contemporary Italian. We describe the design, harvesting, and processing steps involved in its creation.
Conference on Computational Natural Language Learning | 2008
Massimiliano Ciaramita; Giuseppe Attardi; Felice Dell'Orletta; Mihai Surdeanu
This paper describes the DeSRL system, a joint effort of Yahoo! Research Barcelona and Università di Pisa for the CoNLL-2008 Shared Task (Surdeanu et al., 2008). The system is characterized by an efficient pipeline of linear-complexity components, each carrying out a different sub-task. Classifier errors and ambiguities are addressed with several strategies: revision models, voting, and reranking. The system participated in the closed challenge, ranking third in the complete problem evaluation with the following scores: 82.06 labeled macro F1 for the overall task, 86.6 labeled attachment for syntactic dependencies, and 77.5 labeled F1 for semantic dependencies.
Meeting of the Association for Computational Linguistics | 2005
Felice Dell'Orletta; Alessandro Lenci; Simonetta Montemagni; Vito Pirrelli
In this paper, we discuss an application of Maximum Entropy to modeling the acquisition of subject and object processing in Italian. The model is able to learn from corpus data a set of experimentally and theoretically well-motivated linguistic constraints, as well as their relative salience in Italian grammar development and processing. The model is also shown to acquire robust syntactic generalizations by relying on the evidence provided by a small number of high token frequency verbs only. These results are consistent with current research focusing on the role of high frequency verbs in allowing children to converge on the most salient constraints in the grammar.
Workshop on Innovative Use of NLP for Building Educational Applications | 2014
Felice Dell'Orletta; Martijn Wieling; Giulia Venturi; Andrea Cimino; Simonetta Montemagni
The paper investigates the problem of sentence readability assessment, which is modelled as a classification task, with a specific view to text simplification. In particular, it addresses two open issues connected with it, i.e. the corpora to be used for training and the identification of the most effective features for determining sentence readability. An existing readability assessment tool developed for Italian was specialized at the level of training corpus and learning algorithm. A maximum entropy–based feature selection and ranking algorithm (grafting) was used to identify the most relevant features: it turned out that assessing the readability of sentences is a complex task, requiring a high number of features, mainly syntactic ones.
Intelligenza Artificiale | 2015
Giuseppe Attardi; Valerio Basile; Cristina Bosco; Tommaso Caselli; Felice Dell'Orletta; Simonetta Montemagni; Viviana Patti; Maria Simi; Rachele Sprugnoli
Shared task evaluation campaigns represent a well-established form of competitive evaluation, an important opportunity to propose and tackle new challenges for a specific research area, and a way to foster the development of benchmarks, tools and resources. The advantages of this approach are evident in any experimental field, including the area of Natural Language Processing. An outlook on state-of-the-art language technologies for Italian can be obtained by reflecting on the results of the recently held workshop “Evaluation of NLP and Speech Tools for Italian”, EVALITA 2014. The motivations underlying individual shared tasks, the level of knowledge and development achieved within each of them, the impact on applications, society and economy at large, as well as directions for future research, will be discussed from this perspective.
International Multiconference on Computer Science and Information Technology | 2009
Tommaso Caselli; Felice Dell'Orletta; Irina Prodanof
In this paper we present ongoing research on the development of a temporal expression tagger and normalizer for Italian, compliant with the TimeML specifications. Like other existing temporal expression taggers, the system is rule-based and benefits from an extensive corpus study to identify the reserved time words. However, it differs from other systems in that it implements WordNet-based semantic relations between temporal expressions in order to improve its accuracy. So far, the system reports an F-measure of 86.41% for the subtask of temporal expression detection and bracketing.
AIDA informazioni | 2008
Felice Dell'Orletta; Alessandro Lenci; Simone Marchi; Simonetta Montemagni; Vito Pirrelli
The paper focuses on the automatic extraction of domain knowledge from Italian legal texts and presents a fully implemented ontology learning system (T2K, Text-2-Knowledge) that includes a battery of tools for Natural Language Processing, statistical text analysis and machine learning. Evaluation results show the considerable potential of systems like T2K, which exploit an incremental interleaving of NLP and machine learning techniques for accurate large-scale semi-automatic extraction and structuring of domain-specific knowledge.
Meeting of the Association for Computational Linguistics | 2006
Felice Dell'Orletta; Alessandro Lenci; Simonetta Montemagni; Vito Pirrelli
The paper reports on a detailed quantitative analysis of distributional language data of both Italian and Czech, highlighting the relative contribution of a number of distributed grammatical factors to sentence-based identification of subjects and direct objects. The work uses a Maximum Entropy model of stochastic resolution of conflicting grammatical constraints and is demonstrably capable of putting explanatory theoretical accounts to the test of usage-based empirical verification.