Vito Pirrelli
National Research Council
Publication
Featured research published by Vito Pirrelli.
Lecture Notes in Computer Science | 2004
Roberto Bartolini; Alessandro Lenci; Simonetta Montemagni; Vito Pirrelli; Claudia Soria
In this paper we address the problem of automatically enriching legal texts with semantic annotation, an essential prerequisite to effective indexing and retrieval of legal documents. This is done through illustration of SALEM (Semantic Annotation for LEgal Management), a computational system developed for automated semantic annotation of (Italian) law texts. SALEM is an incremental system using Natural Language Processing techniques to perform two tasks: i) classify law paragraphs according to their regulatory content, and ii) extract relevant text fragments corresponding to specific semantic roles that are relevant for the different types of regulatory content. The paper sketches the overall architecture of SALEM and reports results of a preliminary case study on a sample of Italian law texts.
Proceedings of the 9th Web as Corpus Workshop (WaC-9) | 2014
Verena Lyding; Egon W. Stemle; Claudia Borghetti; Marco Brunello; Sara Castagnoli; Felice Dell'Orletta; Henrik Dittmann; Alessandro Lenci; Vito Pirrelli
PAISA is a Creative Commons licensed, large web corpus of contemporary Italian. We describe the design, harvesting, and processing steps involved in its creation.
Meeting of the Association for Computational Linguistics | 2005
Felice Dell'Orletta; Alessandro Lenci; Simonetta Montemagni; Vito Pirrelli
In this paper, we discuss an application of Maximum Entropy to modeling the acquisition of subject and object processing in Italian. The model is able to learn from corpus data a set of experimentally and theoretically well-motivated linguistic constraints, as well as their relative salience in Italian grammar development and processing. The model is also shown to acquire robust syntactic generalizations by relying on the evidence provided by a small number of high token frequency verbs only. These results are consistent with current research focusing on the role of high frequency verbs in allowing children to converge on the most salient constraints in the grammar.
International Conference on Computational Linguistics | 2002
Roberto Bartolini; Alessandro Lenci; Simonetta Montemagni; Vito Pirrelli
In the paper we report a qualitative evaluation of the performance of a dependency analyser of Italian that runs in both a non-lexicalised and a lexicalised mode. Results shed light on the contribution of types of lexical information to parsing.
Natural Language Engineering | 1999
Stefano Federici; Simonetta Montemagni; Vito Pirrelli
The paper describes SENSE, a word sense disambiguation system which makes use of multidimensional analogy-based proportions to infer the most likely sense of a word given its context. Architecture and functioning of the system are illustrated in detail. Results of different experimental settings are given, showing that the system, in spite of its conservative bias, successfully copes with the problem of training data sparseness.
Lingue e linguaggio | 2012
Claudia Marzi; Marcello Ferro; Vito Pirrelli
The variety of morphological processes attested in […] and combinations thereof pose severe problems to unsupervised algorithms of morphology induction. The paper analyses morphological strategies for word recoding. Our model endorses the hypothesis that lexical forms are memorised as full units. At the same time, lexical units are paradigmatically organised. We show that the overall amount of redundant morphological structure emerging from paradigm-based self-organisation has a clear impact on generalisation. This supports the view that issues of word representation and issues of word processing are […].
Lingue e linguaggio | 2010
Marcello Ferro; Giovanni Pezzulo; Vito Pirrelli
Recent experimental evidence on morphological learning and processing has prompted a less deterministic and modular view of the interaction between stored word knowledge and on-line processing. Storing a word in the mental lexicon does not simply entail keeping a faithful memory image of that word in the most compact way. It also requires encoding and manipulating such an image through topological structures that are optimally adapted to word production and comprehension. Temporal Self-Organizing Maps (THSOMs) are a novel model of artificial neural network that keeps time serial information through predictive activation chains of receptors encoding both spatial and temporal information of input stimuli. The impact of this model on issues of lexical organization and morphological processing is investigated in detail through a series of simulations shedding light on the dynamics between short-term memory (activation), long-term memory (learning) and morphological organization of stored word forms (topology).
Meeting of the Association for Computational Linguistics | 2004
Vito Pirrelli; Ivan Herreros; Basilio Calderone; Michele Virgilio
The paper reports on the behaviour of a Kohonen map of the mental lexicon, monitored through different phases of acquisition of the Italian verb system. Reported experiments appear to consistently reproduce emergent global ordering constraints on memory traces of inflected verb forms, developed through principles of local interactions between parallel processing neurons.
Computers and the Humanities | 2000
Stefano Federici; Simonetta Montemagni; Vito Pirrelli
The paper describes SENSE, a word sense disambiguation system that makes use of different types of cues to infer the most likely sense of a word given its context. Architecture and functioning of the system are briefly illustrated. Results are given for the ROMANSEVAL Italian test corpus of verbs.
International Joint Conference on Artificial Intelligence | 1996
Stefano Federici; Vito Pirrelli; Francois Yvon
When looked at from a multilingual perspective, grapheme-to-phoneme conversion is a challenging task, fraught with most of the classical NLP "vexed questions": the bottleneck problem of data acquisition, pervasiveness of exceptions, difficulty of stating the range and order of rule application, proper treatment of context-sensitive phenomena and long-distance dependencies, and so on. The hand-crafting of transcription rules by a human expert is onerous and time-consuming, and yet, for some European languages, still stops short of a level of correctness and accuracy acceptable for practical applications. We illustrate here a self-learning multilingual system for analogy-based pronunciation which was tested on Italian, English and French, and whose performance is assessed against the output of both statistical and rule-based transcribers. The general point is made that analogy-based self-learning techniques are no longer just psycholinguistically plausible models, but competitive tools, combining the advantages of using language-independent, self-learning, tractable algorithms, with the welcome bonus of being more reliable for applications than traditional text-to-speech systems.