Stefano Faralli
Sapienza University of Rome
Publications
Featured research published by Stefano Faralli.
International Joint Conference on Artificial Intelligence | 2011
Roberto Navigli; Paola Velardi; Stefano Faralli
In this paper we present a graph-based approach aimed at learning a lexical taxonomy automatically starting from a domain corpus and the Web. Unlike many taxonomy learning approaches in the literature, our novel algorithm learns both concepts and relations entirely from scratch via the automated extraction of terms, definitions and hypernyms. This results in a very dense, cyclic and possibly disconnected hypernym graph. The algorithm then induces a taxonomy from the graph. Our experiments show that we obtain high-quality results, both when building brand-new taxonomies and when reconstructing WordNet sub-hierarchies.
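As an illustrative aside (not the authors' implementation), the sketch below shows one simplified way to turn noisy, weighted hypernym edges into a tree-shaped taxonomy: keep each term's best-scoring hypernym and break any remaining cycles at their weakest edge. The edges and weights are invented toy data.

```python
# A minimal sketch of inducing a taxonomy from a dense, cyclic hypernym graph,
# assuming (hyponym, hypernym, weight) triples have already been extracted.
# This greedy pruning is a simplification, not the paper's exact algorithm.
import networkx as nx

edges = [
    ("espresso", "coffee", 0.9),
    ("coffee", "beverage", 0.8),
    ("beverage", "liquid", 0.6),
    ("espresso", "beverage", 0.3),   # redundant hypernym
    ("liquid", "coffee", 0.1),       # cycle-inducing noise
]

graph = nx.DiGraph()
graph.add_weighted_edges_from(edges)

# Keep only the best-scoring hypernym for each term.
taxonomy = nx.DiGraph()
for term in graph.nodes:
    parents = list(graph.out_edges(term, data="weight"))
    if parents:
        hyponym, hypernym, w = max(parents, key=lambda e: e[2])
        taxonomy.add_edge(hyponym, hypernym, weight=w)

# Break any remaining cycles by removing their weakest edge.
while not nx.is_directed_acyclic_graph(taxonomy):
    cycle = nx.find_cycle(taxonomy)
    weakest = min(cycle, key=lambda e: taxonomy.edges[e]["weight"])
    taxonomy.remove_edge(*weakest)

print(sorted(taxonomy.edges))   # espresso -> coffee -> beverage -> liquid
```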
Computational Linguistics | 2013
Paola Velardi; Stefano Faralli; Roberto Navigli
In 2004 we published in this journal an article describing OntoLearn, one of the first systems to automatically induce a taxonomy from documents and Web sites. Since then, OntoLearn has continued to be an active area of research in our group and has become a reference work within the community. In this paper we describe our next-generation taxonomy learning methodology, which we name OntoLearn Reloaded. Unlike many taxonomy learning approaches in the literature, our novel algorithm learns both concepts and relations entirely from scratch via the automated extraction of terms, definitions, and hypernyms. This results in a very dense, cyclic and potentially disconnected hypernym graph. The algorithm then induces a taxonomy from this graph via optimal branching and a novel weighting policy. Our experiments show that we obtain high-quality results, both when building brand-new taxonomies and when reconstructing sub-hierarchies of existing taxonomies.
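The optimal-branching step can be pictured with a small sketch. This toy example uses Edmonds' algorithm (maximum branching) from networkx on invented edges and weights; it is not OntoLearn Reloaded's actual weighting policy.

```python
# A minimal sketch of optimal branching over a weighted, possibly cyclic
# hypernym graph (edges point from hypernym to hyponym here): Edmonds'
# algorithm keeps at most one incoming edge per node and removes cycles
# while maximizing the total edge weight.
import networkx as nx

graph = nx.DiGraph()
graph.add_weighted_edges_from([
    ("science", "computer science", 0.9),
    ("science", "linguistics", 0.8),
    ("computer science", "artificial intelligence", 0.7),
    ("linguistics", "artificial intelligence", 0.4),  # competing hypernym
    ("artificial intelligence", "science", 0.1),      # noisy cycle
])

branching = nx.maximum_branching(graph, attr="weight")
print(sorted(branching.edges()))
```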
Conference on Information and Knowledge Management | 2011
Roberto Navigli; Stefano Faralli; Aitor Soroa; Oier Lopez de Lacalle; Eneko Agirre
In this paper we present a novel approach to learning semantic models for multiple domains, which we use to categorize Wikipedia pages and to perform domain Word Sense Disambiguation (WSD). In order to learn a semantic model for each domain we first extract relevant terms from the texts in the domain and then use these terms to initialize a random walk over the WordNet graph. Given an input text, we check the semantic models, choose the appropriate domain for that text and use the best-matching model to perform WSD. Our results show considerable improvements on text categorization and domain WSD tasks.
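The random-walk idea can be illustrated with a personalized PageRank over a toy graph that stands in for the WordNet graph; the nodes, edges, and seed terms below are invented for illustration and are not the paper's data.

```python
# A minimal sketch of building a domain semantic model: domain terms seed a
# personalized PageRank, and the stationary distribution scores every node
# in the semantic graph for that domain.
import networkx as nx

wordnet_like = nx.Graph([
    ("tennis", "racket"), ("tennis", "sport"), ("sport", "football"),
    ("racket", "string"), ("finance", "bank"), ("bank", "money"),
    ("sport", "game"), ("game", "chess"),
])

domain_terms = {"tennis", "football"}          # terms extracted from domain texts
seeds = {n: (1.0 if n in domain_terms else 0.0) for n in wordnet_like}

# The walker restarts at the domain seeds, so nodes close to the domain
# terms receive higher probability mass.
domain_model = nx.pagerank(wordnet_like, alpha=0.85, personalization=seeds)

for node, score in sorted(domain_model.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{node:10s} {score:.3f}")
```

Given an input text, one would score it against each domain model and hand it to the best-matching one for disambiguation.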
North American Chapter of the Association for Computational Linguistics | 2015
Georgeta Bordea; Paul Buitelaar; Stefano Faralli; Roberto Navigli
This paper describes the first shared task on Taxonomy Extraction Evaluation, organised as part of SemEval-2015. Participants were asked to find hypernym-hyponym relations between given terms. For each of the four selected target domains, the participants were provided with two lists of domain-specific terms: a collection of terms from WordNet and a well-known terminology extracted from a publicly available online taxonomy. A total of 45 taxonomies submitted by 6 participating teams were evaluated using standard structural measures, structural similarity with a gold-standard taxonomy, and manual quality assessment of sampled novel relations.
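A minimal sketch of one part of such an evaluation, comparing a submitted taxonomy against a gold standard by precision, recall, and F1 over hypernym-hyponym edges (the task also used structural measures and manual assessment not shown here); the edge sets are invented toy data.

```python
# Edge-level comparison of a submitted taxonomy with a gold standard.
gold = {("espresso", "coffee"), ("coffee", "beverage"), ("tea", "beverage")}
submitted = {("espresso", "coffee"), ("coffee", "drink"), ("tea", "beverage")}

tp = len(gold & submitted)                 # correctly recovered edges
precision = tp / len(submitted)
recall = tp / len(gold)
f1 = 2 * precision * recall / (precision + recall)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```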
International Semantic Web Conference | 2016
Stefano Faralli; Alexander Panchenko; Chris Biemann; Simone Paolo Ponzetto
We present a new hybrid lexical knowledge base that combines the contextual information of distributional models with the conciseness and precision of manually constructed lexical networks. The computation of our count-based distributional model includes the induction of word senses for single-word and multi-word terms, the disambiguation of word similarity lists, taxonomic relations extracted by patterns, and context clues for disambiguation in context. In contrast to dense vector representations, our resource is human readable and interpretable, and thus can be easily embedded within the Semantic Web ecosystem.
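Graph-based word sense induction of this kind can be pictured with a small sketch: a target word's nearest neighbours are connected by their pairwise similarities and clustered, with each cluster read as one induced sense. The neighbours and similarity values below are invented, and the clustering here is plain connected components rather than the authors' method.

```python
# A minimal sketch of word sense induction from a word-similarity ego network.
import networkx as nx

target = "python"
neighbour_sim = {                      # similarities between neighbours of "python"
    ("java", "perl"): 0.7, ("java", "ruby"): 0.6, ("perl", "ruby"): 0.65,
    ("cobra", "viper"): 0.8, ("cobra", "boa"): 0.75, ("viper", "boa"): 0.7,
    ("java", "cobra"): 0.05,           # weak cross-sense link
}

ego = nx.Graph()
for (a, b), sim in neighbour_sim.items():
    if sim >= 0.3:                     # drop weak edges before clustering
        ego.add_edge(a, b, weight=sim)

# Each connected component of the pruned ego network is one induced sense.
for i, sense_cluster in enumerate(nx.connected_components(ego)):
    print(f"{target}#{i}: {sorted(sense_cluster)}")
```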
Archive | 2008
Paolo Bottoni; Stefano Faralli; Anna Labella; Alessio Malizia; Mario Pierro; Semi Ryu
Puppetry is one of the most ancient forms of representation, found all over the world in different shapes, with varying degrees of freedom in movement and forms of manipulation. Puppets provide an ideal environment for creative collaboration, inspiring the development of supporting technologies (from carpentry to digital worlds). The CoPuppet project explores the possibilities offered by multimodal and cooperative interaction, in which performers, or even audience members, are called upon to affect different parts of a puppet through gestures and voice. In particular, we exploit an existing architecture for the creation of multimodal interfaces to develop the CoPuppet framework for designing, deploying and interacting during performances in which virtual puppets are steered by multiple multimodal controls. The paper illustrates the steps needed to define performances, also showing the integration of digital and real puppetry for the case of wayang shadow play.
North American Chapter of the Association for Computational Linguistics | 2016
Alexander Panchenko; Stefano Faralli; Eugen Ruppert; Steffen Remus; Hubert Naets; Cédrick Fairon; Simone Paolo Ponzetto; Chris Biemann
We present a system for taxonomy construction that reached the first place in all subtasks of the SemEval 2016 challenge on Taxonomy Extraction Evaluation. Our simple yet effective approach harvests hypernyms with substring inclusion and Hearst-style lexicosyntactic patterns from domain-specific texts obtained via language model based focused crawling. Extracted taxonomies are evaluated on English, Dutch, French and Italian for three domains each (Food, Environment and Science). Evaluations against a gold standard and by human judgment show that our method outperforms more complex and knowledge-rich approaches on most domains and languages. Furthermore, to adapt the method to a new domain or language, only a small amount of manual labour is needed.
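The two hypernym sources named above can be illustrated with a short sketch: substring inclusion (a term containing another term is its likely hyponym) and a single Hearst-style pattern. The terms, sentence, and regular expression below are invented toy data; the full system uses many more patterns and focused crawling to collect its input texts.

```python
# A minimal sketch of hypernym harvesting by substring inclusion and one
# Hearst-style lexico-syntactic pattern.
import re

terms = ["juice", "apple juice", "beverage"]
text = "The shop sells beverages such as juice, soda and cider at low prices."

candidates = set()

# Substring inclusion: "apple juice" is a likely hyponym of "juice".
for hypo in terms:
    for hyper in terms:
        if hypo != hyper and hyper in hypo.split():
            candidates.add((hypo, hyper))

# Hearst pattern: "X such as Y1, Y2 and Y3".
pattern = re.compile(r"(\w+) such as ((?:\w+)(?:, \w+)*(?: and \w+)?)", re.IGNORECASE)
for match in pattern.finditer(text):
    hypernym = match.group(1).lower().rstrip("s")          # crude singularization
    for hyponym in re.split(r",| and ", match.group(2)):
        if hyponym.strip():
            candidates.add((hyponym.strip().lower(), hypernym))

print(sorted(candidates))
```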
Social Network Analysis and Mining | 2015
Stefano Faralli; Giovanni Stilo; Paola Velardi
A number of recent works have concentrated on the task of recommending to Twitter users whom they should follow, among them the WTF (Who To Follow) service provided by Twitter. Recommenders are based either on the user's network structure, on some notion of topical similarity with other users, or on both. In this paper, we propose to accomplish the recommendation task in two steps: first, we profile users and classify them as belonging to a target community (depending, e.g., on their political affiliation, preferred football team, favorite coffee shop, etc.); then, we fine-tune recommendations for selected populations. We cast both problems of user classification and recommendation as itemset mining, where items are either users' authoritative friends or semantic categories associated with friends, extracted from WiBi, the Wikipedia Bitaxonomy. In addition to evaluating our profiler and recommender on several populations, we also show that semantic categories allow for very fine-grained population studies, and make it possible to recommend not only whom to follow, but also topics of interest, users interested in the same topic, and more.
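The itemset-mining idea can be sketched as follows: users are represented by the sets of accounts they follow, frequent itemsets within a labelled target community become the community profile, and a new user is classified by whether they cover one of those itemsets. The accounts, users, and support threshold below are invented toy data, and this simple pair-counting stands in for a full frequent-itemset miner.

```python
# A minimal sketch of profiling a community via frequent itemsets of friends.
from itertools import combinations
from collections import Counter

community_users = {                       # known members of the target community
    "u1": {"@TeamA", "@SportsNews", "@CoachX"},
    "u2": {"@TeamA", "@SportsNews", "@TalkShow"},
    "u3": {"@TeamA", "@CoachX", "@SportsNews"},
}
min_support = 2

# Count every pair of followed accounts across the community.
pair_counts = Counter()
for friends in community_users.values():
    for pair in combinations(sorted(friends), 2):
        pair_counts[pair] += 1

profile = {pair for pair, count in pair_counts.items() if count >= min_support}

def matches_profile(friends, profile):
    """A new user matches if their friends cover at least one frequent pair."""
    return any(set(pair) <= friends for pair in profile)

print(matches_profile({"@TeamA", "@SportsNews", "@Foodie"}, profile))  # True
print(matches_profile({"@Foodie", "@TravelPics"}, profile))            # False
```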
Empirical Methods in Natural Language Processing | 2017
Alexander Panchenko; Fide Marten; Eugen Ruppert; Stefano Faralli; Dmitry Ustalov; Simone Paolo Ponzetto; Chris Biemann
Interpretability of a predictive model is a powerful feature that helps gain users' trust in the correctness of its predictions. In word sense disambiguation (WSD), knowledge-based systems tend to be much more interpretable than knowledge-free counterparts, as they rely on a wealth of manually encoded elements representing word senses, such as hypernyms, usage examples, and images. We present a WSD system that bridges the gap between these two so far disconnected groups of methods. Namely, our system, which provides access to several state-of-the-art WSD models, aims to be interpretable like a knowledge-based system while remaining completely unsupervised and knowledge-free. The presented tool features a Web interface for all-word disambiguation of texts that makes the sense predictions human readable by providing interpretable word sense inventories, sense representations, and disambiguation results. We provide a public API, enabling seamless integration.
Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications | 2017
Alexander Panchenko; Stefano Faralli; Simone Paolo Ponzetto; Chris Biemann
We introduce a new method for unsupervised knowledge-based word sense disambiguation (WSD) based on a resource that links two types of sense-aware lexical networks: one is induced from a corpus using distributional semantics, the other is manually constructed. The combination of the two networks reduces the sparsity of the sense representations used for WSD. We evaluate these enriched representations on two lexical-sample sense disambiguation benchmarks. Our results indicate that (1) features extracted from the corpus-based resource help to significantly outperform a model based solely on the lexical resource; (2) our method achieves results comparable to or better than four state-of-the-art unsupervised knowledge-based WSD systems, including three hybrid systems that also rely on text corpora. In contrast to these hybrid methods, our approach does not require access to web search engines, texts mapped to a sense inventory, or machine translation systems.
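The use of enriched sense representations for disambiguation can be pictured with a small sketch: each sense of a target word is described by a bag of related words, and the sense with the highest overlap with the context wins. The toy inventory and scoring below are invented for illustration and stand in for the linked resources and features described in the paper.

```python
# A minimal sketch of WSD via overlap between a context and sense representations.
sense_inventory = {
    "bank#1": {"money", "account", "loan", "deposit", "finance"},
    "bank#2": {"river", "shore", "water", "slope", "erosion"},
}

def disambiguate(context_words, inventory):
    """Return the sense whose representation shares the most words with the context."""
    context = set(context_words)
    return max(inventory, key=lambda sense: len(inventory[sense] & context))

context = "she walked along the bank of the river to the water".split()
print(disambiguate(context, sense_inventory))   # bank#2
```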