Valerio Basile
University of Groningen
Publication
Featured research published by Valerio Basile.
Handbook of Linguistic Annotation | 2017
Johan Bos; Valerio Basile; Kilian Evang; Noortje Venhuizen; Johannes Bjerva
The goal of the Groningen Meaning Bank (GMB) is to obtain a large corpus of English texts annotated with formal meaning representations. Since manually annotating a comprehensive corpus with deep semantic representations is a hard and time-consuming task, we employ a sophisticated bootstrapping approach. This method employs existing language technology tools (for segmentation, part-of-speech tagging, named entity tagging, animacy labelling, syntactic parsing, and semantic processing) to get a reasonable approximation of the target annotations as a starting point. The machine-generated annotations are then refined by information obtained from both expert linguists (using a wiki-like platform) and crowd-sourcing methods (in the form of a ‘Game with a Purpose’) which help us in deciding how to resolve syntactic and semantic ambiguities. The result is a semantic resource that integrates various linguistic phenomena, including predicate-argument structure, scope, tense, thematic roles, rhetorical relations and presuppositions. The semantic formalism that brings all levels of annotation together in one meaning representation is Discourse Representation Theory, which supports meaning representations that can be translated to first-order logic. In contrast to ordinary treebanks, the units of annotation in the GMB are texts, rather than isolated sentences. The current version of the GMB contains more than 10,000 public domain texts aligned with Discourse Representation Structures, and is freely available for research purposes.
Intelligenza Artificiale | 2015
Giuseppe Attardi; Valerio Basile; Cristina Bosco; Tommaso Caselli; Felice Dell'Orletta; Simonetta Montemagni; Viviana Patti; Maria Simi; Rachele Sprugnoli
Shared task evaluation campaigns represent a well-established form of competitive evaluation, an important opportunity to propose and tackle new challenges for a specific research area and a way to foster the development of benchmarks, tools and resources. The advantages of this approach are evident in any experimental field, including the area of Natural Language Processing. An outlook on state-of-the-art language technologies for Italian can be obtained by reflecting on the results of the recently held workshop “Evaluation of NLP and Speech Tools for Italian”, EVALITA 2014. The motivations underlying individual shared tasks, the level of knowledge and development achieved within each of them, the impact on applications, society and economy at large as well as directions for future research will be discussed from this perspective.
knowledge acquisition, modeling and management | 2016
Valerio Basile; Soufian Jebbara; Elena Cabrio; Philipp Cimiano
The paper presents an approach to extract knowledge from large text corpora, in particular knowledge that facilitates object manipulation by embodied intelligent systems that need to act in the world. As a first step, our goal is to extract the prototypical location of given objects from text corpora. We approach this task by calculating relatedness scores for objects and locations using techniques from distributional semantics. We empirically compare different methods for representing locations and objects as vectors in some geometric space, and we evaluate them with respect to a crowd-sourced gold standard in which human subjects had to rate the prototypicality of a location given an object. By applying the proposed framework on DBpedia, we are able to build a knowledge base of 931 high-confidence object-location relations in a fully automatic fashion. The work in this paper is partially funded by the ALOOF project (CHIST-ERA program).
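As a rough illustration of the relatedness scoring described in this abstract, the sketch below ranks candidate locations for an object by cosine similarity between vectors. The three-dimensional vectors, the `VECTORS` table and the `rank_locations` helper are hypothetical placeholders; the paper compares several vector representations, and this is not necessarily how any of them is computed.

```python
import math

# Toy vectors, invented for illustration only (not taken from the paper).
VECTORS = {
    "frying_pan": [0.9, 0.1, 0.0],
    "kitchen": [0.8, 0.2, 0.1],
    "bathroom": [0.1, 0.9, 0.2],
    "garage": [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # similarity is undefined for a zero vector; treat as unrelated
    return dot / (norm_u * norm_v)


def rank_locations(obj, locations, vectors=VECTORS):
    """Return (location, score) pairs sorted from most to least related."""
    scored = [(loc, cosine(vectors[obj], vectors[loc])) for loc in locations]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_locations("frying_pan", ["bathroom", "kitchen", "garage"])
print(ranking[0][0])  # the top-ranked location for these toy vectors
```

With real data, the vectors would come from whichever distributional model is being evaluated, and the ranking would be compared against the crowd-sourced prototypicality ratings.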
international conference on robotics and automation | 2017
Jay Young; Lars Kunze; Valerio Basile; Elena Cabrio; Nick Hawes; Barbara Caputo
Autonomous robots that are to assist humans in their daily lives must recognize and understand the meaning of objects in their environment. However, the open nature of the world means robots must be able to learn and extend their knowledge about previously unknown objects on-line. In this work we investigate the problem of unknown object hypotheses generation, and employ a semantic web-mining framework along with deep-learning-based object detectors. This allows us to make use of both visual and semantic features in combined hypotheses generation. Experiments on data from mobile robots in real world application deployments show that this combination improves performance over the use of either method in isolation.
international semantic web conference | 2016
Valerio Basile; Elena Cabrio; Fabien Gandon
In this paper we present ongoing work on building a repository of knowledge about objects typically found in homes, their usual locations and usage. We extract an RDF knowledge base by automatically reading text on the Web and applying simple inference rules. The obtained common sense object relations are ready to be used in a domestic robotic setting, e.g. “a frying pan is usually located in the kitchen”.
european semantic web conference | 2017
Jay Young; Valerio Basile; Markus Suchi; Lars Kunze; Nick Hawes; Markus Vincze; Barbara Caputo
Intelligent Autonomous Robots deployed in human environments must have an understanding of the wide range of possible semantic identities associated with the spaces they inhabit – kitchens, living rooms, bathrooms, offices, garages, etc. We believe robots should learn this information through their own exploration and situated perception in order to uncover and exploit structure in their environments – structure that may not be apparent to human engineers, or that may emerge over time during a deployment. In this work, we combine semantic web-mining and situated robot perception to develop a system capable of assigning semantic categories to regions of space. This is accomplished by looking at web-mined relationships between room categories and objects identified by a Convolutional Neural Network trained on 1000 categories. Evaluated on real-world data, we show that our system exhibits several conceptual and technical advantages over similar systems, and uncovers semantic structure in the environment overlooked by ground-truth annotators.
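One simple way to picture the combination of web-mined relationships and detected objects is to sum, for each room category, the relatedness scores of the objects seen in a region and pick the best-scoring room. The sketch below assumes exactly that; the `RELATEDNESS` table, its scores and the `classify_region` helper are hypothetical, and the abstract does not state how the actual system aggregates evidence.

```python
from collections import defaultdict

# Hypothetical web-mined relatedness scores between room categories and
# object labels; the values are invented for illustration.
RELATEDNESS = {
    ("kitchen", "frying_pan"): 0.9,
    ("kitchen", "mug"): 0.6,
    ("office", "mug"): 0.5,
    ("office", "keyboard"): 0.9,
    ("bathroom", "towel"): 0.8,
}


def classify_region(detected_objects, relatedness=RELATEDNESS):
    """Return the room category with the highest summed relatedness,
    or None when no detected object is related to any room."""
    totals = defaultdict(float)
    for obj in detected_objects:
        for (room, related_obj), score in relatedness.items():
            if related_obj == obj:
                totals[room] += score
    if not totals:
        return None
    return max(totals, key=totals.get)


# Object labels as they might come from an object detector in one region.
print(classify_region(["mug", "keyboard"]))
```

Summation is only one possible aggregation; weighting by detector confidence or normalising per room would be equally plausible choices.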
data and knowledge engineering | 2014
Gosse Bouma; Valerio Basile; Ashwin Ittoo; Elisabeth Métais; Hans Wortmann
This special issue consists of four revised and extended papers selected from the 17th International Conference on Applications of Natural Language to Information Systems (NLDB 2012), which was organized in June 2012 in Groningen, the Netherlands. The NLDB conferences bring together researchers interested in the use of natural language processing techniques in applications such as database querying, information systems architecture, software development, and more recently, various applications dealing with automatic enrichment of and advanced search in large volumes of web content. Authors of a selected number of contributions to the conference (12 full papers, 24 short papers, and 17 poster presentations) were asked to develop their paper into a journal article. After additional reviewing, 4 papers were selected for this special issue. ‘A semi supervised learning model for mapping sentences to logical forms with ambiguous supervision’ by Minh Le Nguyen and Akira Shimazu proposes a novel method for bootstrapping a semantic parser from a combination of semantically annotated, but not disambiguated, data and unlabeled data. ‘Wikimantic: Toward effective disambiguation and expansion of queries’ by Christopher Boston, Hui Fang, Sandra Carberry, Hao Wu, and Xitong Liu is concerned with microblog (tweet) retrieval, and shows that a query disambiguation and expansion method which uses Wikipedia as knowledge resource improves retrieval performance. ‘Inducing the contextual and prior polarity of nouns from the induced polarity preference of verbs’ by Manfred Klenner and Stefanos Petrakis presents a new method for learning the prior negative or positive sentiment of a noun, based on statistics obtained from a large corpus annotated with syntactic dependency relations. ‘Multidimensional topic analysis in political texts’ by Cäcilia Zirn and Heiner Stuckenschmidt addresses the issue of automatic content analysis in the social sciences. In particular, it shows that a technique for creating topic models based on party manifestos and coalition contracts is able to predict which party was primarily responsible for the text of which ministry. On behalf of the authors of the selected papers, we would like to thank all reviewers for their detailed and constructive comments on the submitted manuscripts.
language resources and evaluation | 2012
Valerio Basile; Johan Bos; Kilian Evang; Noortje Venhuizen
north american chapter of the association for computational linguistics | 2013
Valerio Basile; Malvina Nissim
Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) -- Short Papers | 2013
Noortje Venhuizen; Valerio Basile; Kilian Evang; Johan Bos