Publication


Featured research published by Denis Paperno.


Meeting of the Association for Computational Linguistics | 2014

A practical and linguistically-motivated approach to compositional distributional semantics

Denis Paperno; Marco Baroni

Distributional semantic methods to approximate word meaning with context vectors have been very successful empirically, and the last years have seen a surge of interest in their compositional extension to phrases and sentences. We present here a new model that, like those of Coecke et al. (2010) and Baroni and Zamparelli (2010), closely mimics the standard Montagovian semantic treatment of composition in distributional terms. However, our approach avoids a number of issues that have prevented the application of the earlier linguistically-motivated models to full-fledged, real-life sentences. We test the model on a variety of empirical tasks, showing that it consistently outperforms a set of competitive rivals.
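A minimal sketch of function-application-style composition in the spirit the abstract describes: a functor word (here an adjective) acts on an argument vector through a matrix, in addition to contributing its own vector. The dimensionality, the randomly sampled parameters, and the toy vocabulary are illustrative assumptions, not the paper's trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 4  # toy dimensionality; real models use hundreds of dimensions

# Argument words (e.g. nouns) are plain distributional vectors.
noun = {"dog": rng.normal(size=DIM), "cat": rng.normal(size=DIM)}

# Functor words (e.g. adjectives) carry a vector plus a matrix acting on the
# argument; both would normally be estimated from corpus data, not sampled.
adj_vec = {"old": rng.normal(size=DIM)}
adj_mat = {"old": rng.normal(size=(DIM, DIM))}

def compose(adj: str, n: str) -> np.ndarray:
    """Adjective-noun composition: functor vector plus matrix-transformed argument."""
    return adj_vec[adj] + adj_mat[adj] @ noun[n]

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

old_dog = compose("old", "dog")
old_cat = compose("old", "cat")
print("cos(old dog, old cat) =", round(cosine(old_dog, old_cat), 3))
```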


Meeting of the Association for Computational Linguistics | 2016

The LAMBADA dataset: Word prediction requiring a broad discourse context

Denis Paperno; Germán Kruszewski; Angeliki Lazaridou; Ngoc Quan Pham; Raffaella Bernardi; Sandro Pezzelle; Marco Baroni; Gemma Boleda; Raquel Fernández

We introduce LAMBADA, a dataset to evaluate the capabilities of computational models for text understanding by means of a word prediction task. LAMBADA is a collection of narrative passages sharing the characteristic that human subjects are able to guess their last word if they are exposed to the whole passage, but not if they only see the last sentence preceding the target word. To succeed on LAMBADA, computational models cannot simply rely on local context, but must be able to keep track of information in the broader discourse. We show that LAMBADA exemplifies a wide range of linguistic phenomena, and that none of several state-of-the-art language models reaches accuracy above 1% on this novel benchmark. We thus propose LAMBADA as a challenging test set, meant to encourage the development of new models capable of genuine understanding of broad context in natural language text.
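As a rough illustration of the evaluation protocol described above, the sketch below scores a word-prediction model under two conditions: given the full passage versus only its final sentence. The `predict_next_word` callable, the sentence splitter, and the toy item are placeholders, not the released dataset or a reference implementation.

```python
from typing import Callable, List, Tuple

# Each item pairs a context with its gold target word (the passage's last word).
Item = Tuple[str, str]

def accuracy(items: List[Item], predict_next_word: Callable[[str], str]) -> float:
    """Fraction of items whose target word the model guesses exactly."""
    hits = sum(predict_next_word(ctx) == gold for ctx, gold in items)
    return hits / len(items) if items else 0.0

def last_sentence(text: str) -> str:
    """Crude split: keep only the final sentence of the context."""
    return text.rstrip().split(". ")[-1]

def evaluate(items: List[Item], predict_next_word: Callable[[str], str]) -> None:
    broad = accuracy(items, predict_next_word)
    local = accuracy([(last_sentence(c), g) for c, g in items], predict_next_word)
    print(f"broad-context accuracy: {broad:.2%}, local-context accuracy: {local:.2%}")

if __name__ == "__main__":
    # Trivial placeholder model and a single invented item, for demonstration only.
    toy_items = [("He whistled and the old hound came running. It was his", "dog")]
    evaluate(toy_items, lambda ctx: "dog")
```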


arXiv: Computation and Language | 2016

Human and Machine Judgements for Russian Semantic Relatedness

Alexander Panchenko; Dmitry Ustalov; Nikolay Arefyev; Denis Paperno; Natalia Konstantinova; Natalia V. Loukachevitch; Chris Biemann

Semantic relatedness of terms represents similarity of meaning by a numerical score. On the one hand, humans easily make judgements about semantic relatedness. On the other hand, this kind of information is useful in language processing systems. While semantic relatedness has been extensively studied for English using numerous language resources, such as associative norms, human judgements and datasets generated from lexical databases, no evaluation resources of this kind have been available for Russian to date. Our contribution addresses this problem. We present five language resources of different scale and purpose for Russian semantic relatedness, each being a list of triples \((word_{i}, word_{j}, similarity_{ij})\). Four of them are designed for evaluation of systems for computing semantic relatedness, complementing each other in terms of the semantic relation type they represent. These benchmarks were used to organise a shared task on Russian semantic relatedness, which attracted 19 teams. We use one of the best approaches identified in this competition to generate the fifth high-coverage resource, the first open distributional thesaurus of Russian. Multiple evaluations of this thesaurus, including a large-scale crowdsourcing study involving native speakers, indicate its high accuracy.
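A common way to use triples of this form is to correlate a model's similarity scores with the human ones. The sketch below shows that evaluation loop for a hypothetical vector model using Spearman correlation; the word vectors and triples are invented stand-ins, not the released Russian resources.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical stand-ins for a trained embedding table and a gold triple list.
vectors = {
    "кот": np.array([0.9, 0.1, 0.0]),
    "кошка": np.array([0.8, 0.2, 0.1]),
    "стол": np.array([0.1, 0.9, 0.3]),
}
gold_triples = [("кот", "кошка", 0.95), ("кот", "стол", 0.10)]

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def evaluate(vectors, triples):
    """Spearman correlation between model cosines and human relatedness scores."""
    model_scores = [cosine(vectors[w1], vectors[w2]) for w1, w2, _ in triples]
    human_scores = [s for _, _, s in triples]
    rho, _ = spearmanr(model_scores, human_scores)
    return rho

print("Spearman rho:", evaluate(vectors, gold_triples))
```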


Joint Conference on Lexical and Computational Semantics | 2015

Leveraging Preposition Ambiguity to Assess Compositional Distributional Models of Semantics

Samuel Ritter; Cotie Long; Denis Paperno; Marco Baroni; Matthew Botvinick; Adele E. Goldberg

Complex interactions among the meanings of words are important factors in the function that maps word meanings to phrase meanings. Recently, compositional distributional semantics models (CDSM) have been designed with the goal of emulating these complex interactions; however, experimental results on the effectiveness of CDSM have been difficult to interpret because the current metrics for assessing them do not control for the confound of lexical information. We present a new method for assessing the degree to which CDSM capture semantic interactions that dissociates the influences of lexical and compositional information. We then provide a dataset for performing this type of assessment and use it to evaluate six compositional models using both co-occurrence-based and neural language model input vectors. Results show that neural language input vectors are consistently superior to co-occurrence-based vectors, that several CDSM capture substantial compositional information, and that, surprisingly, vector addition matches and is in many cases superior to purpose-built parameterized models.
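Since vector addition is the surprisingly strong baseline mentioned above, the snippet below sketches how an additive composition function plugs into a phrase-similarity comparison. The toy vectors and the phrase-paraphrase pair are invented for illustration and are not items from the benchmark.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def add_compose(word_vectors, phrase):
    """Additive composition: the phrase vector is the sum of its word vectors."""
    return np.sum([word_vectors[w] for w in phrase.split()], axis=0)

# Invented toy vectors purely for illustration.
wv = {
    "coffee": np.array([0.9, 0.1, 0.2]),
    "with":   np.array([0.1, 0.1, 0.1]),
    "milk":   np.array([0.7, 0.3, 0.1]),
    "latte":  np.array([0.8, 0.2, 0.2]),
}

phrase_vec = add_compose(wv, "coffee with milk")
print("cos(phrase, paraphrase):", round(cosine(phrase_vec, wv["latte"]), 3))
```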


Computational Linguistics | 2016

When the whole is less than the sum of its parts: How composition affects PMI values in distributional semantic vectors

Denis Paperno; Marco Baroni

Distributional semantic models, deriving vector-based word representations from patterns of word usage in corpora, have many useful applications (Turney and Pantel 2010). Recently, there has been interest in compositional distributional models, which derive vectors for phrases from representations of their constituent words (Mitchell and Lapata 2010). Often, the values of distributional vectors are pointwise mutual information (PMI) scores obtained from raw co-occurrence counts. In this article we study the relation between the PMI dimensions of a phrase vector and its components in order to gain insights into which operations an adequate composition model should perform. We show mathematically that the difference between the PMI dimension of a phrase vector and the sum of PMIs in the corresponding dimensions of the phrase's parts is an independently interpretable value, namely, a quantification of the impact of the context associated with the relevant dimension on the phrase's internal cohesion, as also measured by PMI. We then explore this quantity empirically, through an analysis of adjective–noun composition.
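The identity stated in the abstract can be written out explicitly. Writing \(ab\) for the phrase, \(a\) and \(b\) for its parts, and \(c\) for the context word indexing the relevant dimension, a short derivation (consistent with the abstract's statement; the notation here is chosen for exposition) gives:

\[
\begin{aligned}
\operatorname{pmi}(ab,c) - \operatorname{pmi}(a,c) - \operatorname{pmi}(b,c)
  &= \log\frac{p(ab,c)}{p(ab)\,p(c)}
   - \log\frac{p(a,c)}{p(a)\,p(c)}
   - \log\frac{p(b,c)}{p(b)\,p(c)} \\
  &= \log\frac{p(ab\mid c)}{p(a\mid c)\,p(b\mid c)}
   - \log\frac{p(ab)}{p(a)\,p(b)} \\
  &= \operatorname{pmi}(a,b\mid c) - \operatorname{pmi}(a,b),
\end{aligned}
\]

i.e. the difference equals the change that conditioning on the context \(c\) induces in the internal cohesion (PMI) of the phrase's parts.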


Cognitive Psychology | 2014

Corpus-based estimates of word association predict biases in judgment of word co-occurrence likelihood

Denis Paperno; Marco Marelli; Katya Tentori; Marco Baroni

This paper draws a connection between statistical word association measures used in linguistics and confirmation measures from epistemology. Having theoretically established the connection, we replicate, in the new context of the judgments of word co-occurrence, an intriguing finding from the psychology of reasoning, namely that confirmation values affect intuitions about likelihood. We show that the effect, despite being based in this case on very subtle statistical insights about thousands of words, is stable across three different experimental settings. Our theoretical and empirical results suggest that factors affecting traditional reasoning tasks are also at play when linguistic knowledge is probed, and they provide further evidence for the importance of confirmation in a new domain.
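To make the link between word association and confirmation concrete, the snippet below computes, from toy co-occurrence counts, both pointwise mutual information and a simple difference-based confirmation measure, \(p(w_2 \mid w_1) - p(w_2)\). The counts are invented, and this particular confirmation measure is chosen here for illustration rather than taken from the paper.

```python
import math

# Invented toy counts: corpus size, unigram counts, and co-occurrence counts.
N = 1_000_000
count = {"coffee": 500, "cup": 800}
cooc = {("coffee", "cup"): 120}

def p(w):             # unigram probability
    return count[w] / N

def p_joint(w1, w2):  # co-occurrence probability
    return cooc[(w1, w2)] / N

def pmi(w1, w2):
    """Pointwise mutual information, a classic word-association measure."""
    return math.log2(p_joint(w1, w2) / (p(w1) * p(w2)))

def confirmation_d(w1, w2):
    """Difference confirmation measure: how much seeing w1 raises p(w2)."""
    return p_joint(w1, w2) / p(w1) - p(w2)

print("PMI(coffee, cup) =", round(pmi("coffee", "cup"), 2))
print("d(coffee, cup)   =", round(confirmation_d("coffee", "cup"), 4))
```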


Empirical Methods in Natural Language Processing | 2015

Distributional Semantics in Use

Raffaella Bernardi; Gemma Boleda; Raquel Fernández; Denis Paperno

In this position paper we argue that an adequate semantic model must account for language in use, taking into account how discourse context affects the meaning of words and larger linguistic units. Distributional semantic models are very attractive models of meaning mainly because they capture conceptual aspects and are automatically induced from natural language data. However, they need to be extended in order to account for language use in a discourse or dialogue context. We discuss phenomena that the new generation of distributional semantic models should capture, and propose concrete tasks on which they could be tested.


Workshop on Evaluating Vector Space Representations for NLP | 2016

Capturing Discriminative Attributes in a Distributional Space: Task Proposal.

Alicia Krebs; Denis Paperno

If lexical similarity is not enough to reliably assess how word vectors would perform on various specific tasks, we need other ways of evaluating semantic representations. We propose a new task, which consists in extracting semantic differences using distributional models: given two words, what is the difference between their meanings? We present two proof-of-concept datasets for this task and outline how it may be performed.
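One natural baseline for a task of this kind is to read differences directly off interpretable distributional dimensions: contexts that score much higher for one word than for the other are candidate discriminative attributes. The sketch below implements that idea on invented count vectors; the dimension labels and the threshold are illustrative assumptions, not the datasets proposed in the paper.

```python
import numpy as np

# Invented, interpretable context dimensions (columns) and count-based vectors.
contexts = ["fly", "swim", "feathers", "scales", "legs"]
vectors = {
    "sparrow": np.array([9.0, 0.5, 8.0, 0.1, 4.0]),
    "salmon":  np.array([0.2, 9.0, 0.1, 8.0, 0.0]),
}

def discriminative_attributes(w1, w2, threshold=0.2):
    """Contexts whose normalised weight for w1 exceeds that for w2 by `threshold`."""
    v1 = vectors[w1] / np.linalg.norm(vectors[w1])
    v2 = vectors[w2] / np.linalg.norm(vectors[w2])
    return [c for c, a, b in zip(contexts, v1, v2) if a - b > threshold]

print("sparrow vs salmon:", discriminative_attributes("sparrow", "salmon"))
```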


Meeting of the Association for Computational Linguistics | 2016

When Hyperparameters Help: Beneficial Parameter Combinations in Distributional Semantic Models.

Alicia Krebs; Denis Paperno

Distributional semantic models can predict many linguistic phenomena, including word similarity, lexical ambiguity, and semantic priming, and can even pass TOEFL synonymy and analogy tests (Landauer and Dumais, 1997; Griffiths et al., 2007; Turney and Pantel, 2010). But what does it take to create a competitive distributional model? Levy et al. (2015) argue that the key to success lies in hyperparameter tuning rather than in the model’s architecture. More hyperparameters trivially lead to potential performance gains, but what do they actually do to improve the models? Are individual hyperparameters’ contributions independent of each other? Or are only specific parameter combinations beneficial? To answer these questions, we perform a quantitative and qualitative evaluation of major hyperparameters as identified in previous research.
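Two of the hyperparameters highlighted by Levy et al. (2015), context-distribution smoothing and PMI shifting, can be written down compactly. The sketch below applies both to a toy co-occurrence matrix; the matrix and the particular parameter values (alpha = 0.75, shift k = 5) are common illustrative defaults, not the settings evaluated in the paper.

```python
import numpy as np

def shifted_ppmi(counts: np.ndarray, alpha: float = 0.75, k: float = 5.0) -> np.ndarray:
    """Positive PMI with context-distribution smoothing (alpha) and a log-k shift,
    two hyperparameters discussed by Levy et al. (2015)."""
    total = counts.sum()
    p_wc = counts / total
    p_w = counts.sum(axis=1, keepdims=True) / total
    # Smooth the context distribution: raise counts to alpha before normalising.
    ctx = counts.sum(axis=0) ** alpha
    p_c = (ctx / ctx.sum())[np.newaxis, :]
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_wc / (p_w * p_c))
    pmi = np.nan_to_num(pmi, neginf=0.0)  # zero counts contribute nothing
    return np.maximum(pmi - np.log(k), 0.0)

# Toy 3-word x 3-context co-occurrence matrix.
counts = np.array([[10.0, 2.0, 0.0],
                   [3.0, 8.0, 1.0],
                   [0.0, 1.0, 9.0]])
print(shifted_ppmi(counts).round(2))
```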


Computational Linguistics | 2016

There is no logical negation here, but there are alternatives: Modeling conversational negation with distributional semantics

Germán Kruszewski; Denis Paperno; Raffaella Bernardi; Marco Baroni

Logical negation is a challenge for distributional semantics, because predicates and their negations tend to occur in very similar contexts, and consequently their distributional vectors are very similar. Indeed, it is not even clear what properties a “negated” distributional vector should possess. However, when linguistic negation is considered in its actual discourse usage, it often performs a role that is quite different from straightforward logical negation. If someone states, in the middle of a conversation, that “This is not a dog,” the negation strongly suggests a restricted set of alternative predicates that might hold true of the object being talked about. In particular, other canids and middle-sized mammals are plausible alternatives, birds are less likely, skyscrapers and other large buildings virtually impossible. Conversational negation acts like a graded similarity function, of the sort that distributional semantics might be good at capturing. In this article, we introduce a large data set of alternative plausibility ratings for conversationally negated nominal predicates, and we show that simple similarity in distributional semantic space provides an excellent fit to subject data. On the one hand, this fills a gap in the literature on conversational negation, proposing distributional semantics as the right tool to make explicit predictions about potential alternatives of negated predicates. On the other hand, the results suggest that negation, when addressed from a broader pragmatic perspective, far from being a nuisance, is an ideal application domain for distributional semantic methods.
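The modelling idea in the abstract, that plausible alternatives to a conversationally negated predicate are simply its distributional neighbours, can be sketched in a few lines: rank candidate predicates by cosine similarity to the negated noun. The vectors and candidate set below are invented placeholders, not the paper's data or ratings.

```python
import numpy as np

# Invented toy vectors; a real model would use corpus-derived embeddings.
vectors = {
    "dog":        np.array([0.9, 0.8, 0.1, 0.0]),
    "wolf":       np.array([0.8, 0.7, 0.2, 0.1]),
    "cat":        np.array([0.7, 0.6, 0.3, 0.1]),
    "sparrow":    np.array([0.2, 0.3, 0.9, 0.1]),
    "skyscraper": np.array([0.0, 0.1, 0.1, 0.9]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_alternatives(negated: str, candidates):
    """Rank candidate predicates as alternatives to 'not a <negated>'
    by similarity to the negated predicate itself."""
    scores = {c: cosine(vectors[negated], vectors[c]) for c in candidates}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# "This is not a dog" -> wolf and cat should outrank sparrow and skyscraper.
print(rank_alternatives("dog", ["wolf", "cat", "sparrow", "skyscraper"]))
```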

Collaboration


Dive into Denis Paperno's collaborations.

Top Co-Authors

Alicia Krebs

University of Groningen
