Germán Kruszewski
University of Trento
Publications
Featured research published by Germán Kruszewski.
Meeting of the Association for Computational Linguistics | 2014
Marco Baroni; Georgiana Dinu; Germán Kruszewski
Context-predicting models (more commonly known as embeddings or neural language models) are the new kids on the distributional semantics block. Despite the buzz surrounding these models, the literature is still lacking a systematic comparison of the predictive models with classic, count-vector-based distributional semantic approaches. In this paper, we perform such an extensive evaluation, on a wide range of lexical semantics tasks and across many parameter settings. The results, to our own surprise, show that the buzz is fully justified, as the context-predicting models obtain a thorough and resounding victory against their count-based counterparts.
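To make the contrast concrete, here is a minimal sketch (a toy illustration, not the paper's experimental setup): count vectors are read directly off a co-occurrence matrix, whereas a context-predicting model such as word2vec would instead learn its vectors by optimizing a prediction objective.

```python
# Count-based distributional vectors from a toy corpus. A predict model
# would replace the counting step with gradient-based context prediction.
import numpy as np

corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric co-occurrence counts within a +/-2 word window.
counts = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    for j in range(max(0, i - 2), min(len(corpus), i + 3)):
        if i != j:
            counts[idx[w], idx[corpus[j]]] += 1

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)

# 'cat' and 'dog' share contexts, so their count vectors are similar.
print(cosine(counts[idx["cat"]], counts[idx["dog"]]))
```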
International Joint Conference on Natural Language Processing | 2015
Germán Kruszewski; Angeliki Lazaridou; Marco Baroni
We introduce C-PHRASE, a distributional semantic model that learns word representations by optimizing context prediction for phrases at all levels in a syntactic tree, from single words to full sentences. C-PHRASE outperforms the state-of-the-art C-BOW model on a variety of lexical tasks. Moreover, since C-PHRASE word vectors are induced through a compositional learning objective (modeling the contexts of words combined into phrases), when they are summed, they produce sentence representations that rival those generated by ad-hoc compositional models.
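The summing step mentioned above is easy to illustrate. The sketch below uses random placeholder vectors in place of trained C-PHRASE embeddings; only the additive composition itself is taken from the abstract.

```python
# Additive composition: a sentence vector is the sum of its word vectors.
import numpy as np

rng = np.random.default_rng(0)
# Placeholder embeddings; in the paper these are induced by context
# prediction over phrases at every node of a syntactic tree.
embeddings = {w: rng.standard_normal(300) for w in
              ["a", "dog", "chased", "the", "cat"]}

def sentence_vector(tokens):
    return np.sum([embeddings[t] for t in tokens], axis=0)

s = sentence_vector(["a", "dog", "chased", "the", "cat"])
print(s.shape)  # (300,)
```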
Meeting of the Association for Computational Linguistics | 2016
Denis Paperno; Germán Kruszewski; Angeliki Lazaridou; Ngoc Quan Pham; Raffaella Bernardi; Sandro Pezzelle; Marco Baroni; Gemma Boleda; Raquel Fernández
We introduce LAMBADA, a dataset to evaluate the capabilities of computational models for text understanding by means of a word prediction task. LAMBADA is a collection of narrative passages sharing the characteristic that human subjects are able to guess their last word if they are exposed to the whole passage, but not if they only see the last sentence preceding the target word. To succeed on LAMBADA, computational models cannot simply rely on local context, but must be able to keep track of information in the broader discourse. We show that LAMBADA exemplifies a wide range of linguistic phenomena, and that none of several state-of-the-art language models reaches accuracy above 1% on this novel benchmark. We thus propose LAMBADA as a challenging test set, meant to encourage the development of new models capable of genuine understanding of broad context in natural language text.
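A hedged sketch of the evaluation protocol implied above: a system must exactly match the passage's final word, and `baseline_predict` below is a deliberately naive local baseline (not a model from the paper) that illustrates why local context alone fails.

```python
# LAMBADA-style scoring: exact-match accuracy on the final word.
from collections import Counter

def baseline_predict(context_tokens):
    # Purely local baseline: guess the most frequent word in the context.
    # Real systems would need to track entities across the whole passage.
    return Counter(context_tokens).most_common(1)[0][0]

def lambada_accuracy(passages, predict):
    # Each passage is a token list whose last element is the target word.
    correct = sum(predict(p[:-1]) == p[-1] for p in passages)
    return correct / len(passages)

print(lambada_accuracy([["he", "called", "out", "to", "anna", ",",
                         "but", "there", "was", "no", "anna"]],
                       baseline_predict))
```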
ACM Multimedia | 2016
Francesco Barbieri; Germán Kruszewski; Francesco Ronzano; Horacio Saggion
Choosing the right emoji to visually complement or condense the meaning of a message has become part of our daily life. Emojis are pictures that are naturally combined with plain text, thus creating a new form of language. These pictures are the same regardless of where we live, but they can be interpreted and used in different ways. In this paper we compare the meaning and the usage of emojis across different languages. Our results suggest that the overall semantics of the subset of emojis we studied is preserved across all the languages we analysed. However, some emojis are interpreted differently from language to language, and this could be related to socio-geographical differences.
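One way such a cross-language comparison could be run (an illustrative assumption, since the abstract does not spell out the method): train one embedding space per language and compare an emoji's nearest neighbours across spaces. All vectors below are random placeholders.

```python
# Compare an emoji's nearest neighbours across per-language vector spaces.
import numpy as np

rng = np.random.default_rng(1)
vocab = ["😂", "love", "funny", "sad"]
spaces = {lang: {w: rng.standard_normal(50) for w in vocab}
          for lang in ["en", "es", "it"]}

def neighbours(space, query, k=2):
    q = space[query]
    sims = {w: q @ v / (np.linalg.norm(q) * np.linalg.norm(v))
            for w, v in space.items() if w != query}
    return sorted(sims, key=sims.get, reverse=True)[:k]

for lang, space in spaces.items():
    print(lang, neighbours(space, "😂"))
```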
North American Chapter of the Association for Computational Linguistics | 2015
Germán Kruszewski; Marco Baroni
We introduce the challenge of detecting semantically compatible words, that is, words that can potentially refer to the same thing (cat and hindrance are compatible, cat and dog are not), arguing for its central role in many semantic tasks. We present a publicly available dataset of human compatibility ratings, and a neural-network model that takes distributional embeddings of words as input and learns alternative embeddings that perform the compatibility detection task quite well.
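A sketch of the kind of model described: a small network that re-embeds two distributional word vectors and scores their compatibility. The architecture below (layer sizes, cosine scoring) is illustrative, not the paper's exact design.

```python
# Learn an alternative embedding space tuned for compatibility detection.
import torch
import torch.nn as nn

class CompatibilityNet(nn.Module):
    def __init__(self, dim_in=300, dim_hidden=100):
        super().__init__()
        # Remap distributional vectors into a compatibility-tuned space.
        self.remap = nn.Sequential(nn.Linear(dim_in, dim_hidden), nn.Tanh())

    def forward(self, u, v):
        a, b = self.remap(u), self.remap(v)
        # Score = cosine similarity in the learned space, squashed to [0, 1].
        return torch.sigmoid(nn.functional.cosine_similarity(a, b, dim=-1))

net = CompatibilityNet()
u, v = torch.randn(1, 300), torch.randn(1, 300)
print(net(u, v))  # trainable against human compatibility ratings
```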
Empirical Methods in Natural Language Processing | 2016
Ngoc-Quan Pham; Germán Kruszewski; Gemma Boleda
Convolutional Neural Networks (CNNs) have been shown to yield very strong results in several Computer Vision tasks. Their application to language has received much less attention, and has mainly focused on static classification tasks, such as sentence classification for Sentiment Analysis or relation extraction. In this work, we study the application of CNNs to language modeling, a dynamic, sequential prediction task that requires models to capture local as well as long-range dependency information. Our contribution is twofold. First, we show that CNNs achieve 11-26% better absolute performance than feed-forward neural language models, demonstrating their potential for language representation even in sequential tasks. Compared to recurrent models, our model outperforms plain RNNs but falls below state-of-the-art LSTM models. Second, we gain some understanding of the behavior of the model, showing that CNNs in language act as feature detectors at a high level of abstraction, as in Computer Vision, and that the model can profitably use information from as far as 16 words before the target.
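A minimal sketch of a convolutional language model in this spirit: causal 1-D convolutions over word embeddings produce next-word logits at every position. Hyperparameters and the single-layer design are assumptions for illustration, not the paper's configuration.

```python
# A single-layer causal CNN language model.
import torch
import torch.nn as nn

class CNNLM(nn.Module):
    def __init__(self, vocab=10000, dim=128, kernel=5):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        # Left-pad so each position only sees preceding words (causal).
        self.pad = nn.ConstantPad1d((kernel - 1, 0), 0.0)
        self.conv = nn.Conv1d(dim, dim, kernel)
        self.out = nn.Linear(dim, vocab)

    def forward(self, tokens):                  # tokens: (batch, seq)
        x = self.emb(tokens).transpose(1, 2)    # (batch, dim, seq)
        h = torch.relu(self.conv(self.pad(x)))  # (batch, dim, seq)
        return self.out(h.transpose(1, 2))      # next-word logits per position

logits = CNNLM()(torch.randint(0, 10000, (2, 12)))
print(logits.shape)  # (2, 12, 10000)
```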
Joint Conference on Lexical and Computational Semantics | 2014
Germán Kruszewski; Marco Baroni
Sometimes modifiers have a strong effect on core aspects of the meaning of the nouns they are attached to: A parrot is a desirable pet, but a dead parrot is, at the very least, a rather unusual household companion. In order to stimulate computational research into the impact of modification on phrase meaning, we collected and made available a large dataset containing subject ratings for a variety of noun phrases and the categories they might belong to. We propose to use compositional distributional semantics to model these data, experimenting with numerous distributional semantic spaces, phrase composition methods and asymmetric similarity measures. Our models capture a statistically significant portion of the data, although much work is still needed before we achieve a full computational account of modification effects.
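A hedged sketch of the modeling recipe: compose a modifier and noun by vector addition, then score category membership with an asymmetric similarity (here a ClarkeDE-style inclusion measure, chosen for illustration; the paper experiments with several). Vectors are random stand-ins for real distributional ones.

```python
# Additive composition plus an asymmetric (directional) similarity.
import numpy as np

rng = np.random.default_rng(2)
# Nonnegative placeholder vectors, as inclusion measures expect.
vec = {w: np.abs(rng.standard_normal(100)) for w in
       ["dead", "parrot", "pet"]}

phrase = vec["dead"] + vec["parrot"]  # compose "dead parrot"

def clarke_de(u, v):
    # ClarkeDE-style measure: how much of u is "included" in v.
    return np.minimum(u, v).sum() / u.sum()

# Asymmetric: membership of the phrase in the category "pet".
print(clarke_de(phrase, vec["pet"]))
```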
Computational Linguistics | 2016
Germán Kruszewski; Denis Paperno; Raffaella Bernardi; Marco Baroni
Logical negation is a challenge for distributional semantics, because predicates and their negations tend to occur in very similar contexts, and consequently their distributional vectors are very similar. Indeed, it is not even clear what properties a “negated” distributional vector should possess. However, when linguistic negation is considered in its actual discourse usage, it often performs a role that is quite different from straightforward logical negation. If someone states, in the middle of a conversation, that “This is not a dog,” the negation strongly suggests a restricted set of alternative predicates that might hold true of the object being talked about. In particular, other canids and middle-sized mammals are plausible alternatives, birds are less likely, skyscrapers and other large buildings virtually impossible. Conversational negation acts like a graded similarity function, of the sort that distributional semantics might be good at capturing. In this article, we introduce a large data set of alternative plausibility ratings for conversationally negated nominal predicates, and we show that simple similarity in distributional semantic space provides an excellent fit to subject data. On the one hand, this fills a gap in the literature on conversational negation, proposing distributional semantics as the right tool to make explicit predictions about potential alternatives of negated predicates. On the other hand, the results suggest that negation, when addressed from a broader pragmatic perspective, far from being a nuisance, is an ideal application domain for distributional semantic methods.
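The core claim lends itself to a very short sketch: rank candidate alternatives to the negated predicate by plain cosine similarity in distributional space. The vectors below are placeholders; with real distributional vectors, wolf and cat should outrank bird, with skyscraper far behind.

```python
# Rank alternatives to a negated predicate by distributional similarity.
import numpy as np

rng = np.random.default_rng(3)
vec = {w: rng.standard_normal(100) for w in
       ["dog", "wolf", "cat", "bird", "skyscraper"]}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# "This is not a dog" -> plausible alternatives are the nearest neighbours.
alts = sorted((w for w in vec if w != "dog"),
              key=lambda w: cosine(vec["dog"], vec[w]), reverse=True)
print(alts)
```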
Meeting of the Association for Computational Linguistics | 2018
Alexis Conneau; Germán Kruszewski; Guillaume Lample; Loïc Barrault; Marco Baroni
arXiv: Learning | 2017
Marco Baroni; Armand Joulin; Allan Jabri; Germán Kruszewski; Angeliki Lazaridou; Klemen Simonic; Tomas Mikolov