Publication


Featured research published by Héctor Martínez Alonso.


International Joint Conference on Natural Language Processing | 2015

Inverted indexing for cross-lingual NLP

Anders Søgaard; Željko Agić; Héctor Martínez Alonso; Barbara Plank; Bernd Bohnet; Anders Johannsen

We present a novel, count-based approach to obtaining inter-lingual word representations based on inverted indexing of Wikipedia. We present experiments applying these representations to 17 datasets in document classification, POS tagging, dependency parsing, and word alignment. Our approach has the advantage that it is simple, computationally efficient and almost parameter-free, and, more importantly, it enables multi-source cross-lingual learning. In 14/17 cases, we improve over using state-of-the-art bilingual embeddings.
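
The abstract describes the representation only at a high level; the toy Python sketch below illustrates the inverted-indexing idea under our own assumptions (interlanguage-linked articles as shared concepts, words represented by the concepts whose text contains them). Names and data are illustrative, not the paper's code.

    from collections import defaultdict

    # Toy stand-in for interlanguage-linked Wikipedia articles: each concept is
    # shared across languages, so it can serve as an interlingual dimension.
    aligned_articles = {
        "Copenhagen": {"en": "copenhagen is the capital of denmark",
                       "da": "koebenhavn er danmarks hovedstad"},
        "Parsing": {"en": "parsing assigns syntactic structure to sentences",
                    "da": "parsing tildeler saetninger syntaktisk struktur"},
    }

    def inverted_index_vectors(articles):
        """Represent each (language, word) pair by the set of concepts containing it."""
        word_to_concepts = defaultdict(set)
        for concept, versions in articles.items():
            for lang, text in versions.items():
                for word in text.split():
                    word_to_concepts[(lang, word)].add(concept)
        return word_to_concepts

    vectors = inverted_index_vectors(aligned_articles)
    # Translation-equivalent words end up with overlapping, hence comparable, representations:
    print(vectors[("en", "copenhagen")])   # {'Copenhagen'}
    print(vectors[("da", "koebenhavn")])   # {'Copenhagen'}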


Joint Conference on Lexical and Computational Semantics | 2014

More or less supervised supersense tagging of Twitter

Anders Johannsen; Dirk Hovy; Héctor Martínez Alonso; Barbara Plank; Anders Søgaard

We present two Twitter datasets annotated with coarse-grained word senses (supersenses), as well as a series of experiments with three learning scenarios for supersense tagging: weakly supervised learning, as well as unsupervised and supervised domain adaptation. We show that (a) off-the-shelf tools perform poorly on Twitter, (b) models augmented with embeddings learned from Twitter data perform much better, and (c) errors can be reduced using type-constrained inference with distant supervision from WordNet.
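
As a rough illustration of the type-constrained inference with distant supervision from WordNet mentioned above, the sketch below masks a tagger's scores so that a word can only receive supersenses that a lexicon licenses for it; the lexicon, scores, and function names are our own toy assumptions.

    # Toy lexicon standing in for WordNet's supersense inventory per word.
    wordnet_supersenses = {
        "bank": {"noun.group", "noun.object"},
        "run": {"verb.motion", "verb.social"},
    }

    def constrained_tag(word, scores, lexicon, default="noun.Tops"):
        """Pick the highest-scoring supersense among those the lexicon allows for the word."""
        allowed = lexicon.get(word)
        if not allowed:  # out-of-lexicon word: fall back to the unconstrained argmax
            return max(scores, key=scores.get) if scores else default
        candidates = {s: v for s, v in scores.items() if s in allowed}
        return max(candidates, key=candidates.get) if candidates else default

    model_scores = {"noun.object": 0.2, "verb.motion": 0.7, "noun.group": 0.4}
    print(constrained_tag("bank", model_scores, wordnet_supersenses))  # noun.group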


Conference on Computational Natural Language Learning | 2014

What's in a p-value in NLP?

Anders Søgaard; Anders Johannsen; Barbara Plank; Dirk Hovy; Héctor Martínez Alonso

In NLP, we need to document that our proposed methods perform significantly better with respect to standard metrics than previous approaches, typically by reporting p-values obtained by rank- or randomization-based tests. We show that significance results following current research standards are unreliable and, in addition, very sensitive to sample size, covariates such as sentence length, as well as to the existence of multiple metrics. We estimate that under the assumption of perfect metrics and unbiased data, we need a significance cut-off of approximately 0.0025 to reduce the risk of false positive results to below 5%. Since in practice we often have considerable selection bias and poor metrics, however, this alone will not suffice.
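
The cut-off above is the paper's estimate; purely as an illustration of why such a strict threshold can be needed, the snippet below runs a standard multiple-comparisons calculation under an assumed number of implicit tests (k = 20 is our assumption, not a figure from the paper).

    # If one comparison is effectively tested k times (several metrics, subsets,
    # or covariate slices), the chance of at least one spurious significant
    # result under the null hypothesis is 1 - (1 - alpha)^k.
    def family_wise_error(alpha, k):
        return 1 - (1 - alpha) ** k

    print(family_wise_error(0.05, 20))     # ~0.64: far above a nominal 5% risk
    print(family_wise_error(0.0025, 20))   # ~0.05: a much stricter cut-off restores it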


Empirical Methods in Natural Language Processing | 2015

Any-language frame-semantic parsing

Anders Johannsen; Héctor Martínez Alonso; Anders Søgaard

We present a multilingual corpus of Wikipedia and Twitter texts annotated with FrameNet 1.5 semantic frames in nine different languages, as well as a novel technique for weakly supervised cross-lingual frame-semantic parsing. Our approach only assumes the existence of linked, comparable source and target language corpora (e.g., Wikipedia) and a bilingual dictionary (e.g., Wiktionary or BabelNet). Our approach uses a truly interlingual representation, enabling us to use the same model across all nine languages. We present average error reductions over running a state-of-the-art parser on word-to-word translations of 46% for target identification, 37% for frame identification, and 14% for argument identification.


North American Chapter of the Association for Computational Linguistics | 2015

Mining for unambiguous instances to adapt part-of-speech taggers to new domains

Dirk Hovy; Barbara Plank; Héctor Martínez Alonso; Anders Søgaard

We present a simple, yet effective approach to adapt part-of-speech (POS) taggers to new domains. Our approach only requires a dictionary and large amounts of unlabeled target data. The idea is to use the dictionary to mine the unlabeled target data for unambiguous word sequences, thus effectively collecting labeled target data. We add the mined instances to available labeled newswire data to train a POS tagger for the target domain. The induced models significantly improve tagging accuracy on held-out test sets across three domains (Twitter, spoken language, and search queries). We also present results for Dutch, Spanish and Portuguese Twitter data, and provide two novel manually-annotated test sets.
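
The mining step lends itself to a short sketch: use a tag dictionary to find target-domain token runs in which every token has exactly one possible tag, and keep those runs as automatically labeled training data. The dictionary and tweet below are toy stand-ins, not the paper's resources.

    tag_dictionary = {
        "we": {"PRON"}, "know": {"VERB"}, "the": {"DET"},
        "answer": {"NOUN", "VERB"}, "now": {"ADV"},
    }

    def mine_unambiguous_spans(tokens, dictionary, min_len=2):
        """Return (word, tag) runs in which every in-dictionary token is unambiguous."""
        spans, current = [], []
        for tok in tokens + [None]:  # sentinel flushes the final run
            entry = dictionary.get(tok) if tok else None
            if entry and len(entry) == 1:
                current.append((tok, next(iter(entry))))
            else:
                if len(current) >= min_len:
                    spans.append(current)
                current = []
        return spans

    tweet = "we know the answer now".split()
    print(mine_unambiguous_spans(tweet, tag_dictionary))
    # [[('we', 'PRON'), ('know', 'VERB'), ('the', 'DET')]]; 'answer' is ambiguous and breaks the run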


International Conference on Computational Linguistics | 2014

Copenhagen-Malmö: Tree Approximations of Semantic Parsing Problems

Natalie Schluter; Anders Søgaard; Jakob Elming; Dirk Hovy; Barbara Plank; Héctor Martínez Alonso; Anders Johannsen; Sigrid Klerke

In this shared task paper for SemEval-2014 Task 8, we show that most semantic structures can be approximated by trees through a series of almost bijective graph transformations. We transform input graphs, apply off-the-shelf methods from syntactic parsing on the resulting trees, and retrieve output graphs. Using tree approximations, we obtain good results across three semantic formalisms, with a 15.9% error reduction over a state-of-the-art semantic role labeling system on development data. Our system ranked 3rd of 6 in the shared task's closed track.
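
The transformations themselves are specific to the paper; as a loose illustration of what an "almost bijective" graph-to-tree step can look like, the sketch below keeps one head per node (so a tree parser can be applied) and sets re-entrant edges aside so the graph can be restored afterwards. This simplification is ours, not the system's actual transformation.

    def graph_to_tree(edges):
        """edges: list of (head, dependent, label). Keep only the first head per dependent."""
        tree, removed, seen = [], [], set()
        for head, dep, label in edges:
            if dep in seen:
                removed.append((head, dep, label))  # extra incoming edge set aside
            else:
                seen.add(dep)
                tree.append((head, dep, label))
        return tree, removed

    def tree_to_graph(tree, removed):
        """Inverse step: reinsert the set-aside edges after tree parsing."""
        return tree + removed

    graph = [(0, 1, "pred"), (1, 2, "arg1"), (3, 2, "arg2")]  # node 2 has two heads
    tree, removed = graph_to_tree(graph)
    assert sorted(tree_to_graph(tree, removed)) == sorted(graph)  # round-trip recovers the graph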


North American Chapter of the Association for Computational Linguistics | 2015

Learning to parse with IAA-weighted loss

Héctor Martínez Alonso; Barbara Plank; Arne Skjærholt; Anders Søgaard

Natural language processing (NLP) annotation projects employ guidelines to maximize inter-annotator agreement (IAA), and models are estimated assuming that there is one single ground truth. However, not all disagreement is noise, and in fact some of it may contain valuable linguistic information. We integrate such information in the training of a cost-sensitive dependency parser. We introduce five different factorizations of IAA and the corresponding loss functions, and evaluate these across six different languages. We obtain robust improvements across the board using a factorization that considers dependency labels and directionality. The best method-dataset combination reaches an average overall error reduction of 6.4% in labeled attachment score.
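
As a sketch of what a cost-sensitive, IAA-weighted loss in this spirit can look like, the snippet below weights each attachment error by an agreement score factorized over dependency label and direction; the agreement table and default weight are invented for illustration and differ from the paper's factorizations.

    # Higher weight = annotators agree more on this kind of decision, so an error costs more.
    iaa_weight = {
        ("nsubj", "left"): 0.95, ("obj", "right"): 0.90,
        ("advmod", "left"): 0.70, ("advmod", "right"): 0.70,
    }

    def weighted_attachment_loss(gold, predicted, weights, default=0.8):
        """gold/predicted map dependent position -> (head position, label)."""
        loss = 0.0
        for dep, (g_head, g_label) in gold.items():
            if predicted.get(dep) != (g_head, g_label):
                direction = "left" if g_head < dep else "right"
                loss += weights.get((g_label, direction), default)
        return loss

    gold = {1: (2, "nsubj"), 3: (2, "advmod")}
    pred = {1: (2, "nsubj"), 3: (4, "advmod")}
    print(weighted_attachment_loss(gold, pred, iaa_weight))  # 0.7: a low-agreement error costs less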


Linguistic Annotation Workshop | 2015

Non-canonical language is not harder to annotate than canonical language

Barbara Plank; Héctor Martínez Alonso; Anders Søgaard

As researchers developing robust NLP for a wide range of text types, we are often confronted with the prejudice that annotation of non-canonical language (whatever that means) is somehow more arbitrary than annotation of canonical language. To investigate this, we present a small annotation study where annotators were asked, with minimal guidelines, to identify main predicates and arguments in sentences across five different domains, ranging from newswire to Twitter. Our study indicates that (at least such) annotation of non-canonical language is not harder. However, we also observe that agreement in social media domains correlates less with model confidence, suggesting that annotators may disagree for different reasons when annotating social media data.


Conference on Computational Natural Language Learning | 2015

Do dependency parsing metrics correlate with human judgments?

Barbara Plank; Héctor Martínez Alonso; Željko Agić; Danijela Merkler; Anders Søgaard

Using automatic measures such as labeled and unlabeled attachment scores is common practice in dependency parser evaluation. In this paper, we examine whether these measures correlate with human judgments of overall parse quality. We ask linguists with experience in dependency annotation to judge system outputs. We measure the correlation between their judgments and a range of parse evaluation metrics across five languages. The human-metric correlation is lower for dependency parsing than for other NLP tasks. Also, inter-annotator agreement is sometimes higher than the agreement between judgments and metrics, indicating that the standard metrics fail to capture certain aspects of parse quality, such as the relevance of root attachment or the relative importance of the different parts of speech.
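
The core measurement is a rank correlation between human judgments and an automatic metric; a minimal version of that comparison, with made-up numbers, looks as follows (using scipy's spearmanr).

    from scipy.stats import spearmanr

    human_judgments = [4, 2, 5, 3, 1]                     # e.g. 1-5 quality ratings of parses
    labeled_attachment = [0.81, 0.74, 0.79, 0.88, 0.70]   # LAS of the same outputs

    rho, pval = spearmanr(human_judgments, labeled_attachment)
    print(f"Spearman rho = {rho:.2f} (p = {pval:.3f})")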


North American Chapter of the Association for Computational Linguistics | 2015

CPH: Sentiment analysis of Figurative Language on Twitter #easypeasy #not

Sarah McGillion; Héctor Martínez Alonso; Barbara Plank

This paper describes the details of our system submitted to the SemEval 2015 shared task on sentiment analysis of figurative language on Twitter. We tackle the problem as a regression task and combine several base systems using stacked generalization (Wolpert, 1992). An initial analysis revealed that the data is heavily biased, and a general sentiment analysis system (GSA) performs poorly on it. However, GSA proved helpful on the test data, which contains an estimated 25% non-figurative tweets. Our best system, a stacking system with backoff to GSA, ranked 4th on the final test data (Cosine 0.661, MSE 3.404).
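
Stacked generalization combines base predictors through a meta-learner trained on their out-of-fold predictions. A minimal sketch of that setup for a regression task, with placeholder features and models rather than the submitted system's components, could look like this:

    import numpy as np
    from sklearn.ensemble import StackingRegressor
    from sklearn.linear_model import Ridge
    from sklearn.svm import SVR

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))                                # stand-in tweet features
    y = X[:, 0] * 2 - X[:, 1] + rng.normal(scale=0.1, size=200)   # stand-in sentiment scores

    stack = StackingRegressor(
        estimators=[("ridge", Ridge()), ("svr", SVR())],
        final_estimator=Ridge(),   # level-1 combiner over the base systems' predictions
        cv=5,                      # out-of-fold predictions keep the meta-learner honest
    )
    stack.fit(X, y)
    print(stack.predict(X[:3]))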

Collaboration


Dive into Héctor Martínez Alonso's collaborations.

Top Co-Authors

Barbara Plank, University of Copenhagen
Sigrid Klerke, University of Copenhagen
Sussi Olsen, University of Copenhagen
Núria Bel, Pompeu Fabra University
Sanni Nimb, University of Copenhagen
Dirk Hovy, University of Southern California