Publication


Featured research published by Nora Aranberri.


Language Resources and Evaluation | 2016

TweetLID: a benchmark for tweet language identification

Arkaitz Zubiaga; Iñaki San Vicente; Pablo Gamallo; José Ramom Pichel; Iñaki Alegria; Nora Aranberri; Aitzol Ezeiza; Víctor Fresno

Language identification, the task of determining the language a given text is written in, has progressed substantially in recent decades. However, three main issues remain unresolved: (1) distinction of similar languages, (2) detection of multilingualism in a single document, and (3) identifying the language of short texts. In this paper, we describe our work on the development of a benchmark to encourage further research in these three directions, set forth an evaluation framework suitable for the task, and make a dataset of annotated tweets publicly available for research purposes. We also describe the shared task we organized to validate and assess the evaluation framework and dataset with systems submitted by seven different participants, and analyze the performance of these systems. The evaluation of the results submitted by the participants of the shared task helped us shed some light on the shortcomings of state-of-the-art language identification systems, and gave insight into the extent to which the brevity, multilingualism, and language similarity found in texts degrade the performance of language identifiers. Our dataset of nearly 35,000 tweets and the evaluation framework provide researchers and practitioners with suitable resources to further study the aforementioned issues in language identification within a common setting that enables results to be compared with one another.


Language Resources and Evaluation | 2015

TweetNorm: a benchmark for lexical normalization of Spanish tweets

Iñaki Alegria; Nora Aranberri; Pere R. Comas; Víctor Fresno; Pablo Gamallo; Lluís Padró; Iñaki San Vicente; Jordi Turmo; Arkaitz Zubiaga

The language used in social media is often characterized by an abundance of informal and non-standard writing. The normalization of this non-standard language can be crucial to facilitate subsequent textual processing and consequently help boost the performance of natural language processing tools applied to social media text. In this paper we present a benchmark for lexical normalization of social media posts, specifically for tweets in Spanish. We describe the tweet normalization challenge we organized recently, analyze the performance achieved by the different systems submitted to the challenge, and delve into the characteristics of the systems to identify the features that were useful. The organization of this challenge has led to the production of a benchmark for lexical normalization of social media, including an evaluation framework as well as an annotated corpus of Spanish tweets, TweetNorm_es, which we make publicly available. The creation of this benchmark and the evaluation have brought to light the types of words that the submitted systems did best with, and point to the main shortcomings to be addressed in future work.


Procesamiento Del Lenguaje Natural | 2014

Overview of TweetLID: Tweet Language Identification at SEPLN 2014.

Arkaitz Zubiaga; Iñaki San Vicente; Pablo Gamallo; José Ramom Pichel Campos; Iñaki Alegría Loinaz; Nora Aranberri; Aitzol Ezeiza; Víctor Fresno-Fernández


Actas del XXIX Congreso de la Sociedad Española para el Procesamiento del Lenguaje Natural | 2013

Introducción a la tarea compartida Tweet-Norm 2013: Normalización léxica de tuits en español

Lluís Padró; Jorge Turmo Borras; Iñaki Alegria; Nora Aranberri; Víctor Fresno; Pablo Gamallo; Iñaki San Vicente; Arkaitz Zubiaga


Proceedings of the Tweet Translation Workshop (TweetMT) | 2015

Overview of TweetMT: a shared task on machine translation of tweets at SEPLN 2015

Iñaki Alegria; Nora Aranberri; Cristina España-Bonet; Pablo Gamallo; Hugo Gonçalo Oliveira; Eva Martínez Garcia; Iñaki San Vicente; Antonio Toral; Arkaitz Zubiaga


Language Resources and Evaluation | 2016

QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages.

Arantxa Otegi; Nora Aranberri; António Branco; Jan Hajic; Martin Popel; Kiril Simov; Eneko Agirre; Petya Osenova; Rita Valadas Pereira; João Ricardo Silva; Steven Neale


Proceedings of the 18th Annual Conference of the European Association for Machine Translation | 2015

Exploiting portability to build an RBMT prototype for a new source language

Nora Aranberri; Gorka Labaka; Arantza Díaz de Ilarraza; Kepa Sarasola


Language Resources and Evaluation | 2016

TweetMT: a parallel microblog corpus

Iñaki San Vicente; Iñaki Alegria; Cristina España-Bonet; Pablo Gamallo; Hugo Gonçalo Oliveira; Eva Martínez Garcia; Antonio Toral; Arkaitz Zubiaga; Nora Aranberri


Language Resources and Evaluation | 2016

Tools and Guidelines for Principled Machine Translation Development.

Nora Aranberri; Eleftherios Avramidis; Aljoscha Burchardt; Ondrej Klejch; Martin Popel; Maja Popović


Revista tradumàtica: traducció i tecnologies de la informació i la comunicació | 2014

Posedición, productividad y calidad

Nora Aranberri

Collaboration


Dive into Nora Aranberri's collaborations.

Top Co-Authors

Iñaki Alegria

University of the Basque Country

Pablo Gamallo

University of Santiago de Compostela

Arantza Díaz de Ilarraza

University of the Basque Country

Eneko Agirre

University of the Basque Country

Gorka Labaka

University of the Basque Country

Víctor Fresno

National University of Distance Education

Kepa Sarasola

University of the Basque Country

Lluís Padró

Polytechnic University of Catalonia

Aitzol Ezeiza

University of the Basque Country
