Nora Aranberri
University of the Basque Country
Publication
Featured research published by Nora Aranberri.
Language Resources and Evaluation | 2016
Arkaitz Zubiaga; Iñaki San Vicente; Pablo Gamallo; José Ramom Pichel; Iñaki Alegria; Nora Aranberri; Aitzol Ezeiza; Víctor Fresno
Language identification, the task of determining the language in which a given text is written, has progressed substantially in recent decades. However, three main issues remain unresolved: (1) distinguishing similar languages, (2) detecting multilingualism in a single document, and (3) identifying the language of short texts. In this paper, we describe our work on the development of a benchmark to encourage further research in these three directions, set forth an evaluation framework suitable for the task, and make a dataset of annotated tweets publicly available for research purposes. We also describe the shared task we organized to validate and assess the evaluation framework and dataset with systems submitted by seven different participants, and analyze the performance of these systems. The evaluation of the results submitted by the participants of the shared task helped us shed some light on the shortcomings of state-of-the-art language identification systems, and gives insight into the extent to which the brevity, multilingualism, and language similarity found in texts degrade the performance of language identifiers. Our dataset of nearly 35,000 tweets and the evaluation framework provide researchers and practitioners with suitable resources to further study these issues in language identification within a common setting that enables results to be compared with one another.
Language Resources and Evaluation | 2015
Iñaki Alegria; Nora Aranberri; Pere R. Comas; Víctor Fresno; Pablo Gamallo; Lluís Padró; Iñaki San Vicente; Jordi Turmo; Arkaitz Zubiaga
The language used in social media is often characterized by an abundance of informal and non-standard writing. Normalizing this non-standard language can be crucial to facilitate subsequent textual processing and, consequently, to help boost the performance of natural language processing tools applied to social media text. In this paper we present a benchmark for lexical normalization of social media posts, specifically for tweets in Spanish. We describe the tweet normalization challenge we organized recently, analyze the performance achieved by the different systems submitted to the challenge, and delve into the characteristics of the systems to identify the features that were useful. The organization of this challenge has led to the production of a benchmark for lexical normalization of social media, including an evaluation framework as well as an annotated corpus of Spanish tweets, TweetNorm_es, which we make publicly available. The creation of this benchmark and the evaluation have brought to light the types of words that the submitted systems handled best, and point to the main shortcomings to be addressed in future work.
Procesamiento Del Lenguaje Natural | 2014
Arkaitz Zubiaga; Iñaki San Vicente; Pablo Gamallo; José Ramom Pichel Campos; Iñaki Alegría Loinaz; Nora Aranberri; Aitzol Ezeiza; Víctor Fresno-Fernández
Actas del XXIX Congreso de la Sociedad Española para el Procesamiento del Lenguaje Natural | 2013
Lluís Padró; Jorge Turmo Borras; Iñaki Alegria; Nora Aranberri; Víctor Fresno; Pablo Gamallo; Iñaki San Vicente; Arkaitz Zubiaga
Proceedings of the Tweet Translation Workshop (TweetMT) | 2015
Iñaki Alegria; Nora Aranberri; Cristina España-Bonet; Pablo Gamallo; Hugo Gonçalo Oliveira; Eva Martínez Garcia; Iñaki San Vicente; Antonio Toral; Arkaitz Zubiaga
Language Resources and Evaluation | 2016
Arantxa Otegi; Nora Aranberri; António Branco; Jan Hajic; Martin Popel; Kiril Simov; Eneko Agirre; Petya Osenova; Rita Valadas Pereira; João Ricardo Silva; Steven Neale
Proceedings of the 18th Annual Conference of the European Association for Machine Translation | 2015
Nora Aranberri; Gorka Labaka; Arantza Díaz de Ilarraza; Kepa Sarasola
Language Resources and Evaluation | 2016
Iñaki San Vicente; Iñaki Alegria; Cristina España-Bonet; Pablo Gamallo; Hugo Gonçalo Oliveira; Eva Martínez Garcia; Antonio Toral; Arkaitz Zubiaga; Nora Aranberri
Language Resources and Evaluation | 2016
Nora Aranberri; Eleftherios Avramidis; Aljoscha Burchardt; Ondrej Klejch; Martin Popel; Maja Popović
Revista tradumàtica: traducció i tecnologies de la informació i la comunicació | 2014
Nora Aranberri