Larraitz Uria
University of the Basque Country
Publications
Featured research published by Larraitz Uria.
north american chapter of the association for computational linguistics | 2015
Eneko Agirre; Carmen Banea; Claire Cardie; Daniel M. Cer; Mona T. Diab; Aitor Gonzalez-Agirre; Weiwei Guo; Iñigo Lopez-Gazpio; Montse Maritxalar; Rada Mihalcea; German Rigau; Larraitz Uria; Janyce Wiebe
In semantic textual similarity (STS), systems rate the degree of semantic equivalence between two text snippets. This year, the participants were challenged with new datasets in English and Spanish. The annotations for both subtasks leveraged crowdsourcing. The English subtask attracted 29 teams with 74 system runs, and the Spanish subtask attracted 7 teams with 16 system runs. In addition, this year we ran a pilot task on interpretable STS, where the systems needed to add an explanatory layer; that is, they had to align the chunks in the sentence pair, explicitly annotating the kind of relation and the score of each chunk pair. The train and test data were manually annotated by an expert, and included headline and image sentence pairs from previous years. Seven teams participated with 29 runs.
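The interpretable-STS pilot described above can be illustrated with a minimal sketch. The relation labels and the 0-5 chunk score follow the task description; the class and function names here are invented for illustration, not the official annotation format.

```python
from dataclasses import dataclass

@dataclass
class ChunkAlignment:
    chunk1: str      # chunk from the first sentence
    chunk2: str      # aligned chunk from the second sentence
    relation: str    # relation type, e.g. "EQUI", "SPE1", "SIMI", "NOALI"
    score: float     # graded similarity of the chunk pair, 0 to 5

def sentence_score(alignments):
    """Average the aligned chunk-pair scores as a crude sentence-level score."""
    scored = [a.score for a in alignments if a.relation != "NOALI"]
    return sum(scored) / len(scored) if scored else 0.0

pair = [
    ChunkAlignment("a man", "a person", "SPE1", 4.0),
    ChunkAlignment("is playing guitar", "is playing a guitar", "EQUI", 5.0),
]
```

Here `sentence_score(pair)` would give 4.5; real iSTS systems score sentence pairs independently of the chunk layer, so this aggregation is only a toy.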
intelligent tutoring systems | 2006
Itziar Aldabe; Maddalen Lopez de Lacalle; Montse Maritxalar; Edurne Martinez; Larraitz Uria
Knowledge construction is expensive for Computer Assisted Assessment. When setting exercise questions, teachers use Test Makers to construct Question Banks. Adding Automatic Generation to assessment applications decreases the time spent constructing examination papers. In this article, we present ArikIturri, an Automatic Question Generator for Basque language test questions, which is independent of the test assessment application that uses it. The information source for this question generator consists of linguistically analysed real corpora, represented in the XML mark-up language. ArikIturri makes use of NLP tools; the article highlights the influence of the robustness of those tools and of the corpora used. We have demonstrated the viability of ArikIturri for constructing fill-in-the-blank, word formation, multiple choice, and error correction question types. In the evaluation of this automatic generator, we obtained positive results regarding both the generation process and its usefulness.
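As a rough sketch of the fill-in-the-blank question type mentioned above: given a sentence whose tokens carry tags from a linguistic analyser, blank out a token of the target category and offer same-category distractors as choices. This is an illustrative toy, not ArikIturri's actual pipeline, and the function and tag names are assumptions.

```python
import random

def make_blank_question(tagged_sentence, target_pos, distractors, seed=0):
    """tagged_sentence: list of (token, pos) pairs from an analysed corpus."""
    rng = random.Random(seed)
    candidates = [i for i, (_, pos) in enumerate(tagged_sentence) if pos == target_pos]
    i = rng.choice(candidates)                      # pick one token to blank out
    answer = tagged_sentence[i][0]
    stem = " ".join(tok if j != i else "____"
                    for j, (tok, _) in enumerate(tagged_sentence))
    choices = sorted([answer] + distractors)        # shuffle-free for clarity
    return stem, choices, answer

# "etxea handia da" (Basque: "the house is big"), blanking the adjective
sentence = [("etxea", "NOUN"), ("handia", "ADJ"), ("da", "VERB")]
stem, choices, answer = make_blank_question(sentence, "ADJ", ["txikia", "zaharra"])
```

For this input the stem is "etxea ____ da" with "handia" as the key and the distractors as foils.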
conference on intelligent text processing and computational linguistics | 2004
Itziar Aduriz; Maxux J. Aranzabe; Jose Maria Arriola; Arantza Díaz de Ilarraza; Koldo Gojenola; Maite Oronoz; Larraitz Uria
This article presents a robust syntactic analyser for Basque and the different modules it contains. Each module is structured in analysis layers, each of which takes the information provided by the previous layer as its input, thus creating a gradually deeper syntactic analysis in cascade. This analysis is carried out using the Constraint Grammar (CG) formalism. Moreover, the article describes the standardisation process of the parsing formats using XML.
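The analysis-in-cascade idea can be sketched as a pipeline where each layer consumes the previous layer's output and adds a deeper level of annotation. The layer functions below are invented placeholders, not the analyser's real CG rules.

```python
def morphological_layer(tokens):
    """First layer: wrap raw tokens in analysis records."""
    return [{"form": t} for t in tokens]

def disambiguation_layer(analyses):
    """Second layer: add a (toy) POS decision to each record."""
    for a in analyses:
        a["pos"] = "VERB" if a["form"].endswith("tu") else "NOUN"
    return analyses

def syntactic_layer(analyses):
    """Third layer: add a (toy) CG-style syntactic function tag."""
    for a in analyses:
        a["func"] = "@PRED" if a["pos"] == "VERB" else "@SUBJ"
    return analyses

def analyse(tokens, layers):
    out = tokens
    for layer in layers:        # each layer builds on the previous one
        out = layer(out)
    return out

result = analyse(["gizonak", "hartu"],
                 [morphological_layer, disambiguation_layer, syntactic_layer])
```

The point is structural: removing or reordering a layer breaks the ones after it, which is exactly the cascade dependency the article describes.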
north american chapter of the association for computational linguistics | 2016
Eneko Agirre; Aitor Gonzalez-Agirre; Iñigo Lopez-Gazpio; Montse Maritxalar; German Rigau; Larraitz Uria
Paper presented at the 10th International Workshop on Semantic Evaluation (SemEval-2016), held on 16-17 June 2016 in San Diego, California.
Knowledge Based Systems | 2017
Iñigo Lopez-Gazpio; Montse Maritxalar; Aitor Gonzalez-Agirre; German Rigau; Larraitz Uria; Eneko Agirre
Highlights: We address interpretability, the ability of machines to explain their reasoning. We formalize it for textual similarity as graded, typed alignment between two sentences. We release an annotated dataset and build and evaluate a high-performance system. We show that the output of the system can be used to produce explanations. Two user studies show preliminary evidence that explanations help humans perform better.
User acceptance of artificial intelligence agents might depend on their ability to explain their reasoning to the users. We focus on a specific text processing task, the Semantic Textual Similarity task (STS), where systems need to measure the degree of semantic equivalence between two sentences. We propose to add an interpretability layer (iSTS for short) formalized as the alignment between pairs of segments across the two sentences, where the relation between the segments is labeled with a relation type and a similarity score. This way, a system performing STS could use the interpretability layer to explain to users why it returned that specific score for the given sentence pair. We present a publicly available dataset of sentence pairs annotated following the formalization. We then develop an iSTS system trained on this dataset, which, given a sentence pair, finds what is similar and what is different, in the form of graded and typed segment alignments. When evaluated on the dataset, the system performs better than an informed baseline, showing that the dataset and task are well-defined and feasible. Most importantly, two user studies show how the iSTS system output can be used to automatically produce explanations in natural language. Users performed the two tasks better when they had access to the explanations, providing preliminary evidence that our dataset and method to automatically produce explanations do help users understand the output of STS systems better.
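A toy renderer makes the explanation step concrete: a typed, scored alignment is turned into natural-language sentences via templates. The templates and relation labels here are invented for illustration; the paper's actual explanation generation may differ.

```python
# One template per (assumed) relation type; unknown types get a fallback.
TEMPLATES = {
    "EQUI": '"{a}" and "{b}" mean the same',
    "SPE1": '"{a}" is more specific than "{b}"',
    "SIMI": '"{a}" and "{b}" are similar but not equivalent',
}

def explain(alignments):
    """alignments: iterable of (segment1, segment2, relation, score) tuples."""
    lines = []
    for a, b, rel, score in alignments:
        tmpl = TEMPLATES.get(rel, '"{a}" and "{b}" are related')
        lines.append(tmpl.format(a=a, b=b) + f" (score {score})")
    return "\n".join(lines)

text = explain([("a man", "a person", "SPE1", 4.0)])
```

For that input the renderer emits '"a man" is more specific than "a person" (score 4.0)', the kind of output the user studies exposed to participants.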
international conference on computational linguistics | 2009
Larraitz Uria; Ainara Estarrona; Izaskun Aldezabal; María Jesús Aranzabe; Arantza Díaz de Ilarraza; Mikel Iruskieta
The aim of this work is to evaluate the dependency-based annotation of EPEC (the Reference Corpus for the Processing of Basque) by means of an experiment: two annotators syntactically tagged a sample of this corpus in order to evaluate the agreement rate between them and to identify issues to be improved in the syntactic annotation process. In this article we present the quantitative and qualitative results of this evaluation.
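Cohen's kappa is one standard way to quantify agreement between two annotators; the article does not state which metric was used, so this sketch (with made-up dependency labels) is purely illustrative.

```python
from collections import Counter

def cohens_kappa(ann1, ann2):
    """Chance-corrected agreement between two label sequences of equal length."""
    n = len(ann1)
    observed = sum(a == b for a, b in zip(ann1, ann2)) / n
    c1, c2 = Counter(ann1), Counter(ann2)
    # Expected agreement if each annotator labelled independently at random
    # according to their own label distribution.
    expected = sum(c1[lab] * c2[lab] for lab in set(c1) | set(c2)) / (n * n)
    return (observed - expected) / (1 - expected)

a1 = ["ncsubj", "ncobj", "ncsubj", "ncmod"]
a2 = ["ncsubj", "ncobj", "ncmod", "ncmod"]
kappa = cohens_kappa(a1, a2)
```

Here observed agreement is 0.75 but kappa is about 0.64, showing how chance correction lowers the raw rate.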
north american chapter of the association for computational linguistics | 2015
Eneko Agirre; Aitor Gonzalez-Agirre; Iñigo Lopez-Gazpio; Montse Maritxalar; German Rigau; Larraitz Uria
In Semantic Textual Similarity, systems rate the degree of semantic equivalence on a graded scale from 0 to 5, with 5 being the most similar. For the English subtask, we present a system which relies on several resources for token-to-token and phrase-to-phrase similarity to build a data-structure which holds all the information, and then combine the information to get a similarity score. We also participated in the pilot on Interpretable STS, where we apply a pipeline which first aligns tokens, then chunks, and finally uses supervised systems to label and score each chunk alignment.
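To make the 0-5 scale concrete, here is a deliberately naive token-overlap baseline (Jaccard overlap rescaled to the task's range). The actual system combines many token-level and phrase-level similarity resources; this sketch is only the kind of informed baseline such systems are compared against.

```python
def sts_baseline(sent1, sent2):
    """Score a sentence pair on the 0-5 STS scale via rescaled token overlap."""
    t1 = set(sent1.lower().split())
    t2 = set(sent2.lower().split())
    jaccard = len(t1 & t2) / len(t1 | t2)   # 0 = disjoint, 1 = identical sets
    return round(5 * jaccard, 2)
```

Identical sentences score 5.0 and fully disjoint ones 0.0, but paraphrases with little lexical overlap are badly underscored, which is why richer phrase-level resources are needed.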
International Journal of Computer-Assisted Language Learning and Teaching (IJCALLT) | 2014
Larraitz Uria; Montse Maritxalar; Igone Zabala
This article presents an environment developed for Learner Corpus Research and Error Analysis which makes it possible to deal with language errors from different points of view and with several aims. In the field of Intelligent Computer Assisted Language Learning (ICALL), our objective is to gain a better understanding of the language learning process. In the field of Natural Language Processing (NLP), we work on the development of applications that will help both language learners and teachers in their learning/teaching processes. Using this environment, several studies and experiments on error analysis have been carried out, and thanks to an in-depth study of determiner-related errors in Basque, some contributions have been made in the above-mentioned fields of research.
sighum workshop on language technology for cultural heritage social sciences and humanities | 2016
Izaskun Etxeberria; Iñaki Alegria; Larraitz Uria; Mans Hulden
This paper presents a proposal for the normalization of word-forms in historical texts. To perform this task, we extend our previous research on induction of phonology and adapt it to the task of normalization. In particular, we combine our earlier models with models for learning morphology (without additional supervision). The results are mixed: induction of the segmentation of morphemes fails to directly offer significant improvements, while including known morpheme boundaries from standard texts does improve results.
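The normalization task itself can be sketched as applying induced character-level correspondences between historical and modern spellings. The paper learns such correspondences from data; the replacement rules below are hand-picked examples of plausible historical-to-modern Basque patterns, not the induced model.

```python
# Hypothetical induced spelling correspondences: historical -> standard.
RULES = [("qu", "k"), ("v", "b"), ("ç", "z")]

def normalise(word_form):
    """Map a historical word-form toward its standard spelling."""
    for old, new in RULES:          # apply each learned correspondence
        word_form = word_form.replace(old, new)
    return word_form
```

For example, `normalise("verri")` yields "berri" and `normalise("çaldia")` yields "zaldia"; a real system would rank competing candidate normalizations rather than apply rules unconditionally.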
Procesamiento Del Lenguaje Natural | 2009
Larraitz Uria; Bertol Arrieta; Arantza Díaz de Ilarraza; Montse Maritxalar; Maite Oronoz