Montse Maritxalar
University of the Basque Country
Publications
Featured research published by Montse Maritxalar.
North American Chapter of the Association for Computational Linguistics | 2015
Eneko Agirre; Carmen Banea; Claire Cardie; Daniel M. Cer; Mona T. Diab; Aitor Gonzalez-Agirre; Weiwei Guo; Iñigo Lopez-Gazpio; Montse Maritxalar; Rada Mihalcea; German Rigau; Larraitz Uria; Janyce Wiebe
In semantic textual similarity (STS), systems rate the degree of semantic equivalence between two text snippets. This year, the participants were challenged with new datasets in English and Spanish. The annotations for both subtasks leveraged crowdsourcing. The English subtask attracted 29 teams with 74 system runs, and the Spanish subtask engaged 7 teams with 16 system runs. In addition, this year we ran a pilot task on interpretable STS, where systems needed to add an explanatory layer: they had to align the chunks in the sentence pair, explicitly annotating the kind of relation and the score of each chunk pair. The training and test data were manually annotated by an expert and included headline and image sentence pairs from previous years. Seven teams participated with 29 runs.
Intelligent Tutoring Systems | 2006
Itziar Aldabe; Maddalen Lopez de Lacalle; Montse Maritxalar; Edurne Martinez; Larraitz Uria
Knowledge construction is expensive for Computer Assisted Assessment. When setting exercise questions, teachers use test makers to construct question banks. Adding automatic generation to assessment applications decreases the time spent on constructing examination papers. In this article, we present ArikIturri, an automatic question generator for Basque test questions, which is independent of the test assessment application that uses it. The information source for this question generator consists of linguistically analysed real corpora, represented in XML. ArikIturri makes use of NLP tools; the article highlights the influence of the robustness of those tools and of the corpora used. We have demonstrated the viability of ArikIturri by constructing fill-in-the-blank, word formation, multiple-choice, and error correction question types. In the evaluation of this automatic generator, we obtained positive results regarding both the generation process and its usefulness.
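As a rough illustration of the fill-in-the-blank question type mentioned above (not the actual ArikIturri implementation, whose input is linguistically analysed XML corpora), a generator can blank out a chosen target word in a corpus sentence; the Basque example sentence and helper names here are hypothetical:

```python
import re
from dataclasses import dataclass

@dataclass
class FillInBlankQuestion:
    stem: str      # sentence with the target word blanked out
    answer: str    # the removed word

def make_fill_in_blank(sentence: str, target: str) -> FillInBlankQuestion:
    """Blank the first whole-word occurrence of `target` in `sentence`."""
    pattern = r"\b" + re.escape(target) + r"\b"
    stem, n = re.subn(pattern, "_____", sentence, count=1)
    if n == 0:
        raise ValueError(f"target {target!r} not found in sentence")
    return FillInBlankQuestion(stem=stem, answer=target)

q = make_fill_in_blank("Ura 100 gradutan irakiten da.", "irakiten")
print(q.stem)    # Ura 100 gradutan _____ da.
print(q.answer)  # irakiten
```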
International Conference on Natural Language Processing | 2010
Itziar Aldabe; Montse Maritxalar
This paper presents a system which uses Natural Language Processing techniques to generate multiple-choice questions. The system implements different methods to find distractors semantically similar to the correct answer, applying a corpus-based approach to measure similarity. The target language is Basque, and the questions are used for learner assessment in the science domain. We present the results of an evaluation carried out with learners to measure the quality of the automatically generated distractors.
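A minimal sketch of the corpus-based idea, assuming word vectors have already been derived from a corpus (the paper's own similarity measures and Basque resources are not reproduced here): distractors are the candidate words most similar to the correct answer.

```python
import numpy as np

def top_distractors(answer: str,
                    vectors: dict[str, np.ndarray],
                    k: int = 3) -> list[str]:
    """Return the k candidates whose vectors are closest
    (by cosine similarity) to the correct answer's vector."""
    a = vectors[answer] / np.linalg.norm(vectors[answer])
    scored = []
    for word, v in vectors.items():
        if word == answer:
            continue
        scored.append((float(a @ (v / np.linalg.norm(v))), word))
    scored.sort(reverse=True)
    return [w for _, w in scored[:k]]

# Toy vectors (hypothetical values for illustration only).
vecs = {
    "zelula": np.array([0.9, 0.1, 0.2]),   # cell
    "nukleo": np.array([0.8, 0.2, 0.1]),   # nucleus
    "mintz":  np.array([0.7, 0.3, 0.2]),   # membrane
    "mendi":  np.array([0.1, 0.9, 0.4]),   # mountain
}
print(top_distractors("zelula", vecs, k=2))  # ['nukleo', 'mintz']
```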
North American Chapter of the Association for Computational Linguistics | 2016
Eneko Agirre; Aitor Gonzalez-Agirre; Iñigo Lopez-Gazpio; Montse Maritxalar; German Rigau; Larraitz Uria
Communication presented at the 10th International Workshop on Semantic Evaluation (SemEval-2016), held on 16 and 17 June 2016 in San Diego, California.
Conference of the European Chapter of the Association for Computational Linguistics | 1993
Itziar Aduriz; Eneko Agirre; Iñaki Alegria; Xabier Arregi; Jose Mari Arriola; Xabier Artola; A. Díaz de Ilarraza; Nerea Ezeiza; Montse Maritxalar; Kepa Sarasola; Miriam Urkia
Xuxen is a spelling checker/corrector for Basque which is going to be commercialized next year. The checker recognizes a word-form if a correct morphological breakdown is allowed. The morphological analysis is based on two-level morphology. The correction method distinguishes between orthographic errors and typographical errors.
• Typographical errors (or mistypings) are non-cognitive errors which do not follow linguistic criteria.
• Orthographic errors are cognitive errors which occur when the writer does not know or has forgotten the correct spelling of a word. They are more persistent because of their cognitive nature, they leave a worse impression and, finally, their treatment is an interesting application for language standardization purposes.
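Xuxen itself relies on two-level morphology; as a much-simplified sketch of the checker/corrector split described above, a lexicon lookup stands in for morphological recognition, and unrecognized word-forms one edit away from a known form are flagged as typographical (the orthographic-error patterns would need language-specific rules not shown here). The toy lexicon and word-forms are hypothetical:

```python
def within_one_edit(a: str, b: str) -> bool:
    """True if a can be turned into b with at most one insertion,
    deletion, or substitution (Levenshtein distance <= 1)."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a                  # ensure len(a) <= len(b)
    i = j = edits = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            i += 1; j += 1
        else:
            edits += 1
            if edits > 1:
                return False
            if len(a) == len(b):
                i += 1               # substitution
            j += 1                   # insertion/deletion
    return edits + (len(b) - j) <= 1

LEXICON = {"etxe", "etxea", "mendi", "mendia"}   # toy Basque word-forms

def check(word: str) -> str:
    if word in LEXICON:                  # a correct breakdown exists
        return "correct"
    if any(within_one_edit(word, w) for w in LEXICON):
        return "typographical error"     # likely a mistyping
    return "orthographic error?"         # needs cognitive-error rules

print(check("etxea"))   # correct
print(check("exea"))    # typographical error (one deletion from "etxea")
print(check("ortografi"))  # orthographic error?
```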
Knowledge-Based Systems | 2017
Iñigo Lopez-Gazpio; Montse Maritxalar; Aitor Gonzalez-Agirre; German Rigau; Larraitz Uria; Eneko Agirre
Highlights: We address interpretability, the ability of machines to explain their reasoning. We formalize it for textual similarity as graded typed alignment between two sentences. We release an annotated dataset, and build and evaluate a high-performance system. We show that the output of the system can be used to produce explanations. Two user studies show preliminary evidence that explanations help humans perform better.

User acceptance of artificial intelligence agents might depend on their ability to explain their reasoning to the users. We focus on a specific text processing task, the Semantic Textual Similarity task (STS), where systems need to measure the degree of semantic equivalence between two sentences. We propose to add an interpretability layer (iSTS for short) formalized as the alignment between pairs of segments across the two sentences, where the relation between the segments is labeled with a relation type and a similarity score. This way, a system performing STS could use the interpretability layer to explain to users why it returned that specific score for the given sentence pair. We present a publicly available dataset of sentence pairs annotated following the formalization. We then develop an iSTS system trained on this dataset, which, given a sentence pair, finds what is similar and what is different, in the form of graded and typed segment alignments. When evaluated on the dataset, the system performs better than an informed baseline, showing that the dataset and task are well-defined and feasible. Most importantly, two user studies show how the iSTS system output can be used to automatically produce explanations in natural language. Users performed the two tasks better when having access to the explanations, providing preliminary evidence that our dataset and method to automatically produce explanations do help users better understand the output of STS systems.
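The interpretability layer lends itself to a very direct representation. Below is a sketch of the alignment structure described above; the relation labels follow the iSTS relation types (equivalence, opposition, specificity, similarity, relatedness, unaligned), but the field layout and the example pair are illustrative, not the released dataset format:

```python
from dataclasses import dataclass

# Illustrative relation inventory in the spirit of iSTS.
RELATION_TYPES = {"EQUI", "OPPO", "SPE1", "SPE2", "SIMI", "REL", "NOALI"}

@dataclass
class ChunkAlignment:
    chunk1: tuple[int, int]   # token span in sentence 1 (start, end)
    chunk2: tuple[int, int]   # token span in sentence 2
    relation: str             # one of RELATION_TYPES
    score: int                # graded similarity of the pair, 0..5

    def __post_init__(self):
        assert self.relation in RELATION_TYPES
        assert 0 <= self.score <= 5

# "12 killed in bus accident" vs "12 dead in bus crash":
alignment = [
    ChunkAlignment((0, 1), (0, 1), "EQUI", 5),  # "12" <-> "12"
    ChunkAlignment((1, 2), (1, 2), "EQUI", 5),  # "killed" <-> "dead"
    ChunkAlignment((2, 5), (2, 5), "SIMI", 4),  # "in bus accident" <-> "in bus crash"
]
```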
North American Chapter of the Association for Computational Linguistics | 2015
Eneko Agirre; Aitor Gonzalez-Agirre; Iñigo Lopez-Gazpio; Montse Maritxalar; German Rigau; Larraitz Uria
In Semantic Textual Similarity (STS), systems rate the degree of semantic equivalence on a graded scale from 0 to 5, with 5 being the most similar. For the English subtask, we present a system which relies on several resources for token-to-token and phrase-to-phrase similarity to build a data structure holding all the information, and then combines that information into a similarity score. We also participated in the pilot on Interpretable STS, where we apply a pipeline which first aligns tokens, then chunks, and finally uses supervised systems to label and score each chunk alignment.
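A structural sketch of that three-stage pipeline; the function bodies below are placeholders (identity token matching and a constant label/score) standing in for the word aligner and the supervised classifiers actually used:

```python
def align_tokens(sent1, sent2):
    """Stage 1: token-level alignment (identity matching stands in
    for a real monolingual word aligner)."""
    return [(i, sent2.index(t)) for i, t in enumerate(sent1) if t in sent2]

def tokens_to_chunks(token_pairs, chunks1, chunks2):
    """Stage 2: promote token alignments to chunk alignments: two
    chunks are aligned if any of their tokens were aligned."""
    pairs = set()
    for i, j in token_pairs:
        c1 = next(k for k, c in enumerate(chunks1) if c[0] <= i < c[1])
        c2 = next(k for k, c in enumerate(chunks2) if c[0] <= j < c[1])
        pairs.add((c1, c2))
    return sorted(pairs)

def label_and_score(chunk_pair):
    """Stage 3: supervised labelling and scoring (a trained classifier
    and regressor in the real system; constants stand in here)."""
    return "EQUI", 5

s1 = ["12", "killed", "in", "bus", "accident"]
s2 = ["12", "dead", "in", "bus", "crash"]
chunks1 = [(0, 1), (1, 2), (2, 5)]   # token spans of each chunk
chunks2 = [(0, 1), (1, 2), (2, 5)]
for pair in tokens_to_chunks(align_tokens(s1, s2), chunks1, chunks2):
    print(pair, *label_and_score(pair))
```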
Ibero-American Conference on AI | 2008
Daniel Castro-Castro; Rocío Lannes-Losada; Montse Maritxalar; Ianire Niebla; Celia Pérez-Marqués; Nancy C. Álamo-Suárez; Aurora Pons-Porrata
In this paper, we present a text evaluation system that helps students improve their Basque or Spanish writing skills. The system uses Natural Language Processing techniques to evaluate essays by computing specific measures. The application uses a client-server architecture, and both the interface and the application itself are multilingual. The article also explains how the system can be adapted to evaluate Spanish essays written in Cuban schools.
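The abstract does not list the measures; as a hedged sketch, surface measures of the kind such a system might compute over an essay could look like the following (the measure names are assumptions, not the paper's):

```python
import re

def essay_measures(text: str) -> dict[str, float]:
    """Compute a few simple, language-independent writing measures."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text.lower())
    return {
        "sentences": len(sentences),
        "words": len(words),
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "type_token_ratio": len(set(words)) / max(len(words), 1),
    }

print(essay_measures("Kaixo. Hau nire idazlana da. Idazlana laburra da."))
```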
Meeting of the Association for Computational Linguistics | 2006
Iñaki Alegria; Bertol Arrieta; Arantza Díaz de Ilarraza; Eli Izagirre; Montse Maritxalar
In this paper, we describe research using machine learning techniques to build a comma checker to be integrated in a grammar checker for Basque. After several experiments, trained on a small corpus of 100,000 words, the system correctly predicts the absence of a comma with a precision of 96% and a recall of 98%, and achieves a precision of 70% and a recall of 49% in the task of placing commas. Finally, we show that these results can be improved by training on a bigger and more homogeneous corpus, that is, a bigger corpus written by a single author.
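A minimal sketch of the machine-learning framing, "should a comma follow this token?" (the paper's actual features and learner for Basque are not given in the abstract; the toy window features and training sentences below are assumptions):

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def window_features(tokens, i):
    """Features for the decision 'insert a comma after tokens[i]?'."""
    return {
        "w0": tokens[i],
        "w-1": tokens[i - 1] if i > 0 else "<S>",
        "w+1": tokens[i + 1] if i + 1 < len(tokens) else "</S>",
    }

def examples(sentences):
    """Turn comma-annotated token lists into (features, label) pairs."""
    X, y = [], []
    for sent in sentences:
        tokens = [t for t in sent if t != ","]
        commas, j = set(), -1
        for t in sent:
            if t == ",":
                commas.add(j)    # comma follows the j-th real token
            else:
                j += 1
        for i in range(len(tokens)):
            X.append(window_features(tokens, i))
            y.append(i in commas)
    return X, y

train = [
    ["bai", ",", "etorriko", "naiz"],   # "yes, I will come"
    ["ez", ",", "ez", "dut", "nahi"],   # "no, I don't want to"
    ["etxera", "joan", "naiz"],         # "I went home"
]
X, y = examples(train)
model = make_pipeline(DictVectorizer(), LogisticRegression())
model.fit(X, y)
print(model.predict([window_features(["bai", "banoa"], 0)]))
```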
IEEE Transactions on Learning Technologies | 2014
Itziar Aldabe; Montse Maritxalar
The work we present in this paper aims to help teachers create multiple-choice science tests. We focus on a scientific vocabulary-learning scenario taking place in a Basque-language educational environment. In this particular scenario, we explore the option of automatically generating Multiple-Choice Questions (MCQs) by means of Natural Language Processing (NLP) techniques and the use of corpora. More specifically, human experts select scientific articles and identify the target terms (i.e., words). These terms are part of the vocabulary studied in the school curriculum for 13-14-year-olds and form the starting point for our system to generate MCQs. We automatically generate distractors that are similar in meaning to the target term. To this end, the system applies semantic similarity measures making use of a variety of corpus-based and graph-based approaches. The paper presents a qualitative and a quantitative analysis of the generated tests to measure the quality of the proposed methods. The qualitative analysis is based on expert opinion, whereas the quantitative analysis is based on the MCQ test responses from secondary school students: 951 students from 18 schools took part in the experiments. The results show that our system could help experts in the generation of MCQs.
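Complementing the corpus-based measures, the graph-based side can be sketched as distance over a concept graph: nearby concepts make plausible but wrong answer options. The toy graph below is hypothetical; the paper uses real lexical-semantic resources for Basque:

```python
import networkx as nx

# Toy concept graph over science terms (edges = semantic relatedness).
G = nx.Graph([
    ("zelula", "nukleo"), ("zelula", "mintz"), ("nukleo", "DNA"),
    ("zelula", "organismo"), ("organismo", "animalia"),
])

def graph_distractors(answer: str, k: int = 2) -> list[str]:
    """Rank candidates by shortest-path distance to the answer."""
    dist = dict(nx.single_source_shortest_path_length(G, answer))
    ranked = sorted((d, w) for w, d in dist.items() if w != answer)
    return [w for _, w in ranked[:k]]

print(graph_distractors("zelula"))  # ['mintz', 'nukleo']
```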