Itziar Aldabe

University of the Basque Country

Publication

Featured research published by Itziar Aldabe.

intelligent tutoring systems | 2006

ArikIturri: an automatic question generator based on corpora and NLP techniques

Itziar Aldabe; Maddalen Lopez de Lacalle; Montse Maritxalar; Edurne Martinez; Larraitz Uria

Knowledge construction is expensive for Computer Assisted Assessment. When setting exercise questions, teachers use Test Makers to construct Question Banks. The addition of Automatic Generation to assessment applications decreases the time spent on constructing examination papers. In this article, we present ArikIturri, an Automatic Question Generator for Basque language test questions, which is independent from the test assessment application that uses it. The information source for this question generator consists of linguistically analysed real corpora, represented in XML mark-up language. ArikIturri makes use of NLP tools. The influence of the robustness of those tools and of the corpora used is highlighted in the article. We have proved the viability of ArikIturri when constructing fill-in-the-blank, word formation, multiple choice, and error correction question types. In the evaluation of this automatic generator, we have obtained positive results as regards the generation process and its usefulness.

north american chapter of the association for computational linguistics | 2015

SemEval-2015 Task 4: TimeLine: Cross-Document Event Ordering

Anne-Lyse Minard; Manuela Speranza; Eneko Agirre; Itziar Aldabe; Marieke van Erp; Bernardo Magnini; German Rigau; Ruben Urizar

This paper describes the outcomes of the TimeLine task (Cross-Document Event Ordering), which was organised within the Time and Space track of SemEval-2015. Given a set of documents and a set of target entities, the task consisted of building a timeline for each entity, by detecting, anchoring in time and ordering the events involving that entity. The TimeLine task goes a step further than previous evaluation challenges by requiring participant systems to perform both event coreference and temporal relation extraction across documents. Four teams submitted the output of their systems to the four proposed subtracks for a total of 13 runs, the best of which obtained an F1-score of 7.85 in the main track (timeline creation from raw text).

international joint conference on natural language processing | 2015

Document Level Time-anchoring for TimeLine Extraction

Egoitz Laparra; Itziar Aldabe; German Rigau

This paper investigates the contribution of document level processing of time-anchors for TimeLine event extraction. We developed and tested two different systems. The first one is a baseline system that captures explicit time-anchors. The second one extends the baseline system by also capturing implicit time relations. We have evaluated both approaches in the SemEval 2015 task 4 TimeLine: Cross-Document Event Ordering. We empirically demonstrate that the document-based approach obtains a much more complete time anchoring. Moreover, this approach almost doubles the performance of the systems that participated in the task.

Proceedings of the First Workshop on Computing News Storylines | 2015

From TimeLines to StoryLines: A preliminary proposal for evaluating narratives

Egoitz Laparra; Itziar Aldabe; German Rigau

We formulate a proposal that covers a new definition of StoryLines based on the shared data provided by the NewsStory workshop. We re-use the SemEval 2015 Task 4: Timelines dataset to provide a gold-standard dataset and an evaluation measure for evaluating StoryLines extraction systems. We also present a system to explore the feasibility of capturing StoryLines automatically. Finally, based on our initial findings, we also discuss some simple changes that will improve the existing annotations to complete our initial Story-

IEEE Transactions on Learning Technologies | 2014

Semantic Similarity Measures for the Generation of Science Tests in Basque

Itziar Aldabe; Montse Maritxalar

The work we present in this paper aims to help teachers create multiple-choice science tests. We focus on a scientific vocabulary-learning scenario taking place in a Basque-language educational environment. In this particular scenario, we explore the option of automatically generating Multiple-Choice Questions (MCQ) by means of Natural Language Processing (NLP) techniques and the use of corpora. More specifically, human experts select scientific articles and identify the target terms (i.e., words). These terms are part of the vocabulary studied in the school curriculum for 13-14-year-olds and form the starting point for our system to generate MCQs. We automatically generate distractors that are similar in meaning to the target term. To this end, the system applies semantic similarity measures making use of a variety of corpus-based and graph-based approaches. The paper presents a qualitative and a quantitative analysis of the generated tests to measure the quality of the proposed methods. The qualitative analysis is based on expert opinion, whereas the quantitative analysis is based on the MCQ test responses from students in secondary school. Nine hundred and fifty-one students from 18 schools took part in the experiments. The results show that our system could help experts in the generation of MCQs.

language resources and evaluation | 2016

Predicate Matrix: automatically extending the semantic interoperability between predicate resources

Maddalen Lopez de Lacalle; Egoitz Laparra; Itziar Aldabe; German Rigau

This paper presents a novel approach to improve the interoperability between four semantic resources that incorporate predicate information. Our proposal defines a set of automatic methods for mapping the semantic knowledge included in WordNet, VerbNet, PropBank and FrameNet. We use advanced graph-based word sense disambiguation algorithms and corpus alignment methods to automatically establish the appropriate mappings among their lexical entries and roles. We study different settings for each method using SemLink as a gold-standard for evaluation. The results show that the new approach provides productive and reliable mappings. In fact, the mappings obtained automatically outnumber the set of original mappings in SemLink. Finally, we also present a new version of the Predicate Matrix, a lexical-semantic resource resulting from the integration of the mappings obtained by our automatic methods and SemLink.

international conference on advanced learning technologies | 2007

The Question Model inside ArikIturri

Itziar Aldabe; M.L. de Lacalle; Montse Maritxalar; Edurne Martinez

With the aim of facilitating the automatic acquisition of some didactic resources, i.e., exercises for Computer Assisted Assessment Applications, we have implemented ArikIturri. Using corpora as source data, the system is able to automatically generate questions about domain contents. We use Natural Language Processing (NLP) techniques to facilitate this generation. In this article we describe the generic question model underlying ArikIturri. The model is independent from the language of the source corpora.

Knowledge Based Systems | 2017

Multi-lingual and Cross-lingual timeline extraction

Egoitz Laparra; Rodrigo Agerri; Itziar Aldabe; German Rigau

In this paper we present an approach to extract ordered timelines of events, their participants, locations and times from a set of Multilingual and Cross-lingual data sources. Based on the assumption that event-related information can be recovered from different documents written in different languages, we extend the Cross-document Event Ordering task presented at SemEval 2015 by specifying two new tasks for, respectively, Multilingual and Cross-lingual timeline extraction. We then develop three deterministic algorithms for timeline extraction based on two main ideas. First, we address implicit temporal relations at document level since explicit time-anchors are too scarce to build a wide coverage timeline extraction system. Second, we leverage several multilingual resources to obtain a single, interoperable, semantic representation of events across documents and across languages. The result is a highly competitive system that strongly outperforms the current state-of-the-art. Nonetheless, further analysis of the results reveals that linking the event mentions with their target entities and time-anchors remains a difficult challenge. The systems, resources and scorers are freely available to facilitate their use and guarantee the reproducibility of results.

workshop on events definition detection coreference and representation | 2015

Semantic Interoperability for Cross-lingual and cross-document Event Detection

Piek Vossen; Egoitz Laparra; German Rigau; Itziar Aldabe

We describe a system for event extraction across documents and languages. We developed a framework for the interoperable semantic interpretation of mentions of events, participants, locations and time, as well as the relations between them. Furthermore, we use a common RDF model to represent instances of events and normalised entities and dates. We convert multiple mentions of the same event in English, Spanish and Dutch to a single representation. We thus resolve cross-document event and entity coreference within a language but also across languages. We tested our system on a Wikinews corpus of 120 English articles that have been manually translated to Spanish and Dutch. We report on the cross-lingual cross-document event and entity extraction comparing the Spanish and Dutch output with respect to English.

artificial intelligence in education | 2015

Domain Module Building From Textbooks: Integrating Automatic Exercise Generation

Itziar Aldabe; Mikel Larrañaga; Montse Maritxalar; Ana Arruarte; Jon A. Elorriaga

DOM-Sortze is a framework for the semiautomatic generation of Domain Modules from textbooks. It identifies not only topics and relationships between topics but also Learning Objects (e.g., definitions, examples, problem-statements) included in an electronic document. ArikIturri is an NLP-based system designed to automatically generate test-based exercises from corpora. To enrich the Learning Object Repository of DOM-Sortze with new test-based exercises, both systems have been integrated. The experiment conducted to verify the validity of the proposal is described throughout the paper.
