
Olatz Arregi

University of the Basque Country

Publication

Featured research published by Olatz Arregi.

Applied Soft Computing | 2011

A multiclass/multilabel document categorization system: Combining multiple classifiers in a reduced dimension

Ana Zelaia; Iñaki Alegria; Olatz Arregi; Basilio Sierra

This article presents a multiclassifier approach for multiclass/multilabel document categorization problems. For the categorization process, we use a reduced vector representation obtained by SVD for training and testing documents, and a set of k-NN classifiers to predict the category of test documents; each k-NN classifier uses a reduced database subsampled from the original training database. To perform multilabel classification, a new approach based on Bayesian weighted voting is also presented. The good results obtained in the experiments give an indication of the potential of the proposed approach.
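
The pipeline described in this abstract (SVD-reduced vectors, several k-NN classifiers each trained on a subsample, weighted voting) can be illustrated with a minimal single-label sketch. It is not the authors' implementation: the function names, the Euclidean distance, and the use of held-out accuracy as the vote weight are assumptions that stand in for the Bayesian weighted voting, whose details the abstract does not give.

```python
import numpy as np


def fit_svd_basis(train_vectors, n_dims):
    """Return the top n_dims right singular vectors as columns (terms x n_dims)."""
    _, _, vt = np.linalg.svd(train_vectors, full_matrices=False)
    return vt[:n_dims].T


def knn_label(x_train, y_train, query, k):
    """Majority label among the k training vectors nearest to query (Euclidean)."""
    distances = np.linalg.norm(x_train - query, axis=1)
    nearest = np.argsort(distances)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]


def ensemble_label(x_train, y_train, query, n_classifiers=10,
                   sample_frac=0.6, k=3, seed=0):
    """Combine k-NN classifiers, each trained on a random subsample.

    Assumption: each vote is weighted by that classifier's accuracy on the
    training documents left out of its subsample. This is a simple stand-in
    for the Bayesian weighted voting named in the abstract.
    """
    rng = np.random.default_rng(seed)
    n = len(x_train)
    sample_size = min(n, max(k, int(sample_frac * n)))
    scores = {}
    for _ in range(n_classifiers):
        idx = rng.choice(n, size=sample_size, replace=False)
        held_out = np.setdiff1d(np.arange(n), idx)
        if len(held_out) > 0:
            hits = [knn_label(x_train[idx], y_train[idx], x_train[j], k) == y_train[j]
                    for j in held_out]
            weight = float(np.mean(hits))
        else:
            weight = 1.0  # nothing held out, so fall back to an unweighted vote
        vote = knn_label(x_train[idx], y_train[idx], query, k)
        scores[vote] = scores.get(vote, 0.0) + weight
    return max(scores, key=scores.get)
```

To use it, project both the training documents and the query with the same basis (for example `docs @ basis` and `query @ basis`) before calling `ensemble_label`. A multilabel variant would return every label whose accumulated score passes a threshold instead of only the top one.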

Ibero-American Conference on Artificial Intelligence | 2010

A first machine learning approach to pronominal anaphora resolution in Basque

Olatz Arregi; Klara Ceberio; A. Díaz de Ilarraza; Iakes Goenaga; Basilio Sierra; Ana Zelaia

In this paper we present the first machine learning approach to resolving pronominal anaphora in the Basque language. We consider different classifiers in order to find the system that best fits the characteristics of the language under examination. We do not restrict our study to the classifiers typically used for this task; we also consider others, such as Random Forest and VFI, in order to make a general comparison. We determine the feature vector obtained with our linguistic processing system and analyze the contribution of different subsets of features, as well as the weight of each feature used in the task.

North American Chapter of the Association for Computational Linguistics | 2015

IXAGroupEHUDiac: A Multiple Approach System towards the Diachronic Evaluation of Texts

Haritz Salaberri; Iker Salaberri; Olatz Arregi; Beñat Zapirain

This paper presents our contribution to SemEval-2015 Task 7. The task was subdivided into three subtasks that consisted of automatically identifying the time period when a piece of news was written (subtasks 1 and 2) as well as automatically determining whether a specific phrase in a sentence is relevant or not for a given period of time (subtask 3). Our system tackles all three subtasks. To this end, multiple approaches are undertaken that use resources such as Wikipedia or Google NGrams. Final results are obtained by combining the output from all approaches. The texts used for the task are written in English and range from the years 1700 to 2000.

Engineering Applications of Artificial Intelligence | 2015

Combining Singular Value Decomposition and a multi-classifier

Ana Zelaia; Olatz Arregi; Basilio Sierra

In this paper a new machine learning approach is presented to deal with the coreference resolution task. This approach consists of a multi-classifier system that classifies mention-pairs in a reduced dimensional vector space. The vector representation for mention-pairs is generated using a rich set of linguistic features. The Singular Value Decomposition (SVD) technique is used to generate the reduced dimensional vector space. The approach is applied to the OntoNotes v4.0 Release Corpus for the column-format files used in the CoNLL-2011 coreference resolution shared task. The results obtained show that the reduced dimensional representation obtained by SVD is well suited to classifying mention-pair vectors. Moreover, it can be stated that the multi-classifier plays an important role in improving the results.

North American Chapter of the Association for Computational Linguistics | 2015

IXAGroupEHUSpaceEval: (X-Space) A WordNet-based approach towards the Automatic Recognition of Spatial Information following the ISO-Space Annotation Scheme

Haritz Salaberri; Olatz Arregi; Beñat Zapirain

This paper presents X-Space, a system that follows the ISO-Space annotation scheme in order to capture spatial information, as well as our contribution to SemEval-2015 Task 8 (SpaceEval). Our system is the only participant system that reported results for all three evaluation configurations in SpaceEval.

Proceedings of the Eighth International Conference on Computational Semantics | 2009

A Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition

Ana Zelaia; Olatz Arregi; Basilio Sierra

In this paper a multiclassifier based approach is presented for a word sense disambiguation (WSD) problem. A vector representation is used for training and testing cases and the Singular Value Decomposition (SVD) technique is applied to reduce the dimension of the representation. The approach we present consists of creating a set of k-NN classifiers and combining the predictions generated in order to give a final word sense prediction for each case to be classified. The combination is done by applying a Bayesian voting scheme. The approach has been applied to a database of 100 words made available by the organizers of the lexical sample WSD subtask of SemEval-2007 (task 17). Each of the words was considered an independent classification problem. A methodical parameter tuning phase was applied in order to optimize the parameter setting for each word. The results achieved are among the best and encourage applying the approach to other WSD tasks.

Meeting of the Association for Computational Linguistics | 2007

UBC-ZAS: A k-NN based Multiclassifier System to perform WSD in a Reduced Dimensional Vector Space

Ana Zelaia; Olatz Arregi; Basilio Sierra

In this article a multiclassifier approach for word sense disambiguation (WSD) problems is presented, where a set of k-NN classifiers is used to predict the category (sense) of each word. In order to combine the predictions generated by the multiclassifier, Bayesian voting is applied. Throughout the classification process, a reduced dimensional vector representation obtained by Singular Value Decomposition (SVD) is used. Each word is considered an independent classification problem, and so a different parameter setting, selected after a tuning phase, is applied to each word. The approach has been applied to the lexical sample WSD subtask of SemEval-2007 (task 17).

Proceedings of the 2nd Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2017) | 2017

Enriching Basque Coreference Resolution System using Semantic Knowledge sources

Ander Soraluze; Olatz Arregi; Xabier Arregi; Arantza Díaz de Ilarraza

In this paper we present a Basque coreference resolution system enriched with semantic knowledge. An error analysis revealed the deficiencies that the system had in resolving coreference cases in which semantic or world knowledge is needed. We attempt to address these deficiencies using two semantic knowledge sources, specifically Wikipedia and WordNet.

North American Chapter of the Association for Computational Linguistics | 2016

Coreference Resolution for the Basque Language with BART

Ander Soraluze; Olatz Arregi; Xabier Arregi; Arantza Díaz de Ilarraza; Mijail A. Kabadjov; Massimo Poesio

In this paper we present our work on Coreference Resolution in Basque, a unique language which poses interesting challenges for the problem of coreference. We explain how we extend the coreference resolution toolkit, BART, in order to enable it to process Basque. Then we run four different experiments showing both a significant improvement by extending a baseline feature set and the effect of calculating performance of hand-parsed mentions vs. automatically parsed mentions. Finally, we discuss some key characteristics of Basque which make it particularly challenging for coreference and draw a road map for future work.

North American Chapter of the Association for Computational Linguistics | 2015

A Multi-classifier Approach to support Coreference Resolution in a Vector Space Model

Ana Zelaia; Olatz Arregi; Basilio Sierra

In this paper a different machine learning approach is presented to deal with the coreference resolution task. This approach consists of a multi-classifier system that classifies mention-pairs in a reduced dimensional vector space. The vector representation for mention-pairs is generated using a rich set of linguistic features. The SVD technique is used to generate the reduced dimensional vector space. The approach is applied to the OntoNotes v4.0 Release Corpus for the column-format files used in the CoNLL-2011 coreference resolution shared task. The results obtained show that the reduced dimensional representation obtained by SVD is well suited to classifying mention-pair vectors. Moreover, we can state that the multi-classifier plays an important role in improving the results.
