Publication


Featured research published by Joseba Abaitua.


Meeting of the Association for Computational Linguistics | 1998

Bitext Correspondences through Rich Mark-up

Raquel Martínez; Joseba Abaitua; Arantza Casillas

Rich mark-up can considerably benefit the process of establishing bitext correspondences, that is, the task of correctly identifying and aligning text segments that are translation equivalents of each other in a parallel corpus. We present a sentence alignment algorithm that, by taking advantage of previously annotated texts, obtains accuracy rates close to 100%. The algorithm evaluates the similarity of the linguistic and extralinguistic mark-up on both sides of a bitext. Given that annotations are neutral with respect to typological, grammatical and orthographical differences between languages, rich mark-up becomes an optimal foundation to support bitext correspondences. The main originality of this approach is that it makes maximal use of annotations, which is a sensible and efficient way to exploit parallel corpora when annotations exist.
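
As a rough illustration of comparing mark-up on both sides of a bitext, the sketch below scores two segments by the overlap of their tag multisets (a Dice coefficient). This is not the algorithm from the paper, whose details the abstract does not give; the similarity measure, the function names and the tag names are all assumptions made for the example.

```python
from collections import Counter


def tag_similarity(src_tags, tgt_tags):
    """Dice coefficient over the multisets of mark-up tags in two segments."""
    src, tgt = Counter(src_tags), Counter(tgt_tags)
    total = sum(src.values()) + sum(tgt.values())
    if total == 0:
        # Two segments without any mark-up give no evidence either way.
        return 0.0
    shared = sum((src & tgt).values())  # multiset intersection
    return 2 * shared / total


def best_match(src_tags, candidates):
    """Index of the candidate target segment whose tags best match the source."""
    scores = [tag_similarity(src_tags, c) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


# Hypothetical tag names, for illustration only.
source_segment = ["date", "name"]
target_segments = [["num"], ["name", "date"], ["date"]]
match_index = best_match(source_segment, target_segments)
```

Because only tags are compared, the score does not depend on the words of either language, which is the property the abstract attributes to annotations.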


International Conference on Natural Language Generation | 2000

DTD-driven bilingual document generation

Arantza Casillas; Joseba Abaitua; Raquel Martínez

Extensively annotated bilingual parallel corpora can be exploited to feed editing tools that integrate the processes of document composition and translation. Here we discuss the architecture of an interactive editing tool that, on top of techniques common to most Translation Memory-based systems, applies the potential of SGML's DTDs to guide the process of bilingual document generation. Rather than employing just simple task-oriented mark-up, we selected a set of TEI's highly complex and versatile collection of tags to help disclose the underlying logical structure of documents in the test corpus. DTDs were automatically induced and later integrated in the editing tool to provide the basic scheme for new documents.


International Conference on Computational Linguistics | 2002

Cascading XSL filters for content selection in multilingual document generation

Guillermo Barrutieta; Joseba Abaitua; Josuka Díaz

Content selection is a key factor in any successful document generation system. This paper shows how a content selection algorithm has been implemented using an efficient combination of XML/XSL technology and the framework of RST for discourse modeling. The system generates multilingual documents adapted to user profiles in a learning environment for the web. This CourseViewGenerator applies simplified RST schemes to the elaboration of a master document in XML from which content segments are chosen to suit the user's needs. The personalisation of the document is achieved through the application of a sequence of filtering levels of text selection based on the user aspects given as input. These cascading filters are implemented in XSL.
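
The idea of cascading filters can be sketched as a chain of predicates, each one narrowing the segments passed on by the previous one. The paper implements its filters in XSL over an XML master document; the Python below is only a stand-in for that mechanism, and the segment attributes (`lang`, `level`) and filter names are hypothetical.

```python
def by_language(lang):
    """Filter level keeping segments written in the user's language."""
    return lambda seg: seg["lang"] == lang


def by_level(max_level):
    """Filter level keeping segments at or below the user's level."""
    return lambda seg: seg["level"] <= max_level


def cascade(segments, filters):
    """Apply each filter in order; every level sees only what survived the last."""
    for keep in filters:
        segments = [s for s in segments if keep(s)]
    return segments


# Hypothetical master document, flattened to a list of content segments.
master = [
    {"id": "intro", "lang": "en", "level": 1},
    {"id": "intro-es", "lang": "es", "level": 1},
    {"id": "details", "lang": "en", "level": 3},
]
selected = cascade(master, [by_language("en"), by_level(2)])
```

Keeping each user aspect in its own filter level means a new aspect can be added as one more entry in the list without touching the others.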


Conference of the Association for Machine Translation in the Americas | 2000

Recycling Annotated Parallel Corpora for Bilingual Document Composition

Arantza Casillas; Joseba Abaitua; Raquel Martínez

Parallel corpora enriched with descriptive annotations facilitate multilingual authoring development. Starting from an annotated bitext, we show how SGML markup can be recycled to produce complementary language resources. On the one hand, several translation memory databases together with glossaries of proper nouns have been produced. On the other, DTDs for source and target documents have been derived and put into correspondence. This paper discusses how these resources have been automatically generated and applied to an interactive bilingual authoring system. This tool is capable of handling a substantial proportion of text both in the composition and translation of structured documents.


Perspectives: Studies in Translatology | 1999

Quince años de traducción automática en España

Joseba Abaitua

Machine translation is fifteen years old in Spain. Research has gone through three major stages. In 1985 a sudden outbreak of interest appeared in Spain as three transnational companies and the European Community funded the creation of several research groups. Paradoxically, 1992, which was a widely celebrated year in Spain (owing to the fifth centenary of the discovery of America and the Olympic Games held in Barcelona), marked the end of that dynamic period. At this point the methods and aims of the field were reconsidered and funding was dramatically cut. Since 1995, the growing globalization of the economy, the Internet boom and the demand for multilingual documentation and software have renewed interest in translation technology.


Meeting of the Association for Computational Linguistics | 1998

Aligning tagged bitexts

Raquel Martínez-Unanue; Joseba Abaitua; Arantza Casillas


Archive | 2002

User modelling and content selection for multilingual document generation

Guillermo Barrutieta; Joseba Abaitua


Procesamiento Del Lenguaje Natural | 1999

Extracción y aprovechamiento de DTDs emparejadas en corpus paralelos.

Arantza Casillas; Joseba Abaitua; Raquel Martínez


Archive | 2001

Gross-grained RST through XML Metadata for Multilingual Document Generation

Guillermo Barrutieta; Joseba Abaitua; Josuka Díaz


Procesamiento Del Lenguaje Natural | 1997

Segmentación de corpus paralelos para memorias de traducción.

Joseba Abaitua; Arantza Casillas; Raquel Martínez

Collaboration


Dive into Joseba Abaitua's collaborations.

Top Co-Authors

Arantza Casillas

University of the Basque Country

Raquel Martínez

National University of Distance Education
