Publication


Featured research published by Ineke Schuurman.


Essential Speech and Language Technology for Dutch (P. Spyns and J. Odijk, eds.) | 2013

The Construction of a 500-Million-Word Reference Corpus of Contemporary Written Dutch

Nelleke Oostdijk; Martin Reynaert; Veronique Hoste; Ineke Schuurman

The construction of a large and richly annotated corpus of written Dutch was identified as one of the priorities of the STEVIN programme. Such a corpus, sampling texts from conventional and new media, is invaluable for scientific research and application development. The present chapter describes how the Dutch reference corpus was developed in two consecutive STEVIN-funded projects, viz. D-Coi and SoNaR. The construction of the corpus was guided by (inter)national standards and best practices. At the same time, the achievements and experience gained in the D-Coi and SoNaR projects contributed to the further advancement and dissemination of those standards and practices.


Essential Speech and Language Technology for Dutch | 2013

Large Scale Syntactic Annotation of Written Dutch: Lassy

Gertjan van Noord; Gosse Bouma; Frank Van Eynde; Daniël de Kok; Jelmer van der Linde; Ineke Schuurman; Erik F. Tjong Kim Sang; Vincent Vandeghinste

This chapter presents the Lassy Small and Lassy Large treebanks, as well as related tools and applications. Lassy Small is a corpus of written Dutch texts (1,000,000 words) which has been syntactically annotated with manual verification and correction. Lassy Large is a much larger corpus (over 500,000,000 words) which has been syntactically annotated fully automatically. In addition, various browse and search tools for syntactically annotated corpora have been developed and made available. Their potential for applications in corpus linguistics and information extraction has been illustrated and evaluated in a series of case studies.


Machine Translation | 2008

METIS-II: low resource machine translation

Michael Carl; Maite Melero; Toni Badia; Vincent Vandeghinste; Peter Dirix; Ineke Schuurman; Stella Markantonatou; Sokratis Sofianopoulos; Marina Vassiliou; Olga Yannoutsou

METIS-II was an EU-FET MT project running from October 2004 to September 2007, which aimed to translate free text input without resorting to parallel corpora. The idea was to use “basic” linguistic tools and representations and to link them with patterns and statistics from the monolingual target-language corpus. The METIS-II project had four partners, translating from their “home” languages Greek, Dutch, German, and Spanish into English. The paper outlines the basic ideas of the project, their implementation, the resources used, and the results obtained. It also gives examples of how METIS-II has continued beyond its lifetime and the original scope of the project. On the basis of the results and experiences obtained, we believe that the approach is promising and offers the potential for development in various directions.


Natural Language Generation | 2015

Natural Language Generation from Pictographs

Leen Sevens; Vincent Vandeghinste; Ineke Schuurman; Frank Van Eynde

We present a Pictograph-to-Text translation system for people with Intellectual or Developmental Disabilities (IDD). The system translates pictograph messages, consisting of one or more pictographs, into Dutch text using WordNet links and an n-gram language model. We also provide several pictograph input methods that assist users in selecting the appropriate pictographs.
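The idea of combining lexical links with an n-gram language model can be illustrated with a minimal sketch. It is not the authors' implementation: the pictograph identifiers, candidate word lists, and bigram counts below are all hypothetical toy data (in English rather than Dutch), and smoothed log-counts stand in for real n-gram probabilities.

```python
from itertools import product
import math

# Hypothetical toy data: each pictograph is linked to candidate words,
# standing in for the WordNet links mentioned in the abstract.
CANDIDATES = {
    "picto_dog": ["dog", "hound"],
    "picto_eat": ["eats", "eat"],
}

# Hypothetical bigram counts from an imaginary target-language corpus.
BIGRAM_COUNTS = {
    ("<s>", "dog"): 5,
    ("<s>", "hound"): 1,
    ("dog", "eats"): 4,
    ("dog", "eat"): 1,
}


def score(words):
    """Sum of add-one smoothed log-counts over adjacent word pairs."""
    padded = ["<s>"] + list(words)
    return sum(
        math.log(BIGRAM_COUNTS.get(pair, 0) + 1)
        for pair in zip(padded, padded[1:])
    )


def generate(pictographs):
    """Pick the candidate word sequence with the highest bigram score."""
    # Pictographs without any linked word are skipped in this sketch.
    options = [CANDIDATES[p] for p in pictographs if CANDIDATES.get(p)]
    best = max(product(*options), key=score)
    return " ".join(best)


print(generate(["picto_dog", "picto_eat"]))
```

Enumerating every combination is only workable for short messages; a longer input would call for a beam or Viterbi search over the same scores.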


Natural Language Engineering | 2017

Translating text into pictographs

Vincent Vandeghinste; Ineke Schuurman; Leen Sevens; Frank Van Eynde

We describe and evaluate a text-to-pictograph translation system that is used in an online platform for Augmentative and Alternative Communication, which is intended for people who are not able to read and write, but who still want to communicate with the outside world. The system is set up to translate from Dutch into Sclera and Beta, two publicly available pictograph sets consisting of several thousand pictographs each. We have linked large numbers of these pictographs to synsets or combinations of synsets of Cornetto, a lexical-semantic database for Dutch similar to WordNet. In the translation system, the Dutch input text undergoes shallow linguistic analysis and the synsets of the content words are looked up. The system looks for the nearest pictographs in the lexical-semantic database and displays the message as pictographs. We evaluated the system, and the results showed a large improvement over the baseline system, which consisted of straightforward string matching between the input text and the filenames of the pictographs. Our system provides a clear improvement in the communication possibilities of illiterate people. Nevertheless, there is room for further improvement.
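The pipeline described in this abstract (shallow analysis, synset lookup, search for the nearest linked pictograph) can be sketched roughly as follows, next to the filename-matching baseline. This is an illustration under stated assumptions, not the published system: the synset identifiers, hypernym links, lemma table, and pictograph filenames are hypothetical toy data in English, and "nearest" is interpreted here as the fewest hypernym steps.

```python
from collections import deque

# Hypothetical toy lexical-semantic database (not Cornetto data).
HYPERNYMS = {
    "poodle.n": ["dog.n"],
    "dog.n": ["animal.n"],
    "animal.n": [],
    "eat.v": [],
}
LEMMAS = {"poodles": "poodle", "eats": "eat"}
LEMMA_TO_SYNSET = {"poodle": "poodle.n", "dog": "dog.n", "eat": "eat.v"}

# Hypothetical links from synsets to pictograph files (not Sclera or Beta).
PICTOGRAPHS = {"dog.n": "dog.png", "eat.v": "eat.png"}
FUNCTION_WORDS = {"the", "a", "an"}


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each token."""
    return [t.strip(".,!?").lower() for t in text.split() if t.strip(".,!?")]


def shallow_analysis(text):
    """Drop function words and map the remaining tokens to lemmas."""
    return [LEMMAS.get(t, t) for t in tokenize(text) if t not in FUNCTION_WORDS]


def nearest_pictograph(synset, max_distance=2):
    """Breadth-first search up the hypernym links for a linked pictograph."""
    queue = deque([(synset, 0)])
    seen = {synset}
    while queue:
        current, distance = queue.popleft()
        if current in PICTOGRAPHS:
            return PICTOGRAPHS[current]
        if distance < max_distance:
            for parent in HYPERNYMS.get(current, []):
                if parent not in seen:
                    seen.add(parent)
                    queue.append((parent, distance + 1))
    return None


def translate(text):
    """Translate text into a list of pictograph filenames."""
    result = []
    for lemma in shallow_analysis(text):
        synset = LEMMA_TO_SYNSET.get(lemma)
        picto = nearest_pictograph(synset) if synset else None
        if picto:
            result.append(picto)
    return result


def baseline(text):
    """String matching between input tokens and pictograph filenames."""
    filenames = set(PICTOGRAPHS.values())
    return [f"{t}.png" for t in tokenize(text) if f"{t}.png" in filenames]


print(translate("The poodle eats."))
print(baseline("The poodle eats."))
```

In this toy setup the baseline finds nothing for "The poodle eats." because no file is named after the inflected or more specific word, while the synset route falls back to the pictograph linked to the hypernym.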


The People's Web Meets NLP: Collaboratively Constructed Language Resources | 2013

Community efforts around the ISOcat Data Category Registry

Sue Ellen Wright; Menzo Windhouwer; Ineke Schuurman; Marc Kemps-Snijders

The ISOcat Data Category Registry provides a community computing environment for creating, storing, retrieving, harmonizing and standardizing data category specifications (DCs), which register linguistic terms used in various fields. This chapter recounts the history of DC documentation in TC 37, beginning from paper-based lists created for lexicographers and terminologists and progressing to the development of a web-based resource for a much broader range of users. While describing the considerable strides that have been made in assembling a very large, comprehensive collection of DCs, it also outlines difficulties that have arisen in developing a fully operative web-based computing environment for achieving consensus on data category names, definitions, and selections. Finally, it describes efforts to overcome some of the present shortcomings and to establish positive working procedures designed to engage a wide range of people involved in the creation of language resources.


IEEE International Conference on Semantic Computing | 2011

Spatiotemporal Annotation: Interaction between Standards and other Formats

Ineke Schuurman; Vincent Vandeghinste

Standards and the need for standards, for example for annotation purposes, only emerge after a period of time. Before that, people just did what they thought was right. This may have resulted in large amounts of data in a format that in the end did not turn out to be on speaking terms with the (new) standard. This format may even have become a de facto standard for a particular language or in a particular domain. In this paper we discuss an approach for situations in which ISOcat is used to mediate between such formats. Another task for ISOcat is to indicate the possible re-use of the output of semantic annotation X using format Y for a new annotation Z. These possibilities are to a large extent determined by the compatibility of the (definitions of the) data categories used in both. The spatiotemporal annotation schema STEx, as used in the SoNaR corpus, is central to this paper. Its input consists of other (semantic) annotations. In the TTNWW project, STEx is related to relevant standards, like ISO-TimeML, and state-of-the-art formats, like SpatialML. We describe which conditions should be met and how ISOcat can offer a helping hand.


Joint Conference on Lexical and Computational Semantics | 2016

Improving Text-to-Pictograph Translation Through Word Sense Disambiguation

Leen Sevens; Gilles Jacobs; Vincent Vandeghinste; Ineke Schuurman; Frank Van Eynde

We describe the implementation of a Word Sense Disambiguation (WSD) tool in a Dutch Text-to-Pictograph translation system, which converts textual messages into sequences of pictographic images. The system is used in an online platform for Augmentative and Alternative Communication (AAC). In the original translation process, the appropriate sense of a word was not disambiguated before converting it into a pictograph. This often resulted in incorrect translations. The implementation of a WSD tool provides a better semantic understanding of the input messages.


Conference of the International Speech Communication Association | 2015

Extending a Dutch Text-to-Pictograph Converter to English and Spanish

Leen Sevens; Vincent Vandeghinste; Ineke Schuurman; Frank Van Eynde

We describe how a Dutch Text-to-Pictograph translation system, designed to augment written text for people with Intellectual or Developmental Disabilities (IDD), was adapted in order to be usable for English and Spanish. The original system has a language-independent design. As far as the textual part is concerned, it is adaptable to all natural languages for which interlingual WordNet [1] links, lemmatizers and part-of-speech taggers are available. As far as the pictographic part is concerned, it can be modified for various pictographic languages. The evaluations show that our results are in line with the performance of the original Dutch system. Text-to-Pictograph translation has a wide application potential in the domain of Augmentative and Alternative Communication (AAC). The system will be released as an open source product.

Index Terms: Augmentative and Alternative Communication, Pictographic Languages, Text-to-Pictograph Translation


Data and Knowledge Engineering | 2018

Less is more: A rule-based syntactic simplification module for improved text-to-pictograph translation

Leen Sevens; Vincent Vandeghinste; Ineke Schuurman; Frank Van Eynde

In order to enable or facilitate online communication for people with an intellectual disability, the Text-to-Pictograph translation system automatically translates Dutch written text into a series of Sclera or Beta pictographs. The baseline system presents the reader with a more or less verbatim pictograph-per-word translation. As a result, long and complex input sentences lead to long and complex pictograph translations, leaving the end users confused and distracted. To overcome these problems, we developed a rule-based simplification system for Dutch Text-to-Pictograph translation. By using recursion and applying the simplification operations in a logical way, only one syntactic parse is needed per message. Promising results are obtained.

Collaboration


Dive into Ineke Schuurman's collaboration.

Top Co-Authors

Vincent Vandeghinste, Katholieke Universiteit Leuven
Frank Van Eynde, Katholieke Universiteit Leuven
Leen Sevens, Katholieke Universiteit Leuven
Peter Dirix, Katholieke Universiteit Leuven
Gosse Bouma, University of Groningen
Liesbeth Augustinus, Katholieke Universiteit Leuven