Publication


Featured research published by Sonja E. Bosch.


Machine Translation | 2003

Finite-State Computational Morphology: An Analyzer Prototype For Zulu

Laurette Pretorius; Sonja E. Bosch

As one of the largest of the 11 official languages of South Africa, Zulu is spoken by approximately 9 million people. It forms part of a language family which is characterized by rich agglutinating morphological structures. This paper discusses a prototype of a computational morphological analyzer for Zulu, built by means of the Xerox finite-state tools, in particular lexc and xfst. In addition to considering both the morphotactics and the morphophonological alternation rules that apply, the focus is on implementation and other issues that need to be resolved in order to produce a useful software artefact for automated morphological analysis. The current status of the prototype is indicated by its morphological scope, that is, the various word categories (parts of speech) that may be handled, and by its lexical coverage in terms of the number of different Zulu roots that are included in the embedded lexicon of the analyzer. Preliminary testing and validation procedures are briefly discussed.


Southern African Linguistics and Applied Language Studies | 2008

Containing overgeneration in Zulu computational morphology

Laurette Pretorius; Sonja E. Bosch

The development of a large-coverage computational morphological analyser for Zulu requires the modelling not only of the regular phenomena often associated with word formation, but also of the idiosyncratic behaviour that may occur in Zulu morphology. This paper discusses the application of an existing rule-based, finite-state morphological analyser prototype, ZulMorph, in semi-automating the mining of available Zulu language corpora for idiosyncratic behaviour. The semi-automated procedure makes provision for bootstrapping the morphological analyser to include newly extracted information from corpora. Of particular interest is also the central role that the machine-readable lexicon plays. The procedure is applied to a Zulu development corpus of 30 000 types and the results are given and discussed.
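The bootstrapping loop described in this abstract can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the ZulMorph procedure itself: the analyser is replaced by a stand-in that recognises a word only if it is already in the lexicon, and the names `unanalysed_types`, `bootstrap_round` and `approve` are hypothetical, introduced purely for illustration.

```python
from typing import Callable, Iterable, List, Set


def unanalysed_types(types: Iterable[str],
                     analyse: Callable[[str], List[str]]) -> List[str]:
    """Return, sorted and de-duplicated, the corpus types with no analysis."""
    return sorted({t for t in types if not analyse(t)})


def bootstrap_round(types: Iterable[str],
                    lexicon: Set[str],
                    approve: Callable[[str], bool]) -> Set[str]:
    """One semi-automated round: collect failures, keep reviewer-approved ones."""

    def analyse(word: str) -> List[str]:
        # Stand-in analyser (assumption): a word is 'analysed' only if it
        # is already listed in the lexicon. A real analyser would return
        # morphological analyses produced by a finite-state network.
        return [word] if word in lexicon else []

    candidates = unanalysed_types(types, analyse)
    # The 'approve' callback models the human review step that makes the
    # procedure semi-automated rather than fully automatic.
    return lexicon | {c for c in candidates if approve(c)}
```

Repeating `bootstrap_round` with the returned lexicon models the incremental growth of coverage; in this sketch the reviewer callback is the only safeguard against admitting forms that should not be accepted.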


South African Journal of African Languages | 2005

The effectiveness of morphological rules for an isiZulu spelling checker

Sonja E. Bosch; Roald Eiselen

This paper shows how morphological analysis contributes to solving the challenges posed by the development of a spelling checker for an agglutinative language like isiZulu. It demonstrates how the incremental implementation of affix removal rules can be used to derive word forms and enhance the lexical and error recall of the system. In the case of the spelling checker the strategies used are mainly based on the use of regular expressions, and more specifically on a process of stemming.
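To make the idea of regular-expression affix removal concrete, here is a minimal sketch of how a prefix-stripping rule could raise the recall of a word-list spelling checker. It is not the authors' rule set: the stem list, the three noun prefixes and the function names are toy assumptions chosen for illustration, using the familiar pairs umuntu/abantu and umfana/abafana.

```python
import re

# Toy data (assumption): two noun stems and three noun class prefixes.
STEMS = {"ntu", "fana"}
# Longer alternatives come first so that "umu" is tried before "um".
PREFIX_RULE = re.compile(r"^(?:umu|um|aba)")


def strip_prefix(word: str) -> str:
    """Remove one leading class prefix, if the rule matches."""
    return PREFIX_RULE.sub("", word, count=1)


def is_accepted(word: str, lexicon: set, stems: set = STEMS) -> bool:
    """Accept a word found in the lexicon, or whose stripped stem is known."""
    word = word.lower()
    if word in lexicon:
        return True
    stem = strip_prefix(word)
    # Require that a prefix was actually removed, so bare stems are rejected.
    return stem != word and stem in stems
```

With a lexicon containing only `umuntu`, the rule lets the checker also accept `abantu` and `abafana`, which is the recall gain the abstract refers to. The sketch would equally accept a string such as `umufana`, because it does not constrain which prefix variant may combine with which stem; a fuller rule set would have to address that.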


South African Journal of African Languages | 2002

‘Abbreviated nouns’ in African languages: a morphological, semantic and lexicographic perspective

Sonja E. Bosch; D. J. Prinsloo

This article focuses on the morphological and semantic analysis, as well as the lexicographic treatment, of a specific type of compound noun whose initial part is a so-called ‘abbreviated noun’, which in the case of Zulu is -so- or -no- and in Sepedi is ra- or ma-. It will be argued that these nouns have become grammaticalised forms which have lost their status as fully fledged lexical items, and have, through metaphorical usage and a subsequent process of desemanticisation, been reanalysed as grammatical units used productively to coin new words. A semantic continuum will be postulated representing the semantic range, or even shift, of these abbreviated nouns from the original meanings ‘father/mother of’ as one extreme, through ‘owner of’ or ‘having special skills/characteristics’, to ‘occupation’ as the other extreme. Finally, various suggestions will be made for the alternative lexicographic treatment of transparent versus non-transparent forms, according to a transparency continuum.


Finite-State Methods and Natural Language Processing | 2009

Finite state morphology of the Nguni language cluster: modelling and implementation issues

Laurette Pretorius; Sonja E. Bosch

The paper provides an overview of a project on computational morphological analysers for the Nguni cluster of languages, namely Zulu, Xhosa, Swati and Ndebele. These languages are agglutinative and lesser-resourced. The project adopted a finite-state approach, which is well suited to modelling both regular morphophonological phenomena and linguistic idiosyncrasies. The paper includes a brief overview of the morphology of this cluster of languages, then focuses on how the various morphophonological phenomena of Zulu are modelled and implemented using the Xerox finite-state toolkit. The bootstrapping of the Zulu morphological analyser prototype, ZulMorph, to obtain analyser prototypes for Xhosa, Swati and Ndebele is outlined and experimental results are given.


Southern African Linguistics and Applied Language Studies | 2007

African languages — is the writing on the screen?

Sonja E. Bosch

The trends emerging in the natural language processing (NLP) of African languages spoken in South Africa are explored in order to determine whether research in and development of such NLP is keeping abreast of international developments. This is done by investigating the past, present and future of NLP of African languages, keeping especially the multidisciplinary nature of the field and the role of the linguist in mind. A Human Sciences Research Council (HSRC) report of 1986 expressed concern about the backlog in South Africa regarding NLP, and called for dynamic action. As computational power increased and became less expensive, more interest began to be shown in NLP in South Africa. Pockets of expertise that have developed at various institutions over the past 20 years are discussed, and the importance of cooperation in the field, across disciplines, is illustrated in this paper. In order to facilitate coordinated action and prevent the duplication of language resources and the development of basic enabling technologies, the implementation of the concept of the Basic Language Resource Kit (BLARK) is recommended, while a new project, which aims to create a platform for WordNet development for African languages, is cited as a prime example of international collaboration.


South African Journal of African Languages | 1997

Possible origins of the possessive particle -ka- in Zulu

Sonja E. Bosch

The aim of this article is to trace some of the possible origins of the possessive particle -ka- in Zulu, which occurs when the possessor noun is a noun of class 1a. Research done so far on the nature of -ka- does not provide conclusive evidence on the exact origins of this particle. Nevertheless, several postulations regarding the origins of -ka- have been made, such as the possibility that -ka- is a contracted form of the locative class possessive concord kwa-, or that -ka- originates from a class prefix. These hypotheses are critically evaluated, and arguments are put forward for yet another hypothesis, namely that -ka- possibly developed from a lexical item by means of a process of semantic broadening. It is further demonstrated that the process of grammaticalization also plays a role in the postulated derivation of this particle.


South African Journal of African Languages | 1988

Aspects of subject conjunction in Zulu

Sonja E. Bosch

Bantu languages have a number of grammatical genders which are paired into singular/plural classes. The purpose of this article is to find solutions for some of the agreement problems arising from the conjunction of two or more nouns. The research was carried out against the background of the gender conflict resolution rules proposed by various Bantu linguists. It was found that concord selection is influenced by the semantic content of the conjoined nouns as well as by discourse notions such as presupposition. Accordingly, discourse context plays an indispensable role in the interpretation of linguistic data in this study.


South African Journal of African Languages | 2005

Development of reusable resources for Human Language Technologies (HLT) applications: practice and experience

Jackie Jones; Sonja E. Bosch; Laurette Pretorius; D. J. Prinsloo

Language resources, by their very nature, serve as a repository of linguistic knowledge. They are therefore essential in the building and improvement of natural language applications. The aim of this paper is to elaborate on the practice and the experience gained in the development, maintenance and management of such resources with specific reference to African languages. The focus is on the methods of collection and the formats concerning word lists, morphological analysis and lemma lists. The resources discussed are those developed in collaborative research with North-West University's Spelling Checker Project. As a broader perspective, the reusability of such resources is highlighted. Recommendations are also made regarding the way forward nationally in developing a resource centre to facilitate the technological development of South African Bantu languages.


South African Journal of African Languages | 2002

The significance of computational morphological analysis for Zulu lexicography

Sonja E. Bosch; Laurette Pretorius

With regard to the significance of computational morphological analysis for Zulu, and more specifically for Zulu lexicography, the aims of this article are the following: Firstly, we discuss computational morphological analysis as an enabling technology in the field of natural language processing (NLP) with specific reference to lexicography. Secondly, we explain how a computational morphological analyser for Zulu is built by using the Xerox finite-state tools. Thirdly, we demonstrate the application of such an analyser by means of examples from Zulu. Finally, we argue that in Zulu lexicography, a computational morphological analyser is an indispensable tool for facilitating some of the basic outputs of electronic corpora.

Collaboration


Top Co-Authors

Jackie Jones

University of South Africa

Gertrud Faaß

University of South Africa

Axel Fleisch

University of South Africa

George Poulos

University of South Africa