Publication


Featured research published by Alta de Waal.


Speech Communication | 2014

A smartphone-based ASR data collection tool for under-resourced languages

Nic J. de Vries; Marelie H. Davel; Jaco Badenhorst; Willem D. Basson; Febe de Wet; Etienne Barnard; Alta de Waal

Acoustic data collection for automatic speech recognition (ASR) purposes is a particularly challenging task when working with under-resourced languages, many of which are found in the developing world. We provide a brief overview of related data collection strategies, highlighting some of the salient issues pertaining to collecting ASR data for under-resourced languages. We then describe the development of a smartphone-based data collection tool, Woefzela, which is designed to function in a developing world context. Specifically, this tool is designed to function without any Internet connectivity, while remaining portable and allowing for the collection of multiple sessions in parallel; it also simplifies the data collection process by providing process support to various role players during the data collection process, and performs on-device quality control in order to maximise the use of recording opportunities. The use of the tool is demonstrated as part of a South African data collection project, during which almost 800 hours of ASR data was collected, often in remote, rural areas, and subsequently used to successfully build acoustic models for eleven languages. The on-device quality control mechanism (referred to as QC-on-the-go) is an interesting aspect of the Woefzela tool and we discuss this functionality in more detail. We experiment with different uses of quality control information, and evaluate the impact of these on ASR accuracy. Woefzela was developed for the Android Operating System and is freely available for use on Android smartphones.


International Conference on Digital Forensics | 2007

Specializing CRISP-DM for evidence mining

Jacobus D. Venter; Alta de Waal; Cornelius J. Willers

Forensic analysis requires a keen detective mind, but the human mind has neither the ability nor the time to process the millions of bytes on a typical computer hard disk. Digital forensic investigators need powerful tools that can automate many of the analysis tasks that are currently being performed manually.


South African Institute of Computer Scientists and Information Technologists | 2006

Named entity recognition in a South African context

Anita Louis; Alta de Waal; Cobus Venter

The feasibility of a probabilistic Named Entity Recognition system in a South African context was tested. The intended use of the system is in a cyber forensic domain. At the core of the system is a dynamic Bayesian Network, which takes into account the probabilistic relationship between variables as well as contextual information. We illustrate the performance of such a system using different probability thresholds for classification purposes and compare the performance with and without a name gazetteer. Our system compares competently with similar existing systems in the information extraction domain. Future work will involve the application of the system in the cyber forensic environment, which poses new challenges such as diverse text types.


International Conference on Digital Forensics | 2008

Applying Topic Modeling to Forensic Data

Alta de Waal; Jacobus Venter; Etienne Barnard

Most actionable evidence is identified during the analysis phase of digital forensic investigations. Currently, the analysis phase uses expression-based searches, which assume a good understanding of the evidence; but latent evidence cannot be found using such methods. Knowledge discovery and data mining (KDD) techniques can significantly enhance the analysis process. A promising KDD technique is topic modeling, which infers the underlying semantic context of text and summarizes the text using topics described by words. This paper investigates the application of topic modeling to forensic data and its ability to contribute to the analysis phase. Also, it highlights the challenges that forensic data poses to topic modeling algorithms and reports on the lessons learned from a case study.


Technologies for Optical Countermeasures IX | 2012

Pyradi: an open-source toolkit for infrared calculation and data processing

Cornelius J. Willers; Maria S. Willers; Ricardo Tavares Santos; Petrus J. van der Merwe; Johannes J. Calitz; Alta de Waal; Azwitamisi E. Mudau

Electro-optical system design, data analysis and modeling involve a significant amount of calculation and processing. Many of these calculations are of a repetitive and general nature, suitable for including in a generic toolkit. The availability of such a toolkit facilitates and increases productivity during subsequent tool development: “develop once and use many times”. The concept of an extendible toolkit lends itself naturally to the open-source philosophy, where the toolkit user-base develops the capability cooperatively, for mutual benefit. This paper covers the underlying philosophy to the toolkit development, brief descriptions and examples of the various tools and an overview of the electro-optical toolkit. The toolkit is an extendable, integrated collection of basic functions, code modules, documentation, example templates, tests and resources, that can be applied towards diverse calculations in the electro-optics domain. The toolkit covers (1) models of physical concepts (e.g. Planck’s Law), (2) mathematical operations (e.g. spectral integrals, spatial integrals, convolution, 3-D noise calculation), (3) data manipulation (e.g. file input/output, interpolation, normalisation), and (4) graphical visualisation (2-D and 3-D graphs). Toolkits are often written in scriptable languages, such as Python and Matlab. This specific toolkit is implemented in Python and its associated modules NumPy, SciPy, Matplotlib, Mayavi, and PyQt/PySide. In recent years these tools have stabilized and matured sufficiently to support mainstream tool development. Collectively, these tools provide a very powerful capability, even beyond the confines of this toolkit alone. Furthermore, these tools are freely available. Rudimentary radiometric theory is given in the paper to support the examples given.
Examples of the toolkit use, as described in the paper, include (1) spectral radiometric calculations of arbitrary source-medium-sensor configurations, (2) spectral convolution processing, (3) 3-D noise analysis, (4) loading of ASCII text files, binary files, Modtran tape7 and FLIR Inc *.ptw files, (5) data visualization in 2-D and 3-D graphs and plots, (6) detector modeling from detail design parameters (bulk material detectors), (7) color coordinate calculations, and (8) various utility functions. The toolkit is developed as a cooperative effort between the CSIR, Denel SOC and DCTA. The project, available on Google Code at http://code.google.com/p/pyradi, is managed in accordance with general practice in the open source community.


SPIE Newsroom | 2012

The pyradi radiometry toolkit

Cornelius J. Willers; Johannes J. Calitz; Alta de Waal; Azwitamisi E. Mudau; Maria S. Willers; Pieter van der Merwe; Ricardo Tavares Santos

Modeling and designing in electro-optical systems entails the calculation of several (often interrelated) parameters. Many of these calculations are repetitive, suitable for including in a generic toolkit. A well-designed kit would facilitate work flow and increase productivity during the modeling and design process. The concept of an extendable toolkit lends itself naturally to the open-source philosophy, where users cooperatively develop new tools to add to an ever-expanding set for the mutual benefit of all. The pyradi toolkit is an extendable, integrated and coherent collection of basic functions that can be applied towards diverse calculations in the electro-optics domain [1]. The name pyradi is derived from the combination of ‘Python’ and ‘Radiometry.’ We initially considered two candidate languages for pyradi, MATLAB and Python. MATLAB has a strong following in the scientific community. In recent years Python and associated modules have been well tested, and have stabilized and matured sufficiently to support mainstream tool development. But after extensive use of both languages for radiometric calculation and modeling, we decided to continue only with Python. With this application and its features in mind, Python provides better capability as a general purpose language, and its data visualization tools, Matplotlib and Mayavi, are the most powerful available today. While the wider range of toolboxes (including Simulink) might be a compelling reason to use MATLAB, those capabilities are not required here. We have already designed a number of pyradi modules covering basic electro-optical system calculations. For instance, the ryplanck module provides functions for Planck Law emittance calculations, as well as the Planck Law temperature derivative functions. Given the temperature and spectral vector, the functions provide spectral emittance in W/(m2) or q/(s m2), with spectral variable * in wavelength, Figure 1.
Example polar plots: note axes conventions and yellow/red highlight of negative values.
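
The Planck Law emittance calculation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the pyradi ryplanck API: the function name planck_exitance, the wavelength grid and the 300 K temperature are assumptions chosen for illustration, and only the Python standard library is used. It evaluates the blackbody spectral exitance in the wavelength domain, in W/(m^2 um), then locates the spectral peak and integrates over the band.

```python
import math

# Physical constants in SI units (exact values from the 2019 SI redefinition).
H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light in vacuum, m/s
K = 1.380649e-23     # Boltzmann constant, J/K


def planck_exitance(wavelength_um, temperature_k):
    """Blackbody spectral radiant exitance in W/(m^2 um).

    Hypothetical helper for illustration only; it is not part of pyradi.
    """
    wl_m = wavelength_um * 1e-6                 # micrometres -> metres
    c1 = 2.0 * math.pi * H * C ** 2             # first radiation constant
    c2 = H * C / K                              # second radiation constant
    per_metre = c1 / (wl_m ** 5 * math.expm1(c2 / (wl_m * temperature_k)))
    return per_metre * 1e-6                     # per metre -> per micrometre


# Assumed example: a 300 K blackbody sampled from 0.1 um to 100 um.
STEP_UM = 0.005
wavelengths_um = [0.1 + STEP_UM * i for i in range(19981)]
spectrum = [planck_exitance(w, 300.0) for w in wavelengths_um]

# Wavelength of maximum exitance (compare with Wien's displacement law).
peak_um = wavelengths_um[spectrum.index(max(spectrum))]

# Trapezoidal spectral integral over the sampled band, in W/m^2
# (compare with the Stefan-Boltzmann value sigma * T^4).
total_w_m2 = sum(0.5 * (a + b) * STEP_UM
                 for a, b in zip(spectrum, spectrum[1:]))

print(f"peak wavelength: {peak_um:.3f} um")
print(f"in-band exitance: {total_w_m2:.1f} W/m^2")
```

The in-band integral falls slightly short of the Stefan-Boltzmann total because the sampled band stops at 100 um; widening the band reduces that gap.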


Ecological Modelling | 2010

Modelling cheetah relocation success in southern Africa using an Iterative Bayesian Network Development Cycle

Sandra Johnson; Kerrie Mengersen; Alta de Waal; Kelly Marnewick; Deon Cilliers; Ann Marie Houser; Lorraine K. Boast


Conference of the International Speech Communication Association | 2011

Woefzela - an open-source platform for ASR data collection in the developing world

Nic J. de Vries; Jaco Badenhorst; Marelie H. Davel; Etienne Barnard; Alta de Waal


SLTU | 2012

Quality measurements for mobile data collection in the developing world

Jaco Badenhorst; Alta de Waal; Febe de Wet


Information Technologies and International Development | 2010

Morphological analysis: a method for selecting ICT applications in South African government service delivery

Madelaine Plauché; Alta de Waal; Aditi Sharma Grover; Tebogo Gumede

Collaboration


Dive into Alta de Waal's collaborations.

Top Co-Authors

Jaco Badenhorst

Council for Scientific and Industrial Research

Charl Johannes van Heerden

Council for Scientific and Industrial Research

Cornelius J. Willers

Council for Scientific and Industrial Research

Azwitamisi E. Mudau

Council for Scientific and Industrial Research

Febe de Wet

Council for Scientific and Industrial Research

Johannes J. Calitz

Council for Scientific and Industrial Research

Aditi Sharma Grover

Council for Scientific and Industrial Research

Nic J. de Vries

Council for Scientific and Industrial Research
