Robert I. Damper | Researchain

Publication

Featured research published by Robert I. Damper.

Computational Linguistics | 2000

A multistrategy approach to improving pronunciation by analogy

Yannick Marchand; Robert I. Damper

Pronunciation by analogy (PbA) is a data-driven method for relating letters to sound, with potential application to next-generation text-to-speech systems. This paper extends previous work on PbA in several directions. First, we have included full pattern matching between input letter string and dictionary entries, as well as including lexical stress in letter-to-phoneme conversion. Second, we have extended the method to phoneme-to-letter conversion. Third, and most important, we have experimented with multiple, different strategies for scoring the candidate pronunciations. Individual scores for each strategy are obtained on the basis of rank and either multiplied or summed to produce a final, overall score. Five strategies have been studied and results obtained from all 31 possible combinations. The two combination methods perform comparably, with the product rule only very marginally superior to the sum rule. Nonparametric statistical analysis reveals that performance improves as more strategies are included in the combination: this trend is very highly significant (p < 0.0005). Accordingly, for letter-to-phoneme conversion, best results are obtained when all five strategies are combined: word accuracy is raised to 65.5% relative to 61.7% for our best previous result and 63.0% for the best-performing single strategy. These improvements are very highly significant (p ≈ 0 and p < 0.00011 respectively). Similar results were found for phoneme-to-letter and letter-to-stress conversion, although the former was an easier problem for PbA than letter-to-phoneme conversion and the latter was harder. The main sources of error for the multistrategy approach are very similar to those for the best single strategy, and mostly involve vowel letters and phonemes.
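
The rank-based combination described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the strategy names, the points scheme (best candidate gets N points, worst gets 1) and the demo scores are all assumptions made for illustration. The sketch also checks that five strategies give 2^5 - 1 = 31 non-empty combinations.

```python
from itertools import combinations
from math import prod

# Hypothetical strategy names; the abstract does not list them.
STRATEGIES = ["s1", "s2", "s3", "s4", "s5"]


def nonempty_combinations(items):
    """Return every non-empty subset of items as a tuple."""
    return [c for r in range(1, len(items) + 1) for c in combinations(items, r)]


def rank_points(raw_scores):
    """Turn raw scores into rank points: best candidate gets N, worst gets 1.

    Assumes a higher raw score is better; ties are broken by insertion order.
    """
    ordered = sorted(raw_scores, key=raw_scores.get, reverse=True)
    n = len(ordered)
    return {cand: n - i for i, cand in enumerate(ordered)}


def fuse(per_strategy_scores, rule="product"):
    """Combine per-strategy rank points with the sum or product rule."""
    combine = prod if rule == "product" else sum
    points = [rank_points(scores) for scores in per_strategy_scores.values()]
    candidates = list(points[0])
    totals = {c: combine(p[c] for p in points) for c in candidates}
    return max(totals, key=totals.get), totals


# Made-up raw scores for three candidate pronunciations under three strategies.
DEMO_SCORES = {
    "s1": {"A": 0.9, "B": 0.5, "C": 0.1},
    "s2": {"A": 0.6, "B": 0.8, "C": 0.2},
    "s3": {"A": 0.7, "B": 0.3, "C": 0.4},
}

if __name__ == "__main__":
    print(len(nonempty_combinations(STRATEGIES)))
    print(fuse(DEMO_SCORES, rule="sum"))
    print(fuse(DEMO_SCORES, rule="product"))
```

In this toy case both rules pick the same candidate, which is in keeping with the abstract's remark that the two combination methods perform comparably.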

IEEE Transactions on Neural Networks | 1993

Determining and improving the fault tolerance of multilayer perceptrons in a pattern-recognition application

Martin D. Emmerson; Robert I. Damper

We investigate empirically the performance under damage conditions of single- and multilayer perceptrons (MLPs), with various numbers of hidden units, in a representative pattern-recognition task. While some degree of graceful degradation was observed, the single-layer perceptron was considerably less fault tolerant than any of the multilayer perceptrons, including one with fewer adjustable weights. Our initial hypothesis that fault tolerance would be significantly improved for multilayer nets with larger numbers of hidden units proved incorrect. Indeed, there appeared to be a liability to having excess hidden units. A simple technique (called augmentation) is described, which was successful in translating excess hidden units into improved fault tolerance. Finally, our results were supported by applying singular value decomposition (SVD) analysis to the MLPs' internal representations.

Computer Speech & Language | 1999

Evaluating the pronunciation component of text-to-speech systems for English: a performance comparison of different approaches

Robert I. Damper; Yannick Marchand; Martin J. Adamson; Kjell Gustafson

The automatic derivation of word pronunciations from input text is a central task for any text-to-speech system. For general English text at least, this is often thought to be a solved problem, with manually-derived linguistic rules assumed capable of handling “novel” words missing from the system dictionary. Data-driven methods, based on machine learning of the regularities implicit in a large pronouncing dictionary, have received considerable attention recently but are generally thought to perform less well. However, these tentative beliefs are at best uncertain without powerful methods for comparing text-to-phoneme subsystems. This paper contributes to the development of such methods by comparing the performance of four representative approaches to automatic phonemization on the same test dictionary. As well as rule-based approaches, three data-driven techniques are evaluated: pronunciation by analogy (PbA), NETspeak and IB1-IG (a modified k-nearest neighbour method). Issues involved in comparative evaluation are detailed and elucidated. The data-driven techniques outperform rules in accuracy of letter-to-phoneme translation by a very significant margin but require aligned text-phoneme training data and are slower. Best translation results are obtained with PbA at approximately 72% words correct on a reasonably large pronouncing dictionary, compared with something like 26% words correct for the rules, indicating that automatic pronunciation of text is not a solved problem. © 1999 Academic Press

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2003

Handwritten Chinese radical recognition using nonlinear active shape models

Daming Shi; Steve R. Gunn; Robert I. Damper

Handwritten Chinese characters can be recognized by first extracting the basic shapes (radicals) of which they are composed. Radicals are described by nonlinear active shape models and optimal parameters found using the chamfer distance transform and a dynamic tunneling algorithm. The radical recognition rate is 96.5 percent correct (writer-independent) on 280,000 characters containing 98 radical classes.

Language and Speech | 1997

Pronunciation by analogy: Impact of implementational choices on performance

Robert I. Damper; J. F. G. Eastmond

Pronunciation by analogy (PbA) is an emerging, data-driven technique with potential application in text-to-speech (TTS) systems, as well as being an influential psychological model of reading aloud. The underlying idea is that a pronunciation for an unknown word (i.e., one not in the dictionary, or lexicon, of the human or machine “reader”) is assembled by matching substrings of the input to substrings of known, lexical words, hypothesizing a partial pronunciation for each matched substring from the lexical knowledge of the “reader,” and concatenating the partial pronunciations. This paper assesses the capability of PbA to derive pronunciations for unknown words of English. As a psychological model, PbA is “under-specified,” that is, the implementor of a simulation of the process faces detailed choices which can only be resolved by trial and error. One goal for this paper is to explore the impact of certain basic implementational choices on the performance of PbA systems. The variables studied are the specific lexical database used as the basis of the analogy process, the way of ranking/scoring candidate pronunciations, and the effect of manual versus automatic alignment of letters and phonemes. When tested with short (monosyllabic) pseudowords previously used in experimental psychology studies, the lowest error rate achieved is 14.3% (for a test set of size 70). We conclude that current PbA systems are at best poor models of pseudoword pronunciation by humans. To assess their suitability for use in a TTS application, in which multisyllabic words will be encountered, the implementations have also been tested with lexical words temporarily removed from the dictionary. The best performance obtained was 93.5% phonemes correct (corresponding to 67.9% words correct) for a 16,280-word dictionary. This is vastly superior to the 25.7% words correct obtained using a set of popular letter-to-sound rules, indicating considerable scope for analogy methods to be exploited in future TTS systems.

Robotics and Autonomous Systems | 2000

ARBIB: an Autonomous Robot Based on Inspirations from Biology

Robert I. Damper; R. L. B. French; T.W. Scutt

Simple artificial creatures (‘animats’), which operate as autonomous, adaptive robots in the real world, can serve both as models of biology and as a radical alternative to conventional methods of designing intelligent systems. We describe the evolution and implementation of the autonomous robot ARBIB, which learns from and adapts to its environment. A primary goal was to test the notion that effective robot learning can be based on neural habituation and sensitization, so validating the suggestion of Hawkins and Kandel that (associative) classical and ‘higher-order’ conditioning might be based on an elaboration of these (non-associative) forms of learning. Accordingly, ARBIB’s ‘nervous system’ has a non-homogeneous population of spiking neurons, and learning is by modification of basic, pre-existing (‘hard-wired’) reflexes. By monitoring firing rates of specific neurons and synaptic weights between neural connections as ARBIB learns from its environment, we confirm that both classical and higher-order conditioning occur, leading to the emergence of interesting and ecologically valid behaviors.

IEEE Transactions on Audio, Speech, and Language Processing | 2013

On Acoustic Emotion Recognition: Compensating for Covariate Shift

Ali Hassan; Robert I. Damper; Mahesan Niranjan

Pattern recognition tasks often face the situation that training data are not fully representative of test data. This problem is well-recognized in speech recognition, where methods like cepstral mean normalization (CMN), vocal tract length normalization (VTLN) and maximum likelihood linear regression (MLLR) are used to compensate for channel and speaker differences. Speech emotion recognition (SER) is an important emerging field in human-computer interaction and faces the same data shift problems, a fact which has been generally overlooked in this domain. In this paper, we show that compensating for channel and speaker differences can give significant improvements in SER by modelling these differences as a covariate shift. We employ three algorithms from the domain of transfer learning that apply importance weights (IWs) within a support vector machine classifier to reduce the effects of covariate shift. We test these methods on the FAU Aibo Emotion Corpus, which was used in the Interspeech 2009 Emotion Challenge. It consists of two separate parts recorded independently at different schools; hence the two parts exhibit covariate shift. Results show that the IW methods outperform combined CMN and VTLN and significantly improve on the baseline performance of the Challenge. The best of the three methods also improves significantly on the winning contribution to the Challenge.
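
The idea of importance weighting under covariate shift, as described in this abstract, can be sketched as follows. Each training sample x receives a weight w(x) = p_test(x) / p_train(x), so that samples resembling the test data count for more during training. The sketch estimates both densities with one-dimensional Gaussians, which is a simplifying assumption for illustration and not one of the three algorithms used in the paper; the function name and the data are hypothetical.

```python
from statistics import NormalDist


def gaussian_importance_weights(train, test):
    """Weight each training sample by an estimated p_test(x) / p_train(x).

    Both densities are fitted as 1-D Gaussians (an illustrative assumption).
    """
    p_train = NormalDist.from_samples(train)
    p_test = NormalDist.from_samples(test)
    return [p_test.pdf(x) / p_train.pdf(x) for x in train]


# Made-up one-dimensional feature values; the test set is shifted upwards.
TRAIN = [0.0, 1.0, 2.0, 3.0, 4.0]
TEST = [2.0, 3.0, 4.0, 5.0, 6.0]

if __name__ == "__main__":
    for x, w in zip(TRAIN, gaussian_importance_weights(TRAIN, TEST)):
        print(f"x={x:.1f}  weight={w:.3f}")
```

Weights of this kind could then be supplied as per-sample weights to a classifier that supports them, for example through the `sample_weight` argument of scikit-learn's `SVC.fit`.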

International Journal of Systems Science | 2000

Towards the evolutionary emergence of increasingly complex advantageous behaviours

Alastair Channon; Robert I. Damper

The generation of complex entities with advantageous behaviours beyond our manual design capability requires long-term incremental evolution with continuing emergence. In this paper, we argue that artificial selection models, such as traditional genetic algorithms, are fundamentally inadequate for this goal. Existing natural selection systems are evaluated, revealing both significant achievements and pitfalls. Thus, some requirements for the perpetuation of evolutionary emergence are established. An (artificial) environment containing simple virtual autonomous organisms with neural controllers has been created to satisfy these requirements and to aid in the development of an accompanying theory of evolutionary emergence. Resulting behaviours are reported alongside their neural correlates. In a particular example, the collective behaviour of one species provides a selective force which is overcome by another species, demonstrating the incremental evolutionary emergence of advantageous behaviours via naturally arising coevolution. While the results fall short of the ultimate goal, experience with the system has provided some useful lessons for the perpetuation of emergence towards increasingly complex advantageous behaviours.

Natural Language Engineering | 2007

Can syllabification improve pronunciation by analogy of English?

Yannick Marchand; Robert I. Damper

In spite of difficulty in defining the syllable unequivocally, and controversy over its role in theories of spoken and written language processing, the syllable is a potentially useful unit in several practical tasks which arise in computational linguistics and speech technology. For instance, syllable structure might embody valuable information for building word models in automatic speech recognition, and concatenative speech synthesis might use syllables or demisyllables as basic units. In this paper, we first present an algorithm for determining syllable boundaries in the orthographic form of unknown words that works by analogical reasoning from a database or corpus of known syllabifications. We call this syllabification by analogy (SbA). It is similarly motivated to our existing pronunciation by analogy (PbA) which predicts pronunciations for unknown words (specified by their spellings) by inference from a dictionary of known word spellings and corresponding pronunciations. We show that including perfect (according to the corpus) syllable boundary information in the orthographic input can dramatically improve the performance of pronunciation by analogy of English words, but such information would not be available to a practical system. So we next investigate combining automatically-inferred syllabification and pronunciation in two different ways: the series model in which syllabification is followed sequentially by pronunciation generation; and the parallel model in which syllabification and pronunciation are simultaneously inferred. Unfortunately, neither improves performance over PbA without syllabification. Possible reasons for this failure are explored via an analysis of syllabification and pronunciation errors.

Robotics and Autonomous Systems | 2004

Adaptive neurofuzzy control of a robotic gripper with on-line machine learning

Jorge Axel Domínguez-López; Robert I. Damper; Richard M. Crowder; Chris J. Harris

Pre-programming complex robotic systems to operate in unstructured environments is extremely difficult because of the programmer’s inability to predict future operating conditions in the face of unforeseen environmental conditions, mechanical wear of parts, etc. The solution to this problem is for the robot controller to learn on-line about its own capabilities and limitations when interacting with its environment. At the present state of technology, this poses a challenge to existing machine learning methods. We study this problem using a simple two-fingered gripper which learns to grasp an object with appropriate force, without slip, while minimising chances of damage to the object. Three machine learning methods are used to produce a neurofuzzy controller for the gripper. These are off-line supervised neurofuzzy learning and two on-line methods, namely unsupervised reinforcement learning and an unsupervised/supervised hybrid. With the two on-line methods, we demonstrate that the controller can learn through interaction with its environment to overcome simulated failure of its sensors. Further, the hybrid is shown to outperform reinforcement learning alone in terms of faster adaptation to the changing circumstances of sensor failure. The hybrid learning scheme allows us to make best use of such pre-labeled datasets as might exist and to remember effectively good control actions discovered by reinforcement learning.
