Publication


Featured research published by L.F.M. ten Bosch.


IEEE International Conference on Cognitive Informatics | 2007

ACORNS - towards computational modeling of communication and recognition skills

L.W.J. Boves; L.F.M. ten Bosch; Roger K. Moore

In this paper, the FP6 Future and Emerging Technologies project ACORNS is introduced. This project aims at simulating embodied language learning, inspired by the memory-prediction theory of intelligence. ACORNS intends to build a full computational implementation of sensory information processing. ACORNS considers linguistic units as emergent patterns. Thus, the research will not only address the issues conventionally investigated in statistical pattern recognition, but also the representations that are formed in memory. The paper discusses details of the memory and processing architecture that will be implemented in ACORNS, and explains how this architecture merges the basic concepts of the memory-prediction theory with results from previous research in the field of memory.


Journal of the Acoustical Society of America | 2003

Bridging automatic speech recognition and psycholinguistics: Extending Shortlist to an end-to-end model of human speech recognition (L)

Odette Scharenborg; L.F.M. ten Bosch; L.W.J. Boves; Dennis Norris

(Received 10 December 2002; accepted for publication 25 August 2003) This letter evaluates potential benefits of combining human speech recognition (HSR) and automatic speech recognition by building a joint model of an automatic phone recognizer (APR) and a computational model of HSR, viz., Shortlist [Norris, Cognition 52, 189–234 (1994)]. Experiments based on “real-life” speech highlight critical limitations posed by some of the simplifying assumptions made in models of human speech recognition. These limitations could be overcome by avoiding hard phone decisions at the output side of the APR, and by using a match between the input and the internal lexicon that flexibly copes with deviations from canonical phonemic representations.


Corpus Linguistics and Linguistic Theory | 2013

Choosing alternatives: Using Bayesian Networks and memory based learning to study the dative alternation

D.L. Theijssen; L.F.M. ten Bosch; Bert Cranen; H. van Halteren; Lou Boves

In existing research on syntactic alternations such as the dative alternation (give her the apple vs. give the apple to her), the linguistic data is often analysed with the help of logistic regression models. In this article, we evaluate the use of logistic regression for this type of research, and present two different approaches: Bayesian Networks and Memory-based learning. For the Bayesian Network, we use the higher-level semantic features suggested in the literature, while we limit ourselves to lexical items in the memory-based approach. We evaluate the suitability of the three approaches by applying them to a large data set (>11,000 instances) extracted from the British National Corpus, and comparing their quality in terms of classification accuracy, their interpretability in the context of linguistic research, and their actual classification of individual cases. Our main finding is that the classifications are very similar across the three approaches, also when employing lexical items instead of the higher-level features, because most of the alternation is determined by the verb and the length of the two objects (here: her and the apple).


International Conference on Spoken Language Processing | 1996

Integration of context-dependent durational knowledge into HMM-based speech recognition

X. Wang; L.F.M. ten Bosch; L.C.W. Pols

The paper presents research on integrating context-dependent durational knowledge into HMM-based speech recognition. The first part of the paper presents work on obtaining relations between the parameters of the context-free HMMs and their durational behaviour, in preparation for the context-dependent durational modelling presented in the second part. Duration integration is realised via rescoring in the post-processing step of the N-best monophone recogniser. The authors use the multi-speaker TIMIT database for the analyses.
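The rescoring step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each N-best hypothesis carries an acoustic log-score and a phone segmentation, and it uses a simple Gaussian duration model per phone. The Gaussian form, the function names, and the toy statistics are illustrative assumptions, not the context-dependent model from the paper.

```python
import math

def gaussian_log_pdf(x, mean, std):
    """Log density of a normal distribution (a toy duration model, assumed here)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def rescore_nbest(hypotheses, duration_models, weight=1.0):
    """Re-rank N-best hypotheses by adding a weighted duration log-score.

    hypotheses: list of (acoustic_log_score, [(phone, duration_in_frames), ...])
    duration_models: phone -> (mean, std); hypothetical statistics
    Returns the hypotheses sorted from best to worst combined score.
    """
    rescored = []
    for acoustic, segments in hypotheses:
        duration = sum(gaussian_log_pdf(d, *duration_models[p]) for p, d in segments)
        rescored.append((acoustic + weight * duration, segments))
    return sorted(rescored, key=lambda item: item[0], reverse=True)

# Toy example: the first hypothesis scores better acoustically,
# but its segment durations are implausible under the duration model.
models = {"a": (10.0, 2.0), "t": (5.0, 1.0)}
nbest = [
    (-100.0, [("a", 3), ("t", 12)]),
    (-101.0, [("a", 10), ("t", 5)]),
]
best_score, best_segments = rescore_nbest(nbest, models)[0]
```

With `weight=0.0` the ranking falls back to the acoustic scores alone, so the weight controls how strongly the durational knowledge influences the final choice.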


International Conference on Spoken Language Processing | 1996

Analysis of context-dependent segmental duration for automatic speech recognition

X. Wang; L.C.W. Pols; L.F.M. ten Bosch

The paper presents statistical analyses of context-dependent phone durations using the hand-segmented TIMIT database, for the purpose of improving automatic speech recognition. Two main approaches were used. (1) Duration distributions were found under the influence of individual contextual factors, such as broader classes specified by long or short vowels, word stress, syllable position within the word and within an utterance, postvocalic consonants, and utterance speaking rate. (2) A hierarchically structured analysis of variance was used to study the numerical contributions of 11 different contextual factors to the variation in duration. Several systematic effects were found, whereas several others were obscured by the inherent variability in this speech material. We suggest implementation of this knowledge in the post-processing phase of a recogniser.


Journal of the Acoustical Society of America | 1993

On the automatic classification of pitch movements

L.F.M. ten Bosch

In this paper, we discuss the construction of an algorithm that classifies pitch movements according to the IPO intonation system. We use a pitch stylization technique in order to obtain a continuous pitch contour over time, and a multi-linear classifier for the actual classification. In speaker-independent tests on a corpus of speech read by non-professionals, up to 81% of the 279 pitch movements in the test corpus were correctly classified. These results are obtained by using information from the sampled speech data files only; a grammar will be used in the second stage of this study.


International Conference on Acoustics, Speech, and Signal Processing | 1995

Automatic classification of pitch movements via MLP-based estimation of class probabilities

L.F.M. ten Bosch

In this paper, we study to what extent pitch movements in utterances can be classified automatically by using acoustical information and an intonation grammar. It will be shown that pitch movements can be classified into six categories with an agreement of about 80 percent compared with human transcriptions, on the basis of the pitch contour and the moments of vowel onsets. These six categories cover about 90 percent of all pitch movements used in the database (elicited speech). Results involving an intonation grammar are also presented.


Journal of the Acoustical Society of America | 1993

Duration modeling with hidden Markov models

L.F.M. ten Bosch; X. Wang; L.C.W. Pols

In hidden Markov modeling (HMM) of speech signals, the statistics of speech characteristics are represented by HMM parameters after the HMM training. This procedure is purely statistical. This study concerns the incorporation of explicit knowledge into the HMM training. Therefore one specific parameter, i.e., segment duration, was selected. In order to study the relation between duration and HMM modeling, three types of duration PDFs (DPDFs) are distinguished: (A) the DPDF defined by the segmented database used (the actual duration histogram); (B) the DPDF defined by the trained Markov model (i.e., by the transition matrix), and (C) the DPDF based on the HMM segmentation. While PDF (A) is based on data and PDF (B) is based on the trained model, PDF (C) combines both features and is based on the available set of observation sequences. First, an explicit relation is formulated between topology of the PLU, the three DPDFs, and the so‐called Padé expansion. By using the generating function of the PDPT, it is ...
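As an illustration of DPDF (B), the duration distribution implied by a transition matrix can be computed by propagating state occupancy frame by frame. The sketch below assumes a model that is always entered in its first emitting state and has an explicit exit probability per state. The function name and the example topology are assumptions for illustration and are not taken from the paper.

```python
def duration_pmf(transitions, exit_probs, max_duration):
    """Return [P(D=1), ..., P(D=max_duration)] for a model entered in state 0.

    transitions[i][j]: probability of moving from emitting state i to state j
    exit_probs[i]: probability of leaving the model from state i
    Each row of `transitions` plus its exit probability should sum to 1.
    """
    n = len(transitions)
    occupancy = [1.0] + [0.0] * (n - 1)  # state distribution after one frame
    pmf = []
    for _ in range(max_duration):
        # Probability of exiting right after the current frame.
        pmf.append(sum(occupancy[i] * exit_probs[i] for i in range(n)))
        # Advance one frame, staying inside the model.
        occupancy = [
            sum(occupancy[i] * transitions[i][j] for i in range(n))
            for j in range(n)
        ]
    return pmf

# Single state with self-loop 0.8: geometric durations, P(d) = 0.8**(d-1) * 0.2.
single_state = duration_pmf([[0.8]], [0.2], max_duration=5)

# Three-state left-to-right model without skips: minimum duration is 3 frames.
three_state = duration_pmf(
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 0.5]],
    [0.0, 0.0, 0.5],
    max_duration=10,
)
```

The single-state case reproduces the familiar geometric duration distribution of a self-loop, which is one reason multi-state topologies are used when a more realistic duration profile is needed.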


Conference of the International Speech Communication Association | 2016

Analytical assessment of dual-stream merging for noise-robust ASR

L.F.M. ten Bosch; Bert Cranen; Yang Sun

In previous studies (on Aurora2), it was found that merging a posteriori probability streams from different classifiers (GMM, MLP, Sparse Coding) can improve the noise robustness of ASR. Maximizing word accuracy required the stream weights to be systematically dependent on the specific input streams and SNR. The tuning of the weights, however, was largely a matter of trial and error and typically involved a laborious grid search. In this paper, we propose two fundamental, analytical methods to better understand these empirical findings. To that end, we maximize the trustworthiness of merged streams as function of the stream weights. Trustworthiness is defined as the probability that the winning state in a probability vector correctly predicts a golden reference state obtained by a forced alignment. Even though our approach is not directly equivalent to optimizing word accuracy, both methods appear highly useful to obtain insight in stream properties that determine the success of a given merge (or the lack thereof). Furthermore, both methods clearly support the trends that exist in the grid-search based empirical observations.
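The notion of trustworthiness defined in this abstract can be estimated empirically, as in the sketch below. It assumes a log-linear (weighted geometric) merge of two posterior streams with a single weight, which is an assumption made for illustration; the abstract does not specify the merging rule, and the paper's analytical methods are not reproduced here. All names and toy values are hypothetical.

```python
def merge_streams(stream_a, stream_b, weight):
    """Log-linear merge of two posterior vectors, renormalised to sum to 1.

    weight = 1.0 uses only stream_a, weight = 0.0 uses only stream_b.
    """
    merged = [pa ** weight * pb ** (1.0 - weight) for pa, pb in zip(stream_a, stream_b)]
    total = sum(merged)
    return [m / total for m in merged]

def trustworthiness(frames_a, frames_b, reference_states, weight):
    """Fraction of frames whose winning merged state equals the reference state.

    reference_states would come from a forced alignment; here they are toy labels.
    """
    hits = 0
    for pa, pb, ref in zip(frames_a, frames_b, reference_states):
        merged = merge_streams(pa, pb, weight)
        winner = max(range(len(merged)), key=merged.__getitem__)
        hits += winner == ref
    return hits / len(reference_states)

# Toy data: two frames, two states, and a coarse sweep over the stream weight.
frames_a = [[0.7, 0.3], [0.4, 0.6]]
frames_b = [[0.2, 0.8], [0.1, 0.9]]
reference = [0, 1]
sweep = {w / 4: trustworthiness(frames_a, frames_b, reference, w / 4) for w in range(5)}
```

A sweep like this is the kind of grid search the abstract describes as laborious; it is shown only to make the definition of trustworthiness concrete.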


Linguistics | 2007

Early Decision Making in Continuous Speech

Odette Scharenborg; L.F.M. ten Bosch; L.W.J. Boves

In everyday life, speech is all around us, on the radio, television, and in human-human interaction. Communication using speech is easy. Of course, in order to communicate via speech, speech recognition is essential. Most theories of human speech recognition (HSR; Gaskell and Marslen-Wilson, 1997; Luce et al., 2000; McClelland and Elman, 1986; Norris, 1994) assume that human listeners first map the incoming acoustic signal onto prelexical representations (e.g., in the form of phonemes or features) and that these resulting discrete symbolic representations are then matched against corresponding symbolic representations of the words in an internal lexicon. Psycholinguistic experiments have shown that listeners are able to recognise (long and frequent) words reliably even before the corresponding acoustic signal is complete (Marslen-Wilson, 1987). According to theories of HSR, listeners compute a word activation measure (indicating the extent to which a word is activated based on the speech signal and the context) as the speech comes in and can make a decision as soon as the activation of a word is high enough, possibly before all acoustic information of the word is available (Marslen-Wilson, 1987; Marslen-Wilson and Tyler, 1980; Radeau et al., 2000). The “reliable identification of spoken words, in utterance contexts, before sufficient acoustic-phonetic information has become available to allow correct identification on that basis alone” is referred to as early selection by Marslen-Wilson (1987).

In general terms, automatic speech recognition (ASR) systems operate in a way not unlike human speech recognition. However, there are two major differences between human and automatic speech recognition. First of all, most mainstream ASR systems avoid an explicit representation of the prelexical level to prevent premature decisions that may incur irrecoverable errors. More importantly, ASR systems postpone final decisions about the identity of the recognised word (sequence) as long as possible, i.e., until additional input data can no longer affect the hypotheses. This too is done in order to avoid premature decisions, the results of which may affect the recognition of following words. In more technical terms: ASR systems use an integrated search inspired by basic Bayesian decision theory and aimed at avoiding decisions that must be revoked due to additional evidence. The competition between words in human speech recognition, on the other hand, is not necessarily always fully open; under some conditions an educated guess is made about the identity of the word being spoken, followed by a shallow verification. This means that the winning word might be chosen before the offset of the acoustic realisation of the word, thus while other viable competing paths are still available. Apparently, humans are willing to take risks that cannot be justified by Bayesian decision theory.

Collaboration


Dive into L.F.M. ten Bosch's collaborations.

Top Co-Authors

L.W.J. Boves

Radboud University Nijmegen

Bert Cranen

Radboud University Nijmegen

Lou Boves

Radboud University Nijmegen

L.C.W. Pols

University of Amsterdam

Jort F. Gemmeke

Katholieke Universiteit Leuven

Heyun Huang

Radboud University Nijmegen

Louis Vuurpijl

Nijmegen Institute for Cognition and Information

X. Wang

University of Amsterdam
