Publication


Featured research published by Bernd Möbius.


Speech Communication | 2001

Developments and paradigms in intonation research

Antonis Botinis; Björn Granström; Bernd Möbius

The present tutorial paper is addressed to a wide audience with different discipline backgrounds as well as variable expertise on intonation. The paper is structured into five sections. In Section 1, “Introduction”, basic concepts of intonation and prosody are summarised and cornerstones of intonation research are highlighted. In Section 2, “Functions and forms of intonation”, a wide range of functions from morpholexical and phrase levels to discourse and dialogue levels are discussed and forms of intonation with examples from different languages are presented. In Section 3, “Modelling and labelling of intonation”, established models of intonation as well as labelling systems are presented. In Section 4, “Applications of intonation”, the most widespread applications of intonation and especially technological ones are presented and methodological issues are discussed. In Section 5, “Research perspective”, research avenues and ultimate goals as well as the significance and benefits of intonation research in the upcoming years are outlined.


SSW | 2001

Rare Events and Closed Domains: Two Delicate Concepts in Speech Synthesis

Bernd Möbius

One of the most serious challenges for speech synthesis is the systematic treatment of events in language and speech that are known to have low frequencies of occurrence. The problems that extremely unbalanced frequency distributions pose for rule-based or data-driven models are often underestimated or even unrecognized. This paper discusses the problems pertinent to rare events in four components of speech synthesis systems: in linguistic text analysis, where productive word formation processes generate a potentially unbounded lexicon and cause heavily skewed word frequency distributions; in syllabification, where some syllables occur very frequently but most phonotactically possible syllables are very infrequent; in speech timing, where most constellations of factors affecting segmental duration are sparsely or not at all represented in training databases; and in unit selection synthesis, where the uneven distribution of speech unit frequencies poses challenges to speech corpus design. Currently available techniques for coping with the problem of rare or unseen events in each of these components are reviewed. Finally, a distinction is made between a strictly closed domain with a fixed vocabulary and a merely restricted domain with loopholes for unseen words and names, and the consequences of the respective type of domain for appropriate synthesis strategies are discussed.
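The skew described above can be made concrete with a small sketch. It measures what share of word types occur only once in a sample and how much of the running text the most frequent types cover. The function names and the toy token list are hypothetical illustrations, not taken from the paper, and the sketch assumes a non-empty token list.

```python
from collections import Counter


def frequency_profile(tokens):
    """Summarize how skewed a token frequency distribution is."""
    counts = Counter(tokens)
    total = sum(counts.values())
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return {
        "tokens": total,
        "types": len(counts),
        # Share of types that occur exactly once in the sample.
        "hapax_type_share": hapaxes / len(counts),
        # Share of running tokens that belong to once-only types.
        "hapax_token_share": hapaxes / total,
    }


def coverage(tokens, k):
    """Fraction of running tokens covered by the k most frequent types."""
    counts = Counter(tokens)
    top = sum(c for _, c in counts.most_common(k))
    return top / sum(counts.values())


if __name__ == "__main__":
    sample = "a a a a b b c d".split()  # toy data, not a real corpus
    print(frequency_profile(sample))
    print(coverage(sample, 2))
```

On real text, a high share of once-only types combined with high coverage by a few frequent types is the kind of imbalance the abstract refers to.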


International Conference on Spoken Language Processing | 1996

Modeling segmental duration in German text-to-speech synthesis

Bernd Möbius; J. Von Santen

The paper reports on the construction of a model for segmental duration in German. The model predicts the durations of speech sounds in various textual, prosodic, and segmental contexts. It has been implemented in the German version of the Bell Labs text-to-speech system (R. Sproat and J. Olive, 1995; B. Möbius et al., 1996). The construction of the duration system was made efficient by the use of an interactive statistical analysis package that incorporates the approach outlined by J.P.H. van Santen (1994). The results are stored in tables in a format that can be directly interpreted by the TTS duration module. Tables are constructed in two phases: inferential statistical analysis of the speech corpus, and parameter estimation. The overall correlation between observed and predicted segmental durations is .896.
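The evaluation figure quoted above is a correlation between observed and predicted durations. As a minimal sketch, assuming the standard Pearson coefficient is meant (the abstract does not name the statistic), it could be computed as follows. The function name and the example durations are hypothetical.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Covariance numerator and the two standard-deviation numerators.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    dev_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    dev_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    return cov / (dev_o * dev_p)


if __name__ == "__main__":
    # Invented durations in milliseconds, for illustration only.
    observed_ms = [62.0, 85.0, 110.0, 47.0, 130.0]
    predicted_ms = [70.0, 80.0, 101.0, 55.0, 121.0]
    print(round(pearson_r(observed_ms, predicted_ms), 3))
```

The function raises `ZeroDivisionError` when either sequence is constant, since the correlation is undefined in that case.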


Speech Communication | 1993

Analysis and synthesis of German F0 contours by means of Fujisaki's model

Bernd Möbius; Matthias Pätzold; Wolfgang Hess

This paper presents the adaptation of Fujisaki's quantitative model to the analysis of German intonation and its application to F0 synthesis by rule. The parameter values of the model are determined by an automatic approximation of naturally produced F0 contours. The algorithm is not primarily based on mathematical criteria but is subject to constraints that emerge from a linguistic interpretation of the model. The potential sources of variation of the parameter values are examined using statistical methods. A set of rules is formulated that capture the effects of both linguistic and speaker-dependent features. The rules generate artificial intonation contours which in turn can be related to linguistic features such as sentence mode or word accent. Acceptability of the rule-generated intonation patterns as well as the adequate modelling of linguistic prosodic properties are evaluated perceptually by both phonetically trained subjects and prosodically “naive” listeners. In general, utterances resynthesized with rule-generated F0 contours are judged highly acceptable and natural by both groups of listeners. Detailed judgements with respect to word accent and sentence mode are obtained that help to improve several specific rules and contribute to a more adequate description of German intonation.
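For readers unfamiliar with Fujisaki's model, the sketch below shows its usual textbook form: log F0 is a baseline plus phrase components (impulse responses) and accent components (step responses clipped at a ceiling). It does not reproduce the paper's parameter estimation or rules. The default values for `alpha`, `beta` and `gamma` and the example commands are illustrative assumptions, not values from the paper.

```python
import math


def phrase_response(t, alpha):
    """Impulse response of the phrase control mechanism (zero before onset)."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t, beta, gamma):
    """Step response of the accent control mechanism, clipped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrases, accents, alpha=2.0, beta=20.0, gamma=0.9):
    """F0 in Hz at time t (seconds).

    fb:      baseline frequency in Hz
    phrases: list of (onset, amplitude) phrase commands
    accents: list of (onset, offset, amplitude) accent commands
    """
    log_f0 = math.log(fb)
    for t0, ap in phrases:
        log_f0 += ap * phrase_response(t - t0, alpha)
    for t1, t2, aa in accents:
        log_f0 += aa * (accent_response(t - t1, beta, gamma)
                        - accent_response(t - t2, beta, gamma))
    return math.exp(log_f0)


if __name__ == "__main__":
    # Hypothetical commands: one phrase at 0 s, one accent from 0.2 s to 0.5 s.
    phrases = [(0.0, 0.5)]
    accents = [(0.2, 0.5, 0.4)]
    for step in range(0, 11):
        t = step / 10
        print(f"{t:.1f} s  {fujisaki_f0(t, 100.0, phrases, accents):.1f} Hz")
```

Because the components add in the log domain, the contour returns to the baseline `fb` once all phrase and accent responses have decayed.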


Cognitive Science | 2010

Multilevel Exemplar Theory

Michael Walsh; Bernd Möbius; Travis Wade; Hinrich Schütze

This paper presents recent research that provides an overarching model of exemplar theory capable of explaining phenomena across the phonetic and syntactic strata. The model represents a unique exemplar-based account of constituency interactions encompassing both linguistic domains. It yields simulation and experimental results in keeping with experimental findings in the literature on syllable duration variability and offers an exemplar-theoretic account of local grammaticality. In addition, it provides some insights into the nature of exemplar cloud formation and demonstrates experimentally the potential gains that can be enjoyed via the use of rich exemplar representations.


IEEE Transactions on Speech and Audio Processing | 2005

Formant tracking using context-dependent phonemic information

Minkyu Lee; J. van Santen; Bernd Möbius; J. Olive

A new formant-tracking algorithm using phoneme information is proposed. Conventional formant-tracking algorithms obtain formant tracks by analyzing the acoustic speech signal using continuity constraints without any additional information. The formant-tracking error rate of the conventional methods is reportedly in the range of 10%-20%. In this paper, we show that if text or phoneme transcription of speech utterances is available, the error rate can be significantly reduced. The basic idea behind this approach is that given the phoneme identity, formant-tracking algorithms can have a better clue of where to look for formants. The algorithm consists of three phases: 1) analysis, 2) segmentation and alignment, and 3) formant tracking by the Viterbi searching algorithm. In the analysis phase, formant candidates are obtained for each analysis frame by solving the linear prediction polynomial. In the segmentation and alignment phase, the text corresponding to the input speech utterance is converted into a sequence of phoneme symbols. Then, the phoneme sequence is time aligned with the speech utterance. A hidden Markov model (HMM) based automatic segmentation algorithm is used for forced-time alignment. For each phoneme segment, nominal formant frequencies are assigned at the center of each phoneme segment. Then nominal formant tracks for the entire utterance are obtained by interpolating the nominal formant frequencies. In order to compensate for the coarticulation effect, different interpolation methods are used depending on the phonemic context. The interpolation process makes the formant-tracking algorithm robust to possible segmentation errors made by the HMM-based segmentation algorithm. As a result, the proposed formant-tracking algorithm does not require highly accurate alignment/segmentation. Finally, a set of formants is chosen from the formant candidates in such a way that the resulting formant tracks come close to the nominal formant tracks while satisfying the continuity constraints. The algorithm is tested using natural speech utterances and the performance is compared against formant tracks obtained by the conventional method using continuity constraints only. The new algorithm significantly reduces the formant-tracking error rate (5.03% for male and 3.73% for female) over the conventional formant-tracking algorithm (13.00% for male and 15.82% for female).
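The final selection phase can be illustrated with a much simplified sketch: a Viterbi search that picks one candidate per frame for a single formant, trading closeness to the nominal track against frame-to-frame continuity. The paper selects sets of formants and does not give its cost function here, so the absolute-difference costs, the weights `w_nominal` and `w_continuity`, and the function name are all assumptions made for illustration.

```python
def viterbi_formant_track(candidates, nominal, w_nominal=1.0, w_continuity=1.0):
    """Pick one candidate frequency per frame for a single formant.

    candidates: list of non-empty lists of candidate frequencies (Hz), one per frame
    nominal:    list of nominal formant frequencies (Hz), one per frame
    """
    if not candidates or len(candidates) != len(nominal):
        raise ValueError("candidates and nominal must be non-empty and equal length")
    if any(not frame for frame in candidates):
        raise ValueError("every frame needs at least one candidate")

    # cost[i]: best cumulative cost of any path ending in candidate i
    # of the current frame.
    cost = [w_nominal * abs(c - nominal[0]) for c in candidates[0]]
    back_pointers = []

    for t in range(1, len(candidates)):
        prev = candidates[t - 1]
        new_cost, pointers = [], []
        for c in candidates[t]:
            # Cheapest predecessor, including the continuity penalty.
            transitions = [cost[i] + w_continuity * abs(c - prev[i])
                           for i in range(len(prev))]
            best = min(range(len(prev)), key=transitions.__getitem__)
            new_cost.append(transitions[best] + w_nominal * abs(c - nominal[t]))
            pointers.append(best)
        cost = new_cost
        back_pointers.append(pointers)

    # Trace the cheapest path back from the last frame.
    idx = min(range(len(cost)), key=cost.__getitem__)
    path = [idx]
    for pointers in reversed(back_pointers):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return [candidates[t][i] for t, i in enumerate(path)]


if __name__ == "__main__":
    # Invented candidates and nominal track, for illustration only.
    frames = [[500.0, 1500.0], [520.0, 1480.0], [900.0, 510.0]]
    print(viterbi_formant_track(frames, [500.0, 500.0, 500.0]))
```

Setting `w_nominal` to zero reduces the search to a continuity-only tracker, which corresponds loosely to the conventional baseline the abstract compares against.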


Folia Linguistica | 2004

Corpus-Based Investigations on the Phonetics of Consonant Voicing

Bernd Möbius

Within and across languages the realization of consonant voicing is highly variable. This study aims to identify, and quantify, the segmental, prosodic and positional factors that have an influence on consonant voicing. A widely used acoustic measure of voicing, viz. voice onset time, is known to have disadvantages both in a cross-linguistic framework, where it fails to provide sufficient information for certain stop consonant classifications, and across consonant classes because it is not defined for fricatives and sonorants. This study applies the voicing profile method to the analysis of voicing properties of consonants in German. The voicing profile is defined as the frame-by-frame voicing status of speech sound realizations in a speech corpus. The speech database was judiciously constructed to cover systematically all possible speech sound combinations in German and a number of positional and prosodic contexts in which these combinations occur. The results are put in a cross-linguistic perspective by comparing the voicing profiles of German stops to those of stops in three other languages, viz. Mandarin Chinese, Hindi, and Mexican Spanish. The results are also discussed in the context of the production and maintenance of voicing during speech production. The voicing profile analysis is intended to serve as a methodology for investigating the discrepancies between the phonemic voicing specification of a speech sound and its phonetic realization in connected speech.
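The abstract defines the voicing profile as the frame-by-frame voicing status of the realizations of a speech sound. One way to summarize such data, sketched below, is to average the voicing status over all realizations at a fixed number of positions on a normalized time axis. The time normalization, the function name and the toy data are assumptions for illustration; the abstract does not specify how realizations of different lengths are aligned.

```python
def voicing_profile(realizations, n_points=10):
    """Proportion of voiced frames at n_points normalized time positions.

    realizations: list of non-empty lists of booleans, one list per realization,
                  where True marks a voiced analysis frame.
    """
    if not realizations or any(not frames for frames in realizations):
        raise ValueError("need at least one realization, each with >= 1 frame")
    profile = []
    for k in range(n_points):
        voiced = 0
        for frames in realizations:
            # Map the centre of the k-th normalized interval onto a frame index.
            idx = min(int((k + 0.5) / n_points * len(frames)), len(frames) - 1)
            if frames[idx]:
                voiced += 1
        profile.append(voiced / len(realizations))
    return profile


if __name__ == "__main__":
    # Two invented realizations of the same consonant.
    data = [[True, True, False, False], [True, False, False, False]]
    print(voicing_profile(data, n_points=4))
```

A profile that starts near 1.0 and falls toward 0.0 would indicate voicing that is present at the onset of the sound and dies out during it.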


Archive | 2000

A Quantitative Model of F0 Generation and Alignment

Jan P. H. van Santen; Bernd Möbius

Local pitch contours belonging to the same perceptual or phonological class vary significantly as a result of the structure (i.e., the segments and their durations) of the syllables they are associated with. For example, in nuclear rise-fall pitch accents in declaratives, peak location (measured from stressed syllable start) can vary systematically between 150 and 300 ms as a function of the durations of the associated segments (van Santen & Hirschberg, 1994). Yet, there are temporal changes in local pitch contours that are phonologically significant even though their magnitudes do not appear to be larger than changes due to segmental effects (e.g., Kohler, 1990; D’Imperio & House, 1997).


Meeting of the Association for Computational Linguistics | 2000

Inducing probabilistic syllable classes using multivariate clustering

Karin Müller; Bernd Möbius; Detlef Prescher

An approach to automatic detection of syllable structure is presented. We demonstrate a novel application of EM-based clustering to multivariate data, exemplified by the induction of 3- and 5-dimensional probabilistic syllable classes. The qualitative evaluation shows that the method yields phonologically meaningful syllable classes. We then propose a novel approach to grapheme-to-phoneme conversion and show that syllable structure represents valuable information for pronunciation systems.
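A common form of EM-based clustering for multivariate categorical data is a latent class model, in which each class has a prior probability and an independent distribution over the values of each dimension. The sketch below implements that generic model; whether it matches the paper's exact formulation is an assumption, and the three dimensions used in the example (onset, nucleus, coda) as well as the toy data are hypothetical. It requires Python 3.8 or later for `math.prod`.

```python
import math
import random
from collections import defaultdict


def em_latent_classes(data, n_classes, n_iter=20, seed=0):
    """Fit a latent class model to equal-length tuples of categorical values.

    Returns (prior, cond, log_likelihoods), where cond[c][d][v] is the
    probability of value v in dimension d given class c.
    """
    rng = random.Random(seed)
    dims = len(data[0])
    values = [sorted({row[d] for row in data}) for d in range(dims)]

    def normalize(weights):
        total = sum(weights.values())
        return {k: w / total for k, w in weights.items()}

    # Uniform class priors and random, strictly positive conditionals.
    prior = [1.0 / n_classes] * n_classes
    cond = [[normalize({v: rng.random() + 0.5 for v in values[d]})
             for d in range(dims)] for _ in range(n_classes)]
    log_likelihoods = []

    for _ in range(n_iter):
        class_mass = [0.0] * n_classes
        value_mass = [[defaultdict(float) for _ in range(dims)]
                      for _ in range(n_classes)]
        log_likelihood = 0.0
        # E-step: posterior class probabilities for every data point.
        for row in data:
            joint = [prior[c] * math.prod(cond[c][d][row[d]] for d in range(dims))
                     for c in range(n_classes)]
            total = sum(joint)
            log_likelihood += math.log(total)
            for c in range(n_classes):
                posterior = joint[c] / total
                class_mass[c] += posterior
                for d in range(dims):
                    value_mass[c][d][row[d]] += posterior
        log_likelihoods.append(log_likelihood)
        # M-step: re-estimate parameters from the expected counts.
        prior = [mass / len(data) for mass in class_mass]
        cond = [[{v: value_mass[c][d][v] / class_mass[c] for v in values[d]}
                 for d in range(dims)] for c in range(n_classes)]

    return prior, cond, log_likelihoods


if __name__ == "__main__":
    # Invented (onset, nucleus, coda) tuples, for illustration only.
    syllables = [("t", "a", "n"), ("k", "a", "n"),
                 ("t", "i", "s"), ("k", "i", "s")] * 5
    prior, cond, lls = em_latent_classes(syllables, n_classes=2)
    print([round(p, 3) for p in prior])
    print(round(lls[0], 3), round(lls[-1], 3))
```

The sketch applies no smoothing, so a class whose expected count collapses to zero would cause a division by zero; a practical implementation would guard against that.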


Conference of the International Speech Communication Association | 2005

Prosodic Models, Automatic Speech Understanding, and Speech Synthesis: Towards the Common Ground?

Anton Batliner; Bernd Möbius

Automatic speech understanding and speech synthesis, two major speech processing applications, impose strikingly different constraints and requirements on prosodic models. The prevalent models of prosody and intonation fail to offer a unified solution to these conflicting constraints. As a consequence, prosodic models have been applied only occasionally in end-to-end automatic speech understanding systems; in contrast, they have been applied extensively in speech synthesis systems. In this chapter we aim to make explicit the reasons for this state of affairs by reviewing the role of prosodic modelling in these two fields of speech technology. Subsequently, possible strategies to overcome the shortcomings of the use of prosodic modelling in automatic speech processing are discussed. In particular, the question is raised whether or not there is a common framework for prosodic modelling in automatic speech understanding and speech synthesis systems, and if so, whether any particular model or theory of prosody can serve as a common ground. Finally, a catalogue of tasks in prosody research is proposed that ought to be relevant to both automatic speech understanding and speech synthesis and that might stimulate joint research activities.

Collaboration


Dive into Bernd Möbius's collaborations.

Top Co-Authors

Hinrich Schütze

Ludwig Maximilian University of Munich
